Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-special URLs were not idempotent #505

Merged
merged 1 commit into from
Aug 24, 2020
Merged

Conversation

annevk
Copy link
Member

@annevk annevk commented May 9, 2020

Adjust the URL serializer so it does not output something which during the next parse operation would yield a host.

whatwg-url: jsdom/whatwg-url#148

Tests: ...

Fixes #415.

@zealousidealroll what do you think?

(See WHATWG Working Mode: Changes for more details.)


Preview | Diff

url.bs Outdated Show resolved Hide resolved
url.bs Outdated Show resolved Hide resolved
url.bs Outdated Show resolved Hide resolved
@annevk annevk requested a review from rmisev May 11, 2020 05:22
Adjust the URL serializer so it does not output something which during the next parse operation would yield a host.

whatwg-url: ...

Tests: ...

Fixes #415.
@annevk annevk force-pushed the annevk/avoid-reparse-issues branch from 9a50c7a to c3c7a82 Compare June 24, 2020 12:19
rmisev added a commit to rmisev/web-platform-tests that referenced this pull request Aug 19, 2020
@rmisev
Copy link
Member

rmisev commented Aug 20, 2020

Tests: web-platform-tests/wpt#25113

@domenic
Copy link
Member

domenic commented Aug 20, 2020

Tests reviewed; shall we land this?

rmisev added a commit to rmisev/url_whatwg that referenced this pull request Aug 20, 2020
@annevk
Copy link
Member Author

annevk commented Aug 21, 2020

@domenic is Chrome interested? I'm checking with Valentin if Mozilla is and also pinged WebKit's representative again.

@domenic
Copy link
Member

domenic commented Aug 21, 2020

I think I can confidently say that Chrome has as a long-term goal reasonable, interoperable URL parsing, and this change is part of that. So yeah, +1 from Chrome.

@TimothyGu
Copy link
Member

Added Node.js to list of implementation bugs to file, per #525.

@annevk annevk removed the needs tests Moving the issue forward requires someone to write tests label Aug 24, 2020
annevk pushed a commit to web-platform-tests/wpt that referenced this pull request Aug 24, 2020
@annevk annevk merged commit 83adf0c into master Aug 24, 2020
@annevk annevk deleted the annevk/avoid-reparse-issues branch August 24, 2020 07:09
@annevk
Copy link
Member Author

annevk commented Aug 24, 2020

Thanks all! I updated OP to include the relevant information.

domenic pushed a commit to jsdom/whatwg-url that referenced this pull request Aug 24, 2020
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Aug 27, 2020
…testonly

Automatic update from web-platform-tests
Test non-special URLs are idempotent

See whatwg/url#415 and
whatwg/url#505 for context.

--

wpt-commits: 551c9d604fb8b97d3f8c65793bb047d15baddbc2
wpt-pr: 25113
watilde added a commit to watilde/node that referenced this pull request Sep 16, 2020
watilde added a commit to nodejs/node that referenced this pull request Sep 20, 2020
Fixes: #34899
Refs: whatwg/url#505
Refs: web-platform-tests/wpt#25113

PR-URL: #34925
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
ambroff pushed a commit to ambroff/gecko that referenced this pull request Nov 4, 2020
…testonly

Automatic update from web-platform-tests
Test non-special URLs are idempotent

See whatwg/url#415 and
whatwg/url#505 for context.

--

wpt-commits: 551c9d604fb8b97d3f8c65793bb047d15baddbc2
wpt-pr: 25113
ryanhaddad pushed a commit to WebKit/WebKit that referenced this pull request Dec 22, 2020
https://bugs.webkit.org/show_bug.cgi?id=215762

Reviewed by Tim Horton.

LayoutTests/imported/w3c:

* web-platform-tests/url/a-element-expected.txt:
* web-platform-tests/url/a-element-xhtml-expected.txt:
* web-platform-tests/url/url-constructor-expected.txt:
* web-platform-tests/url/url-setters-expected.txt:

Source/WTF:

whatwg/url#505 added an interesting edge case to the URL serialization:
"If url’s host is null, url’s path’s size is greater than 1, and url’s path[0] is the empty string, then append U+002F (/) followed by U+002E (.) to output."
The problem was that URLs like "a:/a/..//a" would be parsed into "a://a" with a pathname of "//a" and an empty host.  If "a://a" was then reparsed, it would again have an href of "a://a"
but its host would be "a" and it would have an empty path.  There is consensus that URL parsing should be idempotent, so we need to do something different here.
According to whatwg/url#415 (comment) this follows what Edge did (and then subsequently abandoned when they switched to Chromium)
to make URL parsing idempotent by adding "/." before the path in the edge case of a URL with a non-special scheme (not http, https, wss, etc.) and a null host and a non-empty path that
has an empty first segment.  All the members of the URL remain unchanged except the full serialization (href).  This is not important in practice, but important in theory.

Our URL parser tries very hard to use the exact same WTF::String object given as input if it can.  However, this step is better implemented as a post-processing step that will almost never happen
because otherwise we would have to parse the entire path twice to find out if we need to add "./" or if the "./" that may have already been there needs to stay.  This is illustrated with the test URL
"t:/.//p/../../../..//x" which does need the "./".

In the common case, this adds one well-predicted branch to URL parsing, so I expect performance to be unaffected.  Since this is such a rare edge case of URLs, I expect no compatibility problems.

* wtf/URL.cpp:
(WTF::URL::pathStart const):
* wtf/URL.h:
(WTF::URL::pathStart const): Deleted.
* wtf/URLParser.cpp:
(WTF::URLParser::copyURLPartsUntil):
(WTF::URLParser::URLParser):
(WTF::URLParser::needsNonSpecialDotSlash const):
(WTF::URLParser::addNonSpecialDotSlash):
* wtf/URLParser.h:

Tools:

* TestWebKitAPI/Tests/WTF/URLParser.cpp:
(TestWebKitAPI::TEST_F):



Canonical link: https://commits.webkit.org/229956@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@267837 268f45cc-cd09-0410-ab3c-d52691b4dbfc
joesepi pushed a commit to joesepi/node that referenced this pull request Jan 8, 2021
Fixes: nodejs#34899
Refs: whatwg/url#505
Refs: web-platform-tests/wpt#25113

PR-URL: nodejs#34925
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Reparsing problem with non-special URL and double-dot path component
4 participants