Implement dangling markup injection mitigation #10022

Open
wants to merge 13 commits into
base: main
from
147 changes: 103 additions & 44 deletions source
@@ -7071,10 +7071,13 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute
it represents. While this process is defined in <cite>URL</cite>, the HTML standard defines
several wrappers to abstract base URLs and encodings. <ref>URL</ref></p>

<p class="note">Most new APIs are to use <span>parse a URL</span>. Older APIs and HTML elements
might have reason to use <span data-x="encoding-parsing a URL">encoding-parse a URL</span>. When a
custom base URL is needed or no base URL is desired, the <span>URL parser</span> can of course be
used directly as well.</p>
<p class="note">Most new APIs are to use <span>parse a URL</span>, and HTML elements are to use
<span>HTML-parse a URL</span>. Older APIs might have reason to use <span
data-x="encoding-parsing a URL">encoding-parse a URL</span>. When a custom base URL is needed or
no base URL is desired, the <span>URL parser</span> can of course be used directly as well.</p>

<p class="note"><span>HTML-parse a URL</span> has additional logic to mitigate dangling markup
injection attacks, while <span>parse a URL</span> does not.</p>

<p>To <dfn export>parse a URL</dfn>, given a string <var>url</var>, relative to a
<code>Document</code> object or <span>environment settings object</span> <var>environment</var>,
@@ -7130,6 +7133,50 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute
serializer</span> to <var>url</var>.</p></li>
</ol>

<p>To <dfn>check dangling markup</dfn>, given a string <var>url</var>, run these steps:</p>
<ol>
<li><p>Let <var>newline</var> and <var>lt</var> be flags, both initially false.</p></li>

<li><p>For each <var>codePoint</var> of <var>url</var>:</p>
<ol>
<li><p>If <var>codePoint</var> is an <span>ASCII tab or newline</span>, then set the
<var>newline</var> flag to true.</p></li>

<li><p>Otherwise, if <var>codePoint</var> is U+003C (&lt;), then set the <var>lt</var> flag
to true.</p></li>
</ol>
</li>

<li><p>If both <var>newline</var> and <var>lt</var> flags are true, then return true. Otherwise,
return false.</p></li>
</ol>
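
The steps above can be sketched in Python for experimentation. This is an illustrative sketch rather than part of the patch: the names `check_dangling_markup` and `ASCII_TAB_OR_NEWLINE` are hypothetical, and it assumes "ASCII tab or newline" means U+0009 TAB, U+000A LF, or U+000D CR, as defined in the Infra Standard.

```python
# Assumption: "ASCII tab or newline" is U+0009 TAB, U+000A LF, or U+000D CR.
ASCII_TAB_OR_NEWLINE = {"\t", "\n", "\r"}


def check_dangling_markup(url: str) -> bool:
    """Return True when the string contains both a tab/newline and a '<'."""
    newline = False  # set once any ASCII tab or newline is seen
    lt = False       # set once any U+003C (<) is seen
    for code_point in url:
        if code_point in ASCII_TAB_OR_NEWLINE:
            newline = True
        elif code_point == "<":
            lt = True
    # Only the combination of both flags is treated as dangling markup.
    return newline and lt
```

Note that the order of the two characters does not matter in this sketch: a `<` followed by a newline and a newline followed by a `<` are both flagged, while either character alone is not.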

<p>To <dfn export>HTML-parse a URL</dfn>, given a string <var>url</var>, relative to a
<code>Document</code> object or <span>environment settings object</span> <var>environment</var>,
run these steps. They return failure or a <span>URL</span>.</p>
<ol>
<li><p>If the result of running <span>check dangling markup</span> given <var>url</var> is true,
then return failure.</p></li>

<li><p>Return the result of <span>encoding-parsing a URL</span> given <var>url</var>, relative
to <var>environment</var>.</p></li>
</ol>
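
The wrapper above can be sketched as a guard in front of an ordinary URL parser. In this sketch the name `html_parse_url` is hypothetical, `None` stands in for failure, and Python's `urllib.parse.urljoin` is used only as a stand-in for encoding-parsing a URL; it is not a WHATWG URL parser and ignores document encodings, so only the control flow is meant to match.

```python
from typing import Optional
from urllib.parse import urljoin


def check_dangling_markup(url: str) -> bool:
    """True when url contains both an ASCII tab/newline and a '<'."""
    has_newline = any(c in "\t\n\r" for c in url)
    has_lt = "<" in url
    return has_newline and has_lt


def html_parse_url(url: str, base: str) -> Optional[str]:
    """Reject dangling markup first, then defer to the stand-in parser."""
    if check_dangling_markup(url):
        return None  # "failure" in the spec text
    try:
        # Stand-in for "encoding-parsing a URL"; an assumption of this sketch.
        return urljoin(base, url)
    except ValueError:
        return None


base = "https://example.com/base/"
print(html_parse_url("img.png", base))              # resolved against base
print(html_parse_url("img.png?\n<p>secret", base))  # None: rejected before parsing
```

The point of checking before parsing is that the URL parser itself strips tabs and newlines, so the evidence of an unterminated attribute would be lost if the check ran on the parsed result.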

<p>To <dfn data-x="HTML-encoding-parsing-and-serializing a URL">HTML-encoding-parse-and-serialize
a URL</dfn>, given a string <var>url</var>, relative to a <code>Document</code> object or
<span>environment settings object</span> <var>environment</var>, run these steps. They return
failure or a string.</p>

<ol>
<li><p>Let <var>url</var> be the result of <span>HTML-parse a URL</span> given <var>url</var>,
relative to <var>environment</var>.</p></li>

<li><p>If <var>url</var> is failure, then return failure.</p></li>

<li><p>Return the result of applying the <span data-x="concept-url-serializer">URL
serializer</span> to <var>url</var>.</p></li>
</ol>

</div>


@@ -8032,8 +8079,8 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute
<ol>
<li><p>If <var>contentAttributeValue</var> is null, then return the empty string.</p></li>

<li><p>Let <var>urlString</var> be the result of <span>encoding-parsing-and-serializing a
URL</span> given <var>contentAttributeValue</var>, relative to <var>element</var>'s
<li><p>Let <var>urlString</var> be the result of <span>HTML-encoding-parsing-and-serializing
a URL</span> given <var>contentAttributeValue</var>, relative to <var>element</var>'s
<span>node document</span>.</p></li>

<li><p>If <var>urlString</var> is not failure, then return <var>urlString</var>.</p></li>
@@ -14950,13 +14997,19 @@ interface <dfn interface>HTMLBaseElement</dfn> : <span>HTMLElement</span> {
<p>To <dfn export>set the frozen base URL</dfn> for an element <var>element</var>:</p>

<ol>
<li><p>Let <var>document</var> be <var>element</var>'s <span>node document</span>.
<li><p>Let <var>document</var> be <var>element</var>'s <span>node document</span>.</p></li>

<li><p>Let <var>url</var> be the value of <var>element</var>'s <code
data-x="attr-base-href">href</code> content attribute.</p></li>

<li><p>Let <var>urlRecord</var> be the result of <span data-x="URL parser">parsing</span> the
value of <var>element</var>'s <code data-x="attr-base-href">href</code> content attribute with
<var>document</var>'s <span>fallback base URL</span>, and <var>document</var>'s <span
data-x="document's character encoding">character encoding</span>. (Thus, the <code>base</code>
element isn't affected by itself.)</p></li>
<li><p>If the result of running <span>check dangling markup</span> given <var>url</var> is true,
then let <var>urlRecord</var> be failure.</p></li>

<li><p>Otherwise, let <var>urlRecord</var> be the result of <span
data-x="URL parser">parsing</span> the value of <var>element</var>'s <code
data-x="attr-base-href">href</code> content attribute with <var>document</var>'s <span>fallback
base URL</span>, and <var>document</var>'s <span data-x="document's character encoding">character
encoding</span>. (Thus, the <code>base</code> element isn't affected by itself.)</p></li>
<!-- This uses the URL parser rather than encoding-parse a URL since otherwise we'd have to
unnecessarily complicate the latter for two callsites. -->

@@ -15660,8 +15713,11 @@ interface <dfn interface>HTMLLinkElement</dfn> : <span>HTMLElement</span> {
<li><p>If <var>options</var>'s <span data-x="link options destination">destination</span> is not
a <span data-x="concept-request-destination">destination</span>, then return null.</p></li>

<li><p>If the result of running <span>check dangling markup</span> given <var>options</var>'s
<span data-x="link options href">href</span> is true, then return null.</p></li>

<li>
<p>Let <var>url</var> be the result of <span>encoding-parsing a URL</span> given
<p>Let <var>url</var> be the result of <span>parse a URL</span> given
<var>options</var>'s <span data-x="link options href">href</span>, relative to
<var>options</var>'s <span data-x="link options base url">base URL</span>.</p>

@@ -17169,8 +17225,11 @@ people expect to have work and what is necessary.
<span>code point</span>, so that it and all subsequent <span data-x="code point">code
points</span> are removed.</p>

<li><p><i>Parse</i>: Set <var>urlRecord</var> to the result of <span>encoding-parsing a
URL</span> given <var>urlString</var>, relative to <var>document</var>.</p></li>
<li><p>If <var>meta</var> is given, let <var>urlRecord</var> be the result of <span>HTML-parse
a URL</span> given <var>urlString</var>, relative to <var>document</var>.</p></li>

<li><p>Otherwise, let <var>urlRecord</var> be the result of <span>encoding-parsing a URL</span>
given <var>urlString</var>, relative to <var>document</var>.</p></li>

<li><p>If <var>urlRecord</var> is failure, then return.</p></li>
</ol>
@@ -25336,7 +25395,7 @@ document.body.appendChild(wbr);</code></pre>

<li><p>If <var>targetNavigable</var> is null, then return.</p></li>

<li><p>Let <var>urlString</var> be the result of <span>encoding-parsing-and-serializing a
<li><p>Let <var>urlString</var> be the result of <span>HTML-encoding-parsing-and-serializing a
URL</span> given <var>subject</var>'s <code data-x="attr-hyperlink-href">href</code> attribute
value, relative to <var>subject</var>'s <span>node document</span>.</p></li>

@@ -25408,7 +25467,7 @@ document.body.appendChild(wbr);</code></pre>
sandboxing flag set</span> has the <span>sandboxed downloads browsing context flag</span> set,
then return.</p></li>

<li><p>Let <var>urlString</var> be the result of <span>encoding-parsing-and-serializing a
<li><p>Let <var>urlString</var> be the result of <span>HTML-encoding-parsing-and-serializing a
URL</span> given <var>subject</var>'s <code data-x="attr-hyperlink-href">href</code> attribute
value, relative to <var>subject</var>'s <span>node document</span>.</p></li>

@@ -25628,10 +25687,10 @@ document.body.appendChild(wbr);</code></pre>
<p>If a <span>hyperlink</span> created by an <code>a</code> or <code>area</code> element has a
<code data-x="attr-hyperlink-ping">ping</code> attribute, and the user follows the hyperlink, and
the value of the element's <code data-x="attr-hyperlink-href">href</code> attribute can be <span
data-x="encoding-parsing a URL">parsed</span>, relative to the element's <span>node
data-x="HTML-parse a URL">parsed</span>, relative to the element's <span>node
document</span>, without failure, then the user agent must take the <code
data-x="attr-hyperlink-ping">ping</code> attribute's value, <span data-x="split a string on ASCII
whitespace">split that string on ASCII whitespace</span>, <span data-x="encoding-parsing a
whitespace">split that string on ASCII whitespace</span>, <span data-x="HTML-parse a
URL">parse</span> each resulting token, relative to the element's <span>node document</span>, and
then run these steps for each resulting <span>URL</span> <var>ping URL</var>, ignoring when
parsing returns failure:</p>
@@ -26295,7 +26354,7 @@ document.body.appendChild(wbr);</code></pre>
given a <code>link</code> element <var>el</var>, are:

<ol>
<li><p>Let <var>url</var> be the result of <span>encoding-parsing a URL</span> given
<li><p>Let <var>url</var> be the result of <span>HTML-parse a URL</span> given
<var>el</var>'s <code data-x="attr-link-href">href</code> attribute's value, relative to
<var>el</var>'s <span>node document</span>.</p></li>

@@ -26752,7 +26811,7 @@ document.body.appendChild(wbr);</code></pre>
data-x="concept-event-fire">fire an event</span> named <code data-x="event-error">error</code>
at <var>el</var>, and return.</p></li>

<li><p>Let <var>url</var> be the result of <span>encoding-parsing a URL</span> given
<li><p>Let <var>url</var> be the result of <span>HTML-parse a URL</span> given
<var>el</var>'s <code data-x="attr-link-href">href</code> attribute's value, relative to
<var>el</var>'s <span>node document</span>.</p></li>

@@ -27010,7 +27069,7 @@ document.body.appendChild(wbr);</code></pre>
<span data-x="link options href">href</span> is an empty string, returns.</p></li>

<li>
<p>Let <var>url</var> be the result of <span>encoding-parsing a URL</span> given
<p>Let <var>url</var> be the result of <span>HTML-parse a URL</span> given
<var>options</var>'s <span data-x="link options href">href</span>, relative to
<var>options</var>'s <span data-x="link options base URL">base URL</span>.</p>

@@ -30684,7 +30743,7 @@ was an English &lt;a href="/wiki/Music_hall">music hall&lt;/a> singer, ...</code
<p>If <var>selected source</var> is not null, then:</p>

<ol>
<li><p>Let <var>urlString</var> be the result of <span>encoding-parsing-and-serializing a
<li><p>Let <var>urlString</var> be the result of <span>HTML-encoding-parsing-and-serializing a
URL</span> given <var>selected source</var>, relative to the element's <span>node
document</span>.</p></li>
<!-- This does not change currentSrc -->
@@ -31760,9 +31819,9 @@ was an English &lt;a href="/wiki/Music_hall">music hall&lt;/a> singer, ...</code
base URL resolution, so changing the base URL doesn't trigger an update if nothing else changed
-->

<li><p>&#x231B; Let <var>urlString</var> be the result of <span>encoding-parsing-and-serializing
a URL</span> given <var>selected source</var>, relative to the element's <span>node
document</span>.</p></li>
<li><p>&#x231B; Let <var>urlString</var> be the result of
<span>HTML-encoding-parsing-and-serializing a URL</span> given <var>selected source</var>, relative
to the element's <span>node document</span>.</p></li>

<li><p>&#x231B; If <var>urlString</var> is failure, then return.</p></li>

@@ -33134,7 +33193,7 @@ interface <dfn interface>HTMLIFrameElement</dfn> : <span>HTMLElement</span> {
and its value is not the empty string, then:</p>

<ol>
<li><p>Let <var>maybeURL</var> be the result of <span>encoding-parsing a URL</span> given that
<li><p>Let <var>maybeURL</var> be the result of <span>HTML-parse a URL</span> given that
attribute's value, relative to <var>element</var>'s <span>node document</span>.</p></li>

<li><p>If <var>maybeURL</var> is not failure, then set <var>url</var> to
@@ -33812,7 +33871,7 @@ interface <dfn interface>HTMLEmbedElement</dfn> : <span>HTMLElement</span> {
<p>If <var>element</var> has a <code data-x="attr-embed-src">src</code> attribute set, then:</p>

<ol>
<li><p>Let <var>url</var> be the result of <span>encoding-parsing a URL</span> given
<li><p>Let <var>url</var> be the result of <span>HTML-parse a URL</span> given
<var>element</var>'s <code data-x="attr-embed-src">src</code> attribute's value, relative to
<var>element</var>'s <span>node document</span>.</p></li>

@@ -34127,7 +34186,7 @@ interface <dfn interface>HTMLObjectElement</dfn> : <span>HTMLElement</span> {
not a type that the user agent supports, then the user agent may jump to the step below labeled
<i>fallback</i> without fetching the content to examine its real type.</p></li>

<li><p>Let <var>url</var> be the result of <span>encoding-parsing a URL</span> given the <code
<li><p>Let <var>url</var> be the result of <span>HTML-parse a URL</span> given the <code
data-x="attr-object-data">data</code> attribute's value, relative to the element's <span>node
document</span>.</p></li>

@@ -34591,7 +34650,7 @@ interface <dfn interface>HTMLVideoElement</dfn> : <span>HTMLMediaElement</span>
<li><p>If the <code data-x="attr-video-poster">poster</code> attribute's value is the empty string
or if the attribute is absent, then there is no <span>poster frame</span>; return.</p></li>

<li><p>Let <var>url</var> be the result of <span>encoding-parsing a URL</span> given the <code
<li><p>Let <var>url</var> be the result of <span>HTML-parse a URL</span> given the <code
data-x="attr-video-poster">poster</code> attribute's value, relative to the element's <span>node
document</span>.</p></li>

@@ -35154,7 +35213,7 @@ interface <dfn interface>HTMLTrackElement</dfn> : <span>HTMLElement</span> {
value.</p></li>

<li><p>If <var>value</var> is not the empty string, then set <var>trackURL</var> to the result of
<span>encoding-parsing-and-serializing a URL</span> given <var>value</var>, relative to the
<span>HTML-encoding-parsing-and-serializing a URL</span> given <var>value</var>, relative to the
element's <span>node document</span>.</p></li>

<li><p>Set the element's <span>track URL</span> to <var>trackURL</var> if it is not failure;
@@ -36049,7 +36108,7 @@ interface <dfn interface>MediaError</dfn> {
attribute's value is the empty string, then end the <span>synchronous section</span>, and jump
down to the <i>failed with attribute</i> step below.</p></li>

<li><p>&#x231B; Let <var>urlRecord</var> be the result of <span>encoding-parsing a URL</span>
<li><p>&#x231B; Let <var>urlRecord</var> be the result of <span>HTML-parse a URL</span>
given the <code data-x="attr-media-src">src</code> attribute's value, relative to the
<span>media element</span>'s <span>node document</span> when the <code
data-x="attr-media-src">src</code> attribute was last changed.</p>
@@ -50757,7 +50816,7 @@ ldh-str = &lt; as defined in <a href="https://www.rfc-editor.org/rfc/rfc10
attribute's value is the empty string, run these steps:

<ol>
<li><p>Let <var>url</var> be the result of <span>encoding-parsing a URL</span> given the <code
<li><p>Let <var>url</var> be the result of <span>HTML-parse a URL</span> given the <code
data-x="attr-input-src">src</code> attribute's value, relative to the element's <span>node
document</span>.</p></li>

@@ -59333,7 +59392,7 @@ fur
-->
</li>

<li><p>Let <var>parsed action</var> be the result of <span>encoding-parsing a URL</span> given
<li><p>Let <var>parsed action</var> be the result of <span>HTML-parse a URL</span> given
<var>action</var>, relative to <var>submitter</var>'s <span>node document</span>.</p></li>

<li><p>If <var>parsed action</var> is failure, then return.</p></li>
@@ -62333,7 +62392,7 @@ o............A....e
<li><p>Set <var>el</var>'s <span data-x="concept-script-external">from an external file</span>
to true.</p></li>

<li><p>Let <var>url</var> be the result of <span>encoding-parsing a URL</span> given
<li><p>Let <var>url</var> be the result of <span>HTML-parse a URL</span> given
<var>src</var>, relative to <var>el</var>'s <span>node document</span>.</p></li>

<li><p>If <var>url</var> is failure, then <span>queue an element task</span> on the <span>DOM
@@ -130997,11 +131056,11 @@ html, body { display: block; }</code></pre>

<p>When a <code>body</code> element has a <code data-x="attr-background">background</code>
attribute set to a non-empty value, the new value is expected to be <span
data-x="encoding-parsing-and-serializing a URL">encoding-parsed-and-serialized</span> relative to
the element's <span>node document</span>, and if that does not return failure, the user agent is
expected to treat the attribute as a <span data-x="presentational hints">presentational
hint</span> setting the element's <span>'background-image'</span> property to the return
value.</p>
data-x="HTML-encoding-parsing-and-serializing a URL">HTML-encoding-parsed-and-serialized</span>
relative to the element's <span>node document</span>, and if that does not return failure, the
user agent is expected to treat the attribute as a <span
data-x="presentational hints">presentational hint</span> setting the element's
<span>'background-image'</span> property to the return value.</p>

<p>When a <code>body</code> element has a <code data-x="attr-body-bgcolor">bgcolor</code>
attribute set, the new value is expected to be parsed using the <span>rules for parsing a legacy
@@ -131843,10 +131902,10 @@ table {
<p>When a <code>table</code>, <code>thead</code>, <code>tbody</code>, <code>tfoot</code>,
<code>tr</code>, <code>td</code>, or <code>th</code> element has a <code
data-x="attr-background">background</code> attribute set to a non-empty value, the new value is
expected to be <span data-x="encoding-parsing-and-serializing a
URL">encoding-parsed-and-serialized</span> relative to the element's <span>node document</span>,
and if that does not return failure, the user agent is expected to treat the attribute as a <span
data-x="presentational hints">presentational hint</span> setting the element's
expected to be <span data-x="HTML-encoding-parsing-and-serializing a
URL">HTML-encoding-parsed-and-serialized</span> relative to the element's <span>node
document</span>, and if that does not return failure, the user agent is expected to treat the
attribute as a <span data-x="presentational hints">presentational hint</span> setting the element's
<span>'background-image'</span> property to the return value.</p>

<p>When a <code>table</code>, <code>thead</code>, <code>tbody</code>, <code>tfoot</code>,