Serialization special casing of style, script, etc. needs to be namespace-dependent #333

Closed
domenic opened this issue Mar 7, 2021 · 4 comments · Fixed by #383

domenic commented Mar 7, 2021

This escaping:

if (
    parentTn === $.STYLE ||
    parentTn === $.SCRIPT ||
    parentTn === $.XMP ||
    parentTn === $.IFRAME ||
    parentTn === $.NOEMBED ||
    parentTn === $.NOFRAMES ||
    parentTn === $.PLAINTEXT ||
    parentTn === $.NOSCRIPT
) {
    this.html += content;
} else {
    this.html += Serializer.escapeString(content, false);
}
is meant to mimic HTML's

If the parent of current node is a style, script, xmp, iframe, noembed, noframes, or plaintext element, or if the parent of current node is a noscript element and scripting is enabled for the node, then append the value of current node's data IDL attribute literally.

However, while HTML is referring specifically to those nodes in the HTML namespace, parse5 does not check the namespace before serializing.

This leads to the following wrong result in jsdom: https://runkit.com/domenicdenicola/6044168f463fb4001a725307 compared to browsers: http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=8981
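To make the difference concrete, here is a minimal sketch of a namespace-aware version of that check. It is not parse5's code: `isLiteralTextParent`, `escapeText`, and `serializeText` are hypothetical names, and `escapeText` is a simplified stand-in for `Serializer.escapeString(content, false)`.

```javascript
// Namespace URIs as defined by the HTML and SVG specifications.
const HTML_NS = 'http://www.w3.org/1999/xhtml';
const SVG_NS = 'http://www.w3.org/2000/svg';

// Elements whose text children are appended literally, per the quoted spec text.
const RAW_TEXT_PARENTS = new Set([
    'style', 'script', 'xmp', 'iframe', 'noembed', 'noframes', 'plaintext',
]);

// Hypothetical helper: true only for the listed elements in the HTML namespace.
function isLiteralTextParent(tagName, namespaceURI, scriptingEnabled = true) {
    if (namespaceURI !== HTML_NS) return false;
    if (RAW_TEXT_PARENTS.has(tagName)) return true;
    return tagName === 'noscript' && scriptingEnabled;
}

// Simplified stand-in for text escaping (an assumption, not parse5's escapeString).
function escapeText(text) {
    return text
        .replace(/&/g, '&amp;')
        .replace(/\u00a0/g, '&nbsp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Append text literally or escaped, depending on the parent's name AND namespace.
function serializeText(content, parentTagName, parentNamespaceURI) {
    return isLiteralTextParent(parentTagName, parentNamespaceURI)
        ? content
        : escapeText(content);
}
```

With this sketch, text under an HTML `style` element is left alone, while the same text under an SVG `style` element is escaped, which is the behaviour the spec wording implies.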


domenic commented Mar 7, 2021

if (
    tn !== $.AREA &&
    tn !== $.BASE &&
    tn !== $.BASEFONT &&
    tn !== $.BGSOUND &&
    tn !== $.BR &&
    tn !== $.COL &&
    tn !== $.EMBED &&
    tn !== $.FRAME &&
    tn !== $.HR &&
    tn !== $.IMG &&
    tn !== $.INPUT &&
    tn !== $.KEYGEN &&
    tn !== $.LINK &&
    tn !== $.META &&
    tn !== $.PARAM &&
    tn !== $.SOURCE &&
    tn !== $.TRACK &&
    tn !== $.WBR
) {
is similarly suspicious... in general, any check against $.SOMECONSTANT really should be accompanied by a namespace check.
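As a rough illustration of what "accompanied by a namespace check" could look like for the void-element list, here is a sketch. `needsEndTag` is a hypothetical helper, not a parse5 function, and it assumes the serializer can read the element's namespace URI.

```javascript
const HTML_NS = 'http://www.w3.org/1999/xhtml';

// The same tag names as in the condition above.
const VOID_ELEMENTS = new Set([
    'area', 'base', 'basefont', 'bgsound', 'br', 'col', 'embed', 'frame', 'hr',
    'img', 'input', 'keygen', 'link', 'meta', 'param', 'source', 'track', 'wbr',
]);

// Hypothetical helper: an element skips its children and end tag only when it is
// one of the listed elements AND lives in the HTML namespace.
function needsEndTag(tagName, namespaceURI) {
    return !(namespaceURI === HTML_NS && VOID_ELEMENTS.has(tagName));
}
```

Under this sketch an element named `br` in the SVG namespace would still get its children and end tag serialized, while an HTML `br` would not.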


nfriedly commented Apr 4, 2021

This just bit me today: in a <script> tag && is getting changed to &amp;&amp;, which is a syntax error.


nfriedly commented Apr 4, 2021

Actually, looking closer, $.SCRIPT is on that list, so I'm not sure if I'm hitting this or some other issue...


nfriedly commented Apr 4, 2021

Yeah, this is what's causing my issue:

emitText({ text }) {
    this.push(escapeString(text, false));
}

It doesn't even try to check the namespace or parent element there...

I'll file a separate issue.
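For illustration only, a streaming serializer could track open elements so that `emitText` knows its parent. The sketch below is not the library's implementation: the class name, the event shapes, the `namespaceURI` field on start tags, and `escapeText` are all assumptions, and void elements are not handled.

```javascript
const HTML_NS = 'http://www.w3.org/1999/xhtml';
const RAW_TEXT_PARENTS = new Set([
    'style', 'script', 'xmp', 'iframe', 'noembed', 'noframes', 'plaintext',
]);

// Simplified stand-in for escapeString(text, false); an assumption for this sketch.
function escapeText(text) {
    return text
        .replace(/&/g, '&amp;')
        .replace(/\u00a0/g, '&nbsp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Hypothetical streaming serializer that remembers which elements are open.
class SketchSerializer {
    constructor() {
        this.chunks = [];
        this.openElements = [];
    }

    // Assumes the event carries a namespace; defaults to the HTML namespace.
    emitStartTag({ tagName, namespaceURI = HTML_NS }) {
        this.openElements.push({ tagName, namespaceURI });
        this.chunks.push(`<${tagName}>`);
    }

    emitEndTag({ tagName }) {
        this.openElements.pop();
        this.chunks.push(`</${tagName}>`);
    }

    // Text is emitted literally only under a raw-text parent in the HTML namespace.
    emitText({ text }) {
        const parent = this.openElements[this.openElements.length - 1];
        const literal =
            parent !== undefined &&
            parent.namespaceURI === HTML_NS &&
            RAW_TEXT_PARENTS.has(parent.tagName);
        this.chunks.push(literal ? text : escapeText(text));
    }

    toString() {
        return this.chunks.join('');
    }
}
```

Feeding it a `script` start tag, the text `a && b`, and the matching end tag keeps `&&` intact, whereas the same text inside a `p` is escaped.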

fb55 added a commit to parse5/parse5-fork that referenced this issue Jan 18, 2022
fb55 linked a pull request Jan 18, 2022 that will close this issue
fb55 added a commit to parse5/parse5-fork that referenced this issue Feb 7, 2022
@fb55 fb55 closed this as completed in #383 Feb 15, 2022