Parsing escaped attributes shouldn't set raw property in resulting node #105

Open
larsrh opened this issue Jan 6, 2020 · 1 comment

larsrh commented Jan 6, 2020

Example JSX code:

<span data-foo="&quot;" />

The relevant part of the ESTree:

openingElement: Node {
    type: 'JSXOpeningElement',
    attributes: [
        Node {
            type: 'JSXAttribute',
            name: Node {
                type: 'JSXIdentifier',
                name: 'data-foo'
            },
            value: Node {
                type: 'Literal',
                value: '"',
                raw: '"&quot;"' // <-- problem is here
            }
        }
    ],
    // ...
}

I believe the raw property of the literal is wrong. When transforming such a tree and preserving the literals as-is, some code generators (e.g. astring) will prefer the raw property and emit it as is:

https://github.com/davidbonnet/astring/blob/92d26a05f666fa4f7a3475df67773581c1dff9a0/src/astring.js#L938-L940

This may lead to generated code that looks as follows:

React.createElement(
  "span",
  { "data-foo": '"&quot;"' }
)

Naturally this is wrong, because React and other tools escape attribute values themselves, so the entity would end up escaped twice. Interestingly enough, escodegen appears to ignore the raw property.
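
To make the difference concrete, here is a minimal sketch of the two printing strategies. Both functions are hypothetical, simplified stand-ins written for this example; they are not the actual astring or escodegen source, and only the raw-versus-value choice is modelled.

```javascript
// Hypothetical literal printers, assumed for illustration only.

// Strategy 1: prefer the `raw` source text when the node carries one.
function emitPreferRaw(node) {
  return node.raw != null ? node.raw : JSON.stringify(node.value);
}

// Strategy 2: ignore `raw` and always re-serialize the decoded value.
function emitFromValue(node) {
  return JSON.stringify(node.value);
}

// The literal from the tree above: the value is decoded, raw is not.
const literal = { type: 'Literal', value: '"', raw: '"&quot;"' };

const preferRaw = emitPreferRaw(literal);
const fromValue = emitFromValue(literal);

// Evaluating the emitted source text shows what the program would see.
console.log(JSON.parse(preferRaw)); // the entity text, still encoded
console.log(JSON.parse(fromValue)); // the decoded double quote
```

With the first strategy the entity text survives into the generated JavaScript string; with the second, the generated string matches the literal's value.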

The naive solution would be to nuke all raw properties nested within any JSX node, but that would be a little more than necessary:

<span data-foo={ "bla" } />

Here, the escaping rules are regular JS rules.

My proposal would be to remove raw from all literals that are nested directly below an attribute.
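
As a rough sketch of that proposal applied as a post-processing step (not a patch to the parser itself), the following walks a plain ESTree-like object and deletes raw only from literals that are the direct value of a JSXAttribute. The function name stripRawFromJsxAttributeLiterals and the data-bar attribute are made up for this example, and the walker assumes the tree has no parent pointers or other cycles.

```javascript
// Hypothetical helper: remove `raw` from Literal nodes that sit directly
// under a JSXAttribute. Literals inside expression containers are kept.
function stripRawFromJsxAttributeLiterals(node) {
  if (Array.isArray(node)) {
    node.forEach(stripRawFromJsxAttributeLiterals);
    return node;
  }
  if (node === null || typeof node !== 'object') {
    return node;
  }
  if (
    node.type === 'JSXAttribute' &&
    node.value &&
    node.value.type === 'Literal'
  ) {
    delete node.value.raw;
  }
  // Recurse into every child property (assumes an acyclic tree).
  for (const key of Object.keys(node)) {
    stripRawFromJsxAttributeLiterals(node[key]);
  }
  return node;
}

// Hand-built tree for: <span data-foo="&quot;" data-bar={ "bla" } />
const openingElement = {
  type: 'JSXOpeningElement',
  attributes: [
    {
      type: 'JSXAttribute',
      name: { type: 'JSXIdentifier', name: 'data-foo' },
      value: { type: 'Literal', value: '"', raw: '"&quot;"' },
    },
    {
      type: 'JSXAttribute',
      name: { type: 'JSXIdentifier', name: 'data-bar' },
      value: {
        type: 'JSXExpressionContainer',
        expression: { type: 'Literal', value: 'bla', raw: '"bla"' },
      },
    },
  ],
};

stripRawFromJsxAttributeLiterals(openingElement);

const [escapedAttr, expressionAttr] = openingElement.attributes;
console.log('raw' in escapedAttr.value);             // false
console.log(expressionAttr.value.expression.raw);    // "bla" (with quotes)
```

The literal inside the expression container keeps its raw property, since regular JS escaping rules apply there.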

The following situation is not affected:

<span>&</span>

... for the sole reason that text nested in elements is not parsed as a literal, but rather as the separate JSXText node type.

Qix- commented Dec 10, 2020

I don't understand why acorn-jsx is transforming HTML entities at all. That is up to the user agent to do, not the JSX parser.

My JSX uses &lt;/&gt; within an icon. Parsing this within a bundler and then feeding it into the JSX compiler I'm using (not React in my case) causes the compiler to think there's a literal </> there, when in fact it's simply text intended for the user agent.

What is the use case for transforming HTML entities? JSX isn't HTML; the original, unaltered text should be put into the AST nodes, not arbitrarily transformed text. HTML entities are not string escapes under either the ECMAScript standard or any of the JSX "standards".

This also appears to happen in normal text:

<div className={C.icon}><center>&lt;/&gt;</center></div>
