JSX Automatic Runtime Transform Technical Plan
This document goes into detail on the plan for the new "automatic runtime" JSX transform, tracked at https://github.com/alangpierce/sucrase/issues/585 , and works through some of the open questions and technical challenges mentioned in that issue. Any comments or feedback can be made on that issue.
It looks like there are a few names for the new transform:
- The announcement post https://reactjs.org/blog/2020/09/22/introducing-the-new-jsx-transform.html simply refers to it as "the new JSX transform" (which isn't necessarily trying to be a technical name).
- Babel, swc, esbuild, and Flow all use the terminology "automatic runtime", configured as `runtime: automatic` or something very similar, presumably because the import/require is automatically added. The old transform is referred to as the "classic runtime".
- TypeScript docs refer to it as the "React 17 transform" and it's configured via `jsx: react-jsx` and `jsx: react-jsxdev`.
It seems like "automatic" or "automatic runtime" is probably the most understandable way to refer to the transform, despite TypeScript never using that term (as far as I can tell).
One detail worth keeping in mind is that this transform is actually designed as an intermediate step in a multi-year transition: https://github.com/facebook/react/issues/20031#issuecomment-710346866 , and the long-term vision is that there may be a third (hopefully simpler) JSX transform coming at some point in the future. That one will presumably also be "automatic", but I'm sure there will be a reasonable name for that in the future.
There are a few approaches that come to mind for how to enable the transform in Sucrase's options:
- Add a new transform name, like `jsx-automatic`, which is mutually exclusive with `jsx`.
  - Within this option, there's the question of whether to distinguish dev vs prod modes in the transform name vs elsewhere.
  - This option most directly conveys it as a different transform, though it's not obvious whether it should be framed that way; there are significant differences but also significant similarities.
  - A downside to this approach is that it's more awkward to make this transform the default in a future breaking change. It would probably mean renaming `jsx` to `jsx-classic` and `jsx-automatic` to `jsx`, which seems more disruptive than other approaches.
- Add a new top-level option like `jsxRuntime` that could be `"automatic"` or `"classic"`. For now, it would default to `"classic"`.
  - A variation here is a boolean option like `useAutomaticJSXRuntime`, though that's a bit less future-proof, so the string-based approach feels nicer to me.
  - An advantage is that this would be familiar to anyone configuring Babel, swc, or other tools.
  - Another advantage is that this should make it easier to configure integration code that's already looking at the file extension to decide whether `jsx` should be enabled. The decision of whether to use JSX would be decoupled from the decision of how JSX should be transpiled.
  - Another advantage is that this makes a future `"preserve"` option fairly natural, which I know is a use case that Sucrase doesn't handle right now. This makes the option quite similar to the TypeScript `jsx` option.
- Consolidate all JSX-related options into a nested key, like `{transforms: ["jsx"], jsxOptions: {runtime: "automatic", importSource: "some-other-library"}}`. swc takes an approach like this.
  - This would be cleaner in some sense, though my hope is that Sucrase has few enough options that categorizing them into groups isn't necessary.
My leaning now is the `jsxRuntime` option for the reasons listed.
Just like the existing options `jsxPragma` and `jsxFragmentPragma`, the plan is to add a new option `jsxImportSource`. This seems like a straightforward decision and roughly matches the approach from other tools.
Currently, Sucrase already has a `production` flag for the classic JSX transform that decides whether to include or exclude debug information. The most straightforward approach is to reuse this same flag. To explore the API design space a bit, there are a few potential alternatives:
- A separate transform name or separate runtime name for development vs production. TypeScript takes this approach.
- A `development` flag (default `false`) rather than a `production` flag. Babel, swc, and esbuild take this approach.
- An `environment` setting that takes string options.
For the near term, I think the existing `production` flag is the most practical approach, and it seems nice to decouple it from the decision of whether to use JSX or not.
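As a rough sketch of how the three options discussed so far could fit together, the following resolves user-supplied options into concrete values. The option names `jsxRuntime`, `jsxImportSource`, and `production` come from this plan; the type names, the `resolveJSXOptions` helper, and the `"react"` default for `jsxImportSource` are assumptions made for illustration, not Sucrase's actual API.

```typescript
// Hypothetical option shape. Only the three option names are from the plan.
type JSXRuntime = "classic" | "automatic";

interface JSXOptionsSketch {
  jsxRuntime?: JSXRuntime;
  jsxImportSource?: string;
  production?: boolean;
}

interface ResolvedJSXOptions {
  jsxRuntime: JSXRuntime;
  jsxImportSource: string;
  production: boolean;
}

function resolveJSXOptions(options: JSXOptionsSketch): ResolvedJSXOptions {
  return {
    // The plan keeps "classic" as the default for now.
    jsxRuntime: options.jsxRuntime ?? "classic",
    // Assumed default, chosen to mirror what other tools do.
    jsxImportSource: options.jsxImportSource ?? "react",
    // Reuses the existing flag, so dev output stays the default.
    production: options.production ?? false,
  };
}
```

Keeping `production` separate from `jsxRuntime` means an integration can flip dev vs prod without knowing which runtime is in use.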
The main concern with defaulting to the development transform is that React will crash when using the new dev transform against a prod build of React. This could lead to frustrating surprises when people test locally, deploy to production, and see that it's broken in production. I think this concern can be addressed in a few ways:
- Make sure the docs emphasize the need to configure this correctly.
- When changing the new JSX transform to be the default, also change `production` to default to true.
- Automatically set dev vs prod defaults on an integration-by-integration basis. For example, `sucrase/register` is most reasonable to use for development, so it could default to the dev transform.
The new transform uses 6 new imported names (counting `Fragment` once per runtime module) that need to be auto-imported as needed:
import {jsx, jsxs, Fragment} from "${jsxImportSource}/jsx-runtime";
import {jsxDEV, Fragment} from "${jsxImportSource}/jsx-dev-runtime";
import {createElement} from "${jsxImportSource}";
(This is informal ESM syntax based on `jsxImportSource`, but in practice the auto-import will need to be either ESM or CJS based on whether the `imports` transform is also enabled.)
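A minimal sketch of generating the auto-import code in either module format might look like the following. The function names, the `isCJS` parameter, and the choice to join statements with a space (to stay on one line) are assumptions for illustration; the module paths are the ones listed above.

```typescript
type AutoImportName = "jsx" | "jsxs" | "jsxDEV" | "Fragment" | "createElement";

// Pick the module that exports a given name. Fragment is the only name whose
// module depends on dev vs prod mode.
function moduleForName(
  importName: AutoImportName,
  importSource: string,
  production: boolean,
): string {
  if (importName === "createElement") {
    return importSource;
  }
  if (importName === "jsx" || importName === "jsxs") {
    return `${importSource}/jsx-runtime`;
  }
  if (importName === "jsxDEV") {
    return `${importSource}/jsx-dev-runtime`;
  }
  return production ? `${importSource}/jsx-runtime` : `${importSource}/jsx-dev-runtime`;
}

// Build import/require statements for only the names that were actually used,
// grouped by module in first-use order.
function buildAutoImports(
  usedNames: AutoImportName[],
  importSource: string,
  production: boolean,
  isCJS: boolean,
): string {
  const groups: Array<{moduleName: string; names: AutoImportName[]}> = [];
  for (const usedName of usedNames) {
    const moduleName = moduleForName(usedName, importSource, production);
    let group = groups.filter((g) => g.moduleName === moduleName)[0];
    if (!group) {
      group = {moduleName, names: []};
      groups.push(group);
    }
    if (group.names.indexOf(usedName) === -1) {
      group.names.push(usedName);
    }
  }
  return groups
    .map(({moduleName, names}) => {
      const list = names.join(", ");
      return isCJS
        ? `const {${list}} = require("${moduleName}");`
        : `import {${list}} from "${moduleName}";`;
    })
    .join(" ");
}
```

A real implementation would also need to avoid name collisions with user code, which this sketch ignores.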
- `Fragment` has the same behavior as before, and we need to import the right one based on dev vs prod mode.
- `jsx` and `jsxs` are called in the prod transform for typical cases.
- `jsxDEV` is called in the dev transform for typical cases.
- `createElement` has the same behavior as the old `React.createElement`. It is a fallback function to avoid a difference in behavior when combining prop spread with an explicit key. Any expression like `<div {...props} key="some-key" />` (with an explicit key after a prop spread) must compile to `createElement`, and all other expressions must compile to one of the `jsx` functions. The rationale is described at https://github.com/facebook/react/issues/20031#issuecomment-710346866 .
- For all `jsx...` functions:
  - The transform must detect if the element has "static" children, i.e. has at least two children explicitly specified. This determines whether to call `jsx` or `jsxs` and whether to pass `true` or `false` to `jsxDEV`.
  - The transform must now pass JSX children as the last prop, using an array if and only if there are at least two children.
  - The transform must move the `key` expression to its own argument after the props. Note that this changes execution order, but it sounds like the React team considered this difference minor enough to call it non-breaking.
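The selection rules above can be summarized as a small decision function. This is only a sketch: `ElementFacts`, `CallPlan`, and `planCall` are hypothetical names, and the facts are assumed to have been gathered already by the parser.

```typescript
type JSXCallee = "jsx" | "jsxs" | "jsxDEV" | "createElement";

// Hypothetical summary of what the parser learned about one JSX element.
interface ElementFacts {
  explicitChildCount: number;
  hasKeyAfterSpread: boolean;
}

interface CallPlan {
  callee: JSXCallee;
  // For jsxDEV this is passed as an argument. For jsx/jsxs it is implied by
  // the callee. It also decides whether children are wrapped in an array.
  isStaticChildren: boolean;
}

function planCall(facts: ElementFacts, production: boolean): CallPlan {
  const isStaticChildren = facts.explicitChildCount >= 2;
  if (facts.hasKeyAfterSpread) {
    // Explicit key after a prop spread: fall back to the old behavior.
    return {callee: "createElement", isStaticChildren};
  }
  if (!production) {
    return {callee: "jsxDEV", isStaticChildren};
  }
  return {callee: isStaticChildren ? "jsxs" : "jsx", isStaticChildren};
}
```

For example, `<div {...props} key="some-key" />` has `hasKeyAfterSpread: true` and plans a `createElement` call in both modes, while `<div key="some-key" {...props} />` does not.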
I believe the biggest technical challenge with implementing this in Sucrase is the requirement that the `key` expression be reordered to after the rest of the props and children, combined with the requirement that Sucrase preserve line numbers in output. For example, if we're not careful, debugger breakpoints in `onClick` may break here because the `key` expression now needs to appear after the `<button>`, shifting all `<button>` lines up in the output code:
return (
  <div key={
    // This is an explanation of why we're using this key.
    "hi"
  }>
    <button
      onClick={() => {
        console.log("Clicked!");
      }}
    />
  </div>
);
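To make the size of the shift concrete, the sketch below counts the newlines inside the key expression container of the snippet above. If that range is moved together with its newlines, everything between the old and new positions moves up by that many lines. The helper name and the string-search approach to locating the range are assumptions for illustration; a real implementation would use token positions.

```typescript
// Count the newlines inside code[start, end). Moving that range later in the
// output shifts the code in between up by this many lines.
function countNewlinesInRange(code: string, start: number, end: number): number {
  let count = 0;
  for (let i = start; i < end; i++) {
    if (code.charAt(i) === "\n") {
      count++;
    }
  }
  return count;
}

// The first few lines of the snippet above.
const exampleSource = [
  "return (",
  "  <div key={",
  "    // This is an explanation of why we're using this key.",
  '    "hi"',
  "  }>",
  "    <button",
].join("\n");

// The moved range runs from the opening brace of the key value through its
// closing brace.
const keyStart = exampleSource.indexOf("key={") + "key=".length;
const keyEnd = exampleSource.indexOf("}>") + 1;
const buttonLineShift = countNewlinesInRange(exampleSource, keyStart, keyEnd);
```

For this snippet `buttonLineShift` is 3, so a breakpoint set on the `console.log` line would land three lines off unless the source map accounts for it.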
There are many potential ways to address this problem. A few main options stand out:
- Ignore the issue and simply move the code, including newlines. This means that all line numbers between the two code positions will be shifted, but hopefully this would be a relatively minor impact on the dev experience because multiline keys are rare in practice.
  - I believe they are rare, but they aren't unreasonable. The above snippet gives an example situation that feels realistic to me.
  - The usability edge case could be fixed by rethinking Sucrase's approach to line mappings. Currently, Sucrase preserves line numbers and emits a simple source map, but it could switch to the more typical approach by other transpilers where the source maps are more detailed. My main worry is that this bookkeeping would significantly slow down Sucrase.
- Implement a way to consolidate an arbitrary JS expression to one line. Emit all newlines where the key expression was originally, and remove all newlines from the moved key expression.
  - This ends up being very challenging in the general case. Even with a reliable way to remove `//` comments and change template string newlines into explicit `\n`, an expression might have an arrow function body that relies on ASI and would break if the newlines were removed. It might be ok for the transform to be wrong in these most obscure cases, but it's certainly not ideal.
- Call a separate helper function that translates object-style props into a call to `jsx...` by extracting the `key` prop and passing it separately. This can be limited to just situations where the `key` prop has an expression that's more than one line.
  - This approach feels like it's not in the spirit of the new JSX transform since it's effectively just doing a runtime translation and misses out on some performance benefit of the new transform. That said, the new transform spec already has a `createElement` fallback that has this same downside. (An alternative option is to just call `createElement`, though my assumption is that the `createElement` fallback is meant to only be used for the key-after-prop-spread case.)
  - This would cause a user-visible difference from other JSX implementations because it changes the evaluation order of the `key` prop to be inline, whereas other JSX implementations move the key expression evaluation to after the children. This could be particularly surprising because a whitespace-only code change could cause a change in code behavior.
    - A way to avoid changing the evaluation order is to pass an arrow function for the key expression to delay evaluation until after the props and children are evaluated. This works for most cases, but gets much more difficult if `await` appears in the expression, which also came up in the optional chaining implementation. Async functions with JSX should be very rare, but the hope is to find something that is fully correct.
    - The new JSX transform was considered non-breaking despite evaluation order differences from the old JSX transform, so maybe that means that it would be ok for Sucrase to differ in behavior here since it's unlikely to matter in realistic code.
  - While this avoids some of the challenges of the other approaches, it's still pretty messy. Most likely, each variation of `jsx...` would need its own wrapper helper, and the parser would need a special case to detect a key expression and see if a newline character appears anywhere in that expression range.
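The helper-function option and its arrow-function variation can be sketched as follows. All names here (`callWithExtractedKey`, `callWithLazyKey`, `stubJsx`, `runEager`, `runLazy`) are hypothetical, and `stubJsx` stands in for a real `jsx` function so the example is self-contained. The two `run...` functions record the order in which the key and children expressions are evaluated.

```typescript
type Props = {[propName: string]: unknown};
type JSXFn = (type: string, props: Props, key?: unknown) => unknown;

// Stand-in for a real jsx function, just echoing its arguments.
const stubJsx: JSXFn = (type, props, key) => ({type, props, key});

// Helper variant 1: key stays inline in the props object, so it is evaluated
// in source order (before any later props and children).
function callWithExtractedKey(jsxFn: JSXFn, type: string, propsWithKey: Props): unknown {
  const {key, ...props} = propsWithKey;
  return jsxFn(type, props, key);
}

// Helper variant 2: key is wrapped in an arrow function, so it is evaluated
// only after the whole props object (including children) has been built.
function callWithLazyKey(
  jsxFn: JSXFn,
  type: string,
  propsWithLazyKey: Props & {key: () => unknown},
): unknown {
  const {key, ...props} = propsWithLazyKey;
  return jsxFn(type, props, key());
}

function runEager(): string[] {
  const order: string[] = [];
  const note = (label: string): string => {
    order.push(label);
    return label;
  };
  callWithExtractedKey(stubJsx, "div", {key: note("key"), children: note("children")});
  return order;
}

function runLazy(): string[] {
  const order: string[] = [];
  const note = (label: string): string => {
    order.push(label);
    return label;
  };
  callWithLazyKey(stubJsx, "div", {key: () => note("key"), children: note("children")});
  return order;
}
```

`runEager()` records the key before the children, while `runLazy()` records the children first, matching the order other implementations of the new transform produce. The lazy form is where `await` becomes a problem, since an `await` inside the arrow function would no longer belong to the enclosing async function.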
Given the complexity and uncertainties of the workarounds, I think the most reasonable initial approach is to simply move the expression including its newlines. The line mapping issue is something that can be addressed in the future, maybe as an additional special case when improving source maps more generally.
Long-term, I'm hoping that a future version of the JSX transform addresses the out-of-order evaluation issue, which should naturally make this problem go away. One example syntax is `<div:{getKey()} foo="bar" />`, or a requirement that `key` (or `@key`) be the first prop.
In terms of implementation, Sucrase currently never needs to move an expression, but it should be possible to bring back some old code that did it by taking a token processor snapshot, processing the code, slicing the result, restoring to snapshot, and then inserting the code later: https://github.com/alangpierce/sucrase/pull/313/files#diff-0701663bd49142a7472d3a58aa749e99a6315fb67ab5853ef14dfc89318d63afL95
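The snapshot-based move described above can be sketched with a toy processor. `MiniProcessor` and its methods are hypothetical stand-ins written for this example, not Sucrase's actual token processor, and each "token" is just a string of already-transformed code.

```typescript
interface ProcessorSnapshot {
  resultLength: number;
  tokenIndex: number;
}

// Toy processor: copies tokens into a result string one at a time.
class MiniProcessor {
  private resultCode = "";
  private tokenIndex = 0;
  private readonly tokens: string[];

  constructor(tokens: string[]) {
    this.tokens = tokens;
  }

  snapshot(): ProcessorSnapshot {
    return {resultLength: this.resultCode.length, tokenIndex: this.tokenIndex};
  }

  restoreToSnapshot(snapshot: ProcessorSnapshot): void {
    this.resultCode = this.resultCode.slice(0, snapshot.resultLength);
    this.tokenIndex = snapshot.tokenIndex;
  }

  codeSince(snapshot: ProcessorSnapshot): string {
    return this.resultCode.slice(snapshot.resultLength);
  }

  copyToken(): void {
    this.resultCode += this.tokens[this.tokenIndex] ?? "";
    this.tokenIndex++;
  }

  skipToken(): void {
    this.tokenIndex++;
  }

  appendCode(code: string): void {
    this.resultCode += code;
  }

  finish(): string {
    return this.resultCode;
  }
}

// Move the first token (standing in for the key expression) to after the
// second token (standing in for the props).
function moveKeyAfterProps(keyToken: string, propsToken: string): string {
  const processor = new MiniProcessor([keyToken, propsToken]);
  const snapshot = processor.snapshot();
  processor.copyToken(); // Process the key expression normally.
  const keyCode = processor.codeSince(snapshot); // Slice out the result.
  processor.restoreToSnapshot(snapshot); // Rewind output and position.
  processor.skipToken(); // Step past the key tokens without emitting them.
  processor.copyToken(); // Emit the props.
  processor.appendCode(`, ${keyCode}`); // Insert the saved key code later.
  return processor.finish();
}
```

Processing the key expression before discarding its output matters because the expression may itself contain code that needs transforming, such as nested JSX.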
The parser has some added responsibilities here:
- Detect if this is a "static" usage, i.e. one with at least two children.
  - This can be done by changing `jsxParseElementAt` to count how many elements are encountered, then going back and modifying the `jsxTagStart` token to include the fact that it was static.
  - Another subtle case here is that spread children (e.g. `<div>{...children}</div>`) are used in cases where the children are passed as an array but are intentionally declared static (e.g. a literal array with a fixed number of elements). This case should also be considered static, which matches the TypeScript behavior.
  - An additional requirement discovered during implementation: the transform step needs to know whether to treat `children` as an array, as a plain expression, or if `children` should be omitted entirely. In some situations, a JSX element may have multiple child text tokens that all resolve to empty, so it's useful for the parser to mark that ahead of time so that the transform step can know to omit `children` without having to process all child tokens.
- Detect elements that look like `<div {...props} key="hi" />` where an explicit `key` appears after a spread operator. This case needs to fall back to regular `createElement` for compatibility reasons.
  - This can be done by refactoring `jsxParseOpeningElement` to be able to track whether it has already seen a prop spread.
- Potentially mark `key` props in a special way so the transformer doesn't need to do a string comparison on each prop to detect if it has the name `key`.
  - `processPropKeyName` in the transformer already accesses the prop name as a string, so currently there wouldn't be much of a performance benefit here, so the current plan is for the parser to not do anything smart here.
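The children classification described above might look roughly like the sketch below. `ChildToken`, `ChildrenShape`, and both functions are hypothetical, and the empty-text check is a simplification of the JSX whitespace rules (whitespace-only text that contains a newline is treated as resolving to nothing).

```typescript
// Hypothetical, simplified view of the children the parser encountered.
type ChildToken =
  | {kind: "text"; value: string}
  | {kind: "expression"; isEmpty: boolean} // isEmpty covers {} and {/* comment */}
  | {kind: "element"}
  | {kind: "spread"};

type ChildrenShape = "omit" | "single" | "array";

// Simplified rule: whitespace-only text containing a newline produces no child.
function resolvesToEmpty(child: ChildToken): boolean {
  if (child.kind === "text") {
    return /^[ \t\r\n]*$/.test(child.value) && child.value.indexOf("\n") !== -1;
  }
  if (child.kind === "expression") {
    return child.isEmpty;
  }
  return false;
}

function classifyChildren(children: ChildToken[]): ChildrenShape {
  let realChildCount = 0;
  let sawSpread = false;
  for (const child of children) {
    if (child.kind === "spread") {
      sawSpread = true;
    }
    if (!resolvesToEmpty(child)) {
      realChildCount++;
    }
  }
  if (sawSpread) {
    // Spread children are treated as static, so children is always an array.
    return "array";
  }
  if (realChildCount === 0) {
    return "omit";
  }
  return realChildCount === 1 ? "single" : "array";
}
```

Computing this once in the parser lets the transform step decide up front whether to emit a `children` prop at all, rather than discovering partway through that every child was empty.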
To mark the special `jsxTagStart` cases, two options come to mind:
- Add some additional token types like `jsxTagStartStatic` and `jsxTagStartFallback` that behave just like `jsxTagStart`.
- Add a new `Token` field like `jsxRole` with an enum like `Normal`, `StaticChildren`, and `Fallback`. This could potentially be overloaded with `identifierRole` in the future to save space on the `Token` object.
Current leaning is the second option; having to check multiple token types whenever we traverse JSX seems more error prone. Both could potentially be optimized in the future.
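A sketch of the second option is below. The role names come from the list above; everything else (`JSXRoleSketch`, `AttrSketch`, `TagStartTokenSketch`, `computeJSXRole`, `markTagStart`) is hypothetical, and a const object stands in for the enum to keep the example self-contained.

```typescript
// Stand-in for an enum with the three roles named above.
const JSXRoleSketch = {Normal: 0, StaticChildren: 1, Fallback: 2} as const;
type JSXRoleSketch = (typeof JSXRoleSketch)[keyof typeof JSXRoleSketch];

// Hypothetical, simplified view of the attributes on an opening element.
type AttrSketch = {kind: "spread"} | {kind: "named"; attrName: string};

// Hypothetical token shape: one token type, with the role as a field.
interface TagStartTokenSketch {
  type: "jsxTagStart";
  jsxRole: JSXRoleSketch;
}

function computeJSXRole(attrs: AttrSketch[], explicitChildCount: number): JSXRoleSketch {
  let sawSpread = false;
  for (const attr of attrs) {
    if (attr.kind === "spread") {
      sawSpread = true;
    } else if (attr.attrName === "key" && sawSpread) {
      // Explicit key after a prop spread: must fall back to createElement.
      return JSXRoleSketch.Fallback;
    }
  }
  return explicitChildCount >= 2 ? JSXRoleSketch.StaticChildren : JSXRoleSketch.Normal;
}

// Once the whole element is parsed, go back and record the role on the token.
function markTagStart(
  token: TagStartTokenSketch,
  attrs: AttrSketch[],
  explicitChildCount: number,
): void {
  token.jsxRole = computeJSXRole(attrs, explicitChildCount);
}
```

With this shape, code that traverses JSX only ever checks for the single `jsxTagStart` type, and only the transformer needs to look at `jsxRole`.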
The old and new transforms have both significant similarities and significant differences, so there are a few approaches to organizing the code:
- Separate `Transformer` classes, with plain functions for code sharing.
- Separate `Transformer` classes with a common superclass for code sharing.
- A single `JSXTransformer` class that implements both transforms.
I've gone back and forth on the different approaches when trying out prototypes, but I think a unified implementation might end up being the most understandable. That's how Babel structures the code, for example.
The testing matrix gets quite complex here, so there will definitely need to be thorough tests. There will need to be tests for at least these dimensions:
- Automatic vs classic transform
- Development vs production
- ESM vs CJS targets
- Number of children
  - 0: don't include children prop at all
  - 1: call non-static version of function, don't wrap in array
  - 2+: call static version of function, wrap in array
- Usage of `key`
  - No `key`
  - Multiline `key`
  - `key` before spread: should call regular JSX function
  - `key` after spread: should fall back to `createElement`
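One way to get a feel for the size of this matrix is to enumerate it. The sketch below builds the full cross product of the dimensions listed above; `MatrixCase`, `buildTestMatrix`, and the string labels are hypothetical, and in practice not every combination will be equally interesting (for example, the key cases matter much less for the classic transform).

```typescript
interface MatrixCase {
  runtime: string;
  mode: string;
  moduleTarget: string;
  childCount: string;
  keyUsage: string;
}

function buildTestMatrix(): MatrixCase[] {
  const runtimes: string[] = ["automatic", "classic"];
  const modes: string[] = ["development", "production"];
  const moduleTargets: string[] = ["esm", "cjs"];
  const childCounts: string[] = ["0", "1", "2+"];
  const keyUsages: string[] = ["none", "multiline", "before-spread", "after-spread"];

  const cases: MatrixCase[] = [];
  for (const runtime of runtimes) {
    for (const mode of modes) {
      for (const moduleTarget of moduleTargets) {
        for (const childCount of childCounts) {
          for (const keyUsage of keyUsages) {
            cases.push({runtime, mode, moduleTarget, childCount, keyUsage});
          }
        }
      }
    }
  }
  return cases;
}
```

The full cross product is 2 × 2 × 2 × 3 × 4 = 96 cases, which suggests generating at least part of the test suite from a table rather than writing every case by hand.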