JSX Automatic Runtime Transform Technical Plan

This document details the plan for the new "automatic runtime" JSX transform (tracked at https://github.com/alangpierce/sucrase/issues/585 ) and works through some of the open questions and technical challenges mentioned in that issue. Comments and feedback are welcome on that issue.

API changes and terminology

What should the transform be called?

It looks like there are a few names:

The announcement post https://reactjs.org/blog/2020/09/22/introducing-the-new-jsx-transform.html simply refers to it as "the new JSX transform" (which isn't necessarily trying to be a technical name).
Babel, swc, esbuild, and Flow all use the terminology "automatic runtime", configured as runtime: automatic or something very similar, presumably because the import/require is automatically added. The old transform is referred to as the "classic runtime".
TypeScript docs refer to it as the "React 17 transform" and it's configured via jsx: react-jsx and jsx: react-jsxdev.

It seems like "automatic" or "automatic runtime" is probably the most understandable way to refer to the transform, despite TypeScript never using that term (as far as I can tell).

One detail worth keeping in mind is that this transform is designed as an intermediate step in a multi-year transition (see https://github.com/facebook/react/issues/20031#issuecomment-710346866 ). The long-term vision is that a third (hopefully simpler) JSX transform may arrive at some point in the future. That one will presumably also be "automatic", but I'm sure a reasonable name for it will emerge by then.

How should the transform be enabled?

There are a few approaches that come to mind:

Add a new transform name, like jsx-automatic, which is mutually exclusive with jsx.
- Within this option, there's the question of whether to distinguish dev vs prod modes in the transform name vs elsewhere.
- This option most directly conveys it as a different transform, though it's not obvious whether it should be framed that way; there are significant differences but also significant similarities.
- A downside to this approach is that it's more awkward to make this transform the default in a future breaking change. It would probably mean renaming jsx to jsx-classic and jsx-automatic to jsx, which seems more disruptive than other approaches.
Add a new top-level option like jsxRuntime that could be "automatic" or "classic". For now, it would default to "classic".
- A variation here is a boolean option like useAutomaticJSXRuntime, though that's a bit less future-proof, so the string-based approach feels nicer to me.
- An advantage is that this would be familiar to anyone configuring Babel, swc, or other tools.
- Another advantage is that this should make it easier to configure integration code that's already looking at the file extension to decide whether jsx should be enabled. The decision of whether to use JSX would be decoupled from the decision of how JSX should be transpiled.
- Another advantage is that this makes a future "preserve" option fairly natural, which I know is a use case that Sucrase doesn't handle right now. This makes the option quite similar to the TypeScript jsx option.
Consolidate all JSX-related options into a nested key, like {transforms: ["jsx"], jsxOptions: {runtime: "automatic", importSource: "some-other-library"}}. swc takes an approach like this.
- This would be cleaner in some sense, though my hope is that Sucrase has few enough options that categorizing them into groups isn't necessary.

My leaning now is the jsxRuntime option for the reasons listed.

How should the import be customized?

Just like the existing options jsxPragma and jsxFragmentPragma, the plan is to add a new option jsxImportSource. This seems like a straightforward decision and roughly matches the approach from other tools.

How should dev and prod be distinguished?

Sucrase already has a production flag for the classic JSX transform that decides whether to include or exclude debug information. The most straightforward approach is to reuse this same flag. To explore the API design space a bit, there are a few potential alternatives:

A separate transform name or separate runtime name for development vs production. TypeScript takes this approach.
A development flag (default false) rather than a production flag. Babel, swc, and esbuild take this approach.
An environment setting that takes string options.

For the near term, I think the existing production flag is the most practical approach, and it seems nice to decouple it from the decision of whether to use JSX or not.

The main concern with defaulting to the development transform is that React will crash when using the new dev transform against a prod build of React. This could lead to frustrating surprises when people test locally, deploy to production, and see that it's broken in production. I think this concern can be addressed in a few ways:

Make sure the docs emphasize the need to configure this correctly.
When changing the new JSX transform to be the default, also change production to default to true.
Automatically set dev vs prod defaults on an integration-by-integration basis. For example, sucrase/register is most reasonable to use for development, so it could default to the dev transform.
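
Pulling the API decisions in this section together, the option surface might look roughly like the sketch below. This is only an illustration under assumptions: the SketchOptions and ResolvedJSXSettings types and the resolveJSXSettings function are hypothetical names, and the "react" default for jsxImportSource is an assumption that this plan does not state.

```typescript
// Hypothetical sketch of the options discussed above; not Sucrase's actual types.
type JSXRuntime = "classic" | "automatic";

interface SketchOptions {
  transforms: ReadonlyArray<string>;
  jsxRuntime?: JSXRuntime; // proposed: defaults to "classic" for now
  jsxImportSource?: string; // proposed: customizes the auto-import source
  production?: boolean; // existing flag, reused to pick dev vs prod output
}

interface ResolvedJSXSettings {
  jsxEnabled: boolean;
  jsxRuntime: JSXRuntime;
  jsxImportSource: string;
  production: boolean;
}

function resolveJSXSettings(options: SketchOptions): ResolvedJSXSettings {
  return {
    // Whether to use JSX stays decoupled from how JSX is transpiled.
    jsxEnabled: options.transforms.indexOf("jsx") !== -1,
    jsxRuntime: options.jsxRuntime ?? "classic",
    // Assumption: "react" as the default import source.
    jsxImportSource: options.jsxImportSource ?? "react",
    production: options.production ?? false,
  };
}

const exampleSettings = resolveJSXSettings({
  transforms: ["jsx", "typescript"],
  jsxRuntime: "automatic",
  jsxImportSource: "some-other-library",
});
```

Here exampleSettings keeps production at false, which corresponds to the development transform being the default unless the caller opts in to production.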

Overview of changes in new transform

The new transform uses 6 new imported names (5 distinct identifiers, since Fragment is exported by both runtime modules) that need to be auto-imported as needed:

import {jsx, jsxs, Fragment} from "${jsxImportSource}/jsx-runtime";
import {jsxDEV, Fragment} from "${jsxImportSource}/jsx-dev-runtime";
import {createElement} from "${jsxImportSource}";

(This is informal ESM syntax based on jsxImportSource, but in practice the auto-import will need to be either ESM or CJS based on whether the imports transform is also enabled.)
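
As a rough illustration of how the import path and module style could be chosen, here is a minimal sketch. The function names importPathFor and renderAutoImport are hypothetical, and the exact output text (for example, whether imported names are aliased) is an assumption rather than a decision made in this plan.

```typescript
type ImportedName = "jsx" | "jsxs" | "jsxDEV" | "Fragment" | "createElement";

// Picks the module path for a given auto-imported name.
// The caller is expected to request jsx/jsxs only in production mode and
// jsxDEV only in development mode; this sketch does not validate that.
function importPathFor(name: ImportedName, importSource: string, production: boolean): string {
  if (name === "createElement") {
    return importSource;
  }
  return production ? `${importSource}/jsx-runtime` : `${importSource}/jsx-dev-runtime`;
}

// Renders the import as ESM or CJS, depending on whether the imports
// transform is also enabled.
function renderAutoImport(names: ReadonlyArray<ImportedName>, path: string, useCJS: boolean): string {
  const list = names.join(", ");
  return useCJS
    ? `const {${list}} = require("${path}");`
    : `import {${list}} from "${path}";`;
}

const prodPath = importPathFor("jsx", "react", true);
const esmLine = renderAutoImport(["jsx", "jsxs"], prodPath, false);
const cjsLine = renderAutoImport(["jsx", "jsxs"], prodPath, true);
```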

Fragment has the same behavior as before, and we need to import the right one based on dev vs prod mode.
jsx and jsxs are called in the prod transform for typical cases.
jsxDEV is called in the dev transform for typical cases.
createElement has the same behavior as the old React.createElement. It is a fallback function to avoid a difference in behavior when combining prop spread with an explicit key. Any expression like <div {...props} key="some-key" /> (with an explicit key after a prop spread) must compile to createElement, and all other expressions must compile to one of the jsx functions. The rationale is described at https://github.com/facebook/react/issues/20031#issuecomment-710346866 .
For all jsx... functions:
- The transform must detect if the element has "static" children, i.e. has at least two children explicitly specified. This determines whether to call jsx or jsxs and whether to pass true or false to jsxDEV.
- The transform must now pass JSX children as the last prop, using an array if and only if there are at least two children.
- The transform must move the key expression to its own argument after the props. Note that this changes execution order, but it sounds like the React team considered this difference minor enough to call it non-breaking.
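
The selection rules above can be sketched as a small decision function. This is an illustration only: ElementInfo, CallPlan, and planCall are hypothetical names, and the real transform would make this decision while walking tokens rather than from a prepared summary object.

```typescript
interface ElementInfo {
  numChildren: number; // number of explicitly specified children
  hasKeyAfterSpread: boolean; // e.g. <div {...props} key="some-key" />
}

type Callee = "jsx" | "jsxs" | "jsxDEV" | "createElement";

interface CallPlan {
  callee: Callee;
  // For jsxDEV, this is the true/false flag passed to the function.
  // For jsx vs jsxs, it is implied by the callee name.
  isStaticChildren: boolean;
}

function planCall(element: ElementInfo, production: boolean): CallPlan {
  const isStaticChildren = element.numChildren >= 2;
  if (element.hasKeyAfterSpread) {
    // Fallback case: behaves like the old React.createElement.
    return {callee: "createElement", isStaticChildren};
  }
  if (!production) {
    return {callee: "jsxDEV", isStaticChildren};
  }
  return {callee: isStaticChildren ? "jsxs" : "jsx", isStaticChildren};
}

// For example, in production mode,
//   <div key="a" id="x"><A /><B /></div>
// would compile to roughly:
//   jsxs("div", {id: "x", children: [jsx(A, {}), jsx(B, {})]}, "a")
const twoChildrenPlan = planCall({numChildren: 2, hasKeyAfterSpread: false}, true);
```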

Handling multi-line keys

I believe the biggest technical challenge with implementing this in Sucrase is the requirement that the key expression be reordered to after the rest of the props and children, combined with the requirement that Sucrase preserve line numbers in output. For example, if we're not careful, debugger breakpoints in onClick may break here because the key expression now needs to appear after the <button>, shifting all <button> lines up in the output code:

return (
  <div key={
    // This is an explanation of why we're using this key.
    "hi"
  }>
    <button
      onClick={() => {
        console.log("Clicked!");
      }}
    />
  </div>
);

There are many potential ways to address this problem. A few main options stand out:

Ignore the issue and simply move the code, including newlines. This means that all line numbers between the two code positions will be shifted, but hopefully this would be a relatively minor impact on the dev experience because multiline keys are rare in practice.
- I believe they are rare, but they aren't unreasonable. The above snippet gives an example situation that feels realistic to me.
- The usability edge case could be fixed by rethinking Sucrase's approach to line mappings. Currently, Sucrase preserves line numbers and emits a simple source map, but it could switch to the more typical approach used by other transpilers, where the source maps are more detailed. My main worry is that this bookkeeping would significantly slow down Sucrase.
Implement a way to consolidate an arbitrary JS expression to one line. Emit all newlines where the key expression was originally, and remove all newlines from the moved key expression.
- This ends up being very challenging in the general case. Even with a reliable way to remove // comments and change template string newlines into explicit \n, an expression might have an arrow function body that relies on ASI and would break if the newlines were removed. It might be ok for the transform to be wrong in the most obscure cases, but it's certainly not ideal.
Call a separate helper function that translates object-style props into a call to jsx... by extracting the key prop and passing it separately. This can be limited to just situations where the key prop has an expression that's more than one line.
- This approach feels like it's not in the spirit of the new JSX transform since it's effectively just doing a runtime translation and misses out on some performance benefit of the new transform. That said, the new transform spec already has a createElement fallback that has this same downside. (An alternative option is to just call createElement, though my assumption is that the createElement fallback is meant to only be used for the key-after-prop-spread case.)
- This would cause a user-visible difference from other JSX implementations because it changes the evaluation order of the key prop to be inline, whereas other JSX implementations move the key expression evaluation to after the children. This could be particularly surprising because a whitespace-only code change could cause a change in code behavior.
  - A way to avoid changing the evaluation order is to pass an arrow function for the key expression to delay evaluation until after the props and children are evaluated. This works for most cases, but gets much more difficult if await appears in the expression, which also came up in the optional chaining implementation. Async functions with JSX should be very rare, but the hope is to find something that is fully correct.
  - The new JSX transform was considered non-breaking despite evaluation order differences from the old JSX transform, so maybe that means that it would be ok for Sucrase to differ in behavior here since it's unlikely to matter in realistic code.
- While this avoids some of the challenges of the other approaches, it's still pretty messy. Most likely, each variation of jsx... would need its own wrapper helper, and the parser would need a special case to detect a key expression and see if a newline character appears anywhere in that expression range.
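
To make the helper-function option more concrete, here is a minimal sketch of what such a runtime helper could look like. The name jsxWithInlineKey and its signature are hypothetical, and this version makes no attempt to preserve the evaluation order of other JSX implementations: the key expression is evaluated inline as part of the props object literal.

```typescript
type Props = Record<string, unknown>;
type JsxFunction = (type: unknown, props: Props, key?: unknown) => unknown;

// Accepts object-style props that still contain `key`, then extracts the key
// and passes it as a separate argument to the given jsx-style function.
function jsxWithInlineKey(jsxFunction: JsxFunction, type: unknown, propsWithKey: Props): unknown {
  const {key, ...props} = propsWithKey;
  return jsxFunction(type, props, key);
}

// Stand-in for a real jsx function, used here only to show the call shape.
const describeCall: JsxFunction = (type, props, key) =>
  `type=${String(type)} key=${String(key)} props=${Object.keys(props).join(",")}`;

const helperResult = jsxWithInlineKey(describeCall, "div", {
  key:
    // A multiline key expression can stay in its original position.
    "hi",
  id: "x",
});
```

Because the key stays where it was written, no source lines need to shift, at the cost of the runtime object destructuring on every call.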

Given the complexity and uncertainties of the workarounds, I think the most reasonable initial approach is to simply move the expression including its newlines. The line mapping issue is something that can be addressed in the future, maybe as an additional special case when improving source maps more generally.

Long-term, I'm hoping that a future version of the JSX transform addresses the out-of-order evaluation issue, which should naturally make this problem go away. One example syntax is <div:{getKey()} foo="bar" />, or a requirement that key (or @key) be the first prop.

In terms of implementation, Sucrase currently never needs to move an expression, but it should be possible to bring back some old code that did it by taking a token processor snapshot, processing the code, slicing the result, restoring to snapshot, and then inserting the code later: https://github.com/alangpierce/sucrase/pull/313/files#diff-0701663bd49142a7472d3a58aa749e99a6315fb67ab5853ef14dfc89318d63afL95
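
The snapshot-and-restore idea can be illustrated with a toy model. Everything below is a hypothetical stand-in (ToyTokenProcessor and its methods are not Sucrase's actual TokenProcessor API); it only shows the sequence of snapshot, process, slice, restore, and insert later.

```typescript
interface Snapshot {
  resultCode: string;
  tokenIndex: number;
}

// Toy model: tokens are plain strings and "processing" just copies them.
class ToyTokenProcessor {
  private resultCode = "";
  private tokenIndex = 0;

  constructor(private readonly tokens: ReadonlyArray<string>) {}

  snapshot(): Snapshot {
    return {resultCode: this.resultCode, tokenIndex: this.tokenIndex};
  }

  restoreToSnapshot(snapshot: Snapshot): void {
    this.resultCode = snapshot.resultCode;
    this.tokenIndex = snapshot.tokenIndex;
  }

  copyToken(): void {
    this.resultCode += this.tokens[this.tokenIndex];
    this.tokenIndex++;
  }

  removeToken(): void {
    this.tokenIndex++;
  }

  appendCode(code: string): void {
    this.resultCode += code;
  }

  codeSince(snapshot: Snapshot): string {
    return this.resultCode.slice(snapshot.resultCode.length);
  }

  finish(): string {
    return this.resultCode;
  }
}

// Processes `count` tokens, captures their output, and removes that output
// from the result so it can be inserted at a later position.
function extractForLater(processor: ToyTokenProcessor, count: number): string {
  const snapshot = processor.snapshot();
  for (let i = 0; i < count; i++) {
    processor.copyToken(); // stand-in for real expression processing
  }
  const movedCode = processor.codeSince(snapshot);
  processor.restoreToSnapshot(snapshot);
  for (let i = 0; i < count; i++) {
    processor.removeToken(); // skip the tokens without emitting them
  }
  return movedCode;
}

const toyProcessor = new ToyTokenProcessor(["a;", "KEY;", "b;", "c;"]);
toyProcessor.copyToken();
const movedKeyCode = extractForLater(toyProcessor, 1);
toyProcessor.copyToken();
toyProcessor.copyToken();
toyProcessor.appendCode(movedKeyCode);
const reorderedCode = toyProcessor.finish();
```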

Other technical details and decisions

Parser changes

The parser has some added responsibilities here:

Detect if this is a "static" usage, i.e. one with at least two children.
- This can be done by changing jsxParseElementAt to count how many elements are encountered, then going back and modifying the jsxTagStart token to include the fact that it was static.
- Another subtle case here is that spread children (e.g. <div>{...children}</div>) are used in cases where the children are passed as an array but are intentionally declared static (e.g. a literal array with a fixed number of elements). This case should also be considered static, which matches the TypeScript behavior.
- An additional requirement discovered during implementation: the transform step needs to know whether to treat children as an array, as a plain expression, or if children should be omitted entirely. In some situations, a JSX element may have multiple child text tokens that all resolve to empty, so it's useful for the parser to mark that ahead of time so that the transform step can know to omit children without having to process all child tokens.
Detect elements that look like <div {...props} key="hi" /> where an explicit key appears after a spread operator. This case needs to fall back to regular createElement for compatibility reasons.
- This can be done by refactoring jsxParseOpeningElement to be able to track whether it has already seen a prop spread.
Potentially mark key props in a special way so the transformer doesn't need to do a string comparison on each prop to detect if it has the name key.
- processPropKeyName in the transformer already accesses the prop name as a string, so there wouldn't be much of a performance benefit here. The current plan is for the parser to not do anything smart in this case.

To mark the special jsxTagStart cases, two options come to mind:

Add some additional token types like jsxTagStartStatic and jsxTagStartFallback that behave just like jsxTagStart
Add a new Token field like jsxRole with an enum like Normal, StaticChildren, and Fallback. This could potentially be overloaded with identifierRole in the future to save space on the Token object.

Current leaning is the second option; having to check multiple token types whenever we traverse JSX seems more error-prone. Both could potentially be optimized in the future.
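
As a sketch of the second option, the parser-side bookkeeping might look something like the following. The JSXRole enum values come from the description above, but the Attribute type and the hasKeyAfterSpread and classifyElement functions are hypothetical simplifications; the real parser works on tokens rather than a prepared attribute list.

```typescript
enum JSXRole {
  Normal,
  StaticChildren,
  Fallback,
}

type Attribute = {kind: "spread"} | {kind: "named"; name: string};

// Tracks whether a prop spread has been seen, and reports whether an
// explicit key appears after it, as in <div {...props} key="hi" />.
function hasKeyAfterSpread(attributes: ReadonlyArray<Attribute>): boolean {
  let sawSpread = false;
  for (const attribute of attributes) {
    if (attribute.kind === "spread") {
      sawSpread = true;
    } else if (sawSpread && attribute.name === "key") {
      return true;
    }
  }
  return false;
}

// Decides the role to record on the jsxTagStart token once the element has
// been fully parsed. Fallback takes precedence in this sketch because the
// createElement call does not distinguish static children.
function classifyElement(
  attributes: ReadonlyArray<Attribute>,
  numChildren: number,
  hasSpreadChild: boolean,
): JSXRole {
  if (hasKeyAfterSpread(attributes)) {
    return JSXRole.Fallback;
  }
  if (numChildren >= 2 || hasSpreadChild) {
    return JSXRole.StaticChildren;
  }
  return JSXRole.Normal;
}

const fallbackRole = classifyElement(
  [{kind: "spread"}, {kind: "named", name: "key"}],
  0,
  false,
);
```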

How to structure the transform code

The old and new transforms have both significant similarities and significant differences, so there are a few approaches to organizing the code:

Separate Transformer classes, with plain functions for code sharing.
Separate Transformer classes with a common superclass for code sharing.
A single JSXTransformer class that implements both transforms.

I've gone back and forth on the different approaches when trying out prototypes, but I think a unified implementation might end up being the most understandable. That's how Babel structures the code, for example.

Cases to test

The testing matrix gets quite complex here, so there will definitely need to be thorough tests. There will need to be tests for at least these dimensions:

Automatic vs classic transform
Development vs production
ESM vs CJS targets
Number of children
- 0: don't include children prop at all
- 1: call non-static version of function, don't wrap in array
- 2+: call static version of function, wrap in array
Usage of key
- No key
- Multiline key
- key before spread: should call regular JSX function
- key after spread: should fall back to createElement
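
To get a feel for the size of this matrix, the dimensions above can be enumerated as a cartesian product. This is a rough sketch with hypothetical names (testDimensions, enumerateCases); in practice, not every combination is meaningful (for example, key handling is unchanged in the classic transform), so the real test suite would likely cover a subset.

```typescript
const testDimensions = {
  runtime: ["classic", "automatic"],
  mode: ["development", "production"],
  moduleTarget: ["esm", "cjs"],
  numChildren: ["0", "1", "2+"],
  keyUsage: ["none", "multiline", "before-spread", "after-spread"],
};

// Builds every combination of one value per dimension.
function enumerateCases(
  dimensions: Record<string, ReadonlyArray<string>>,
): Array<Record<string, string>> {
  let cases: Array<Record<string, string>> = [{}];
  for (const name of Object.keys(dimensions)) {
    const nextCases: Array<Record<string, string>> = [];
    for (const partialCase of cases) {
      for (const value of dimensions[name]) {
        nextCases.push({...partialCase, [name]: value});
      }
    }
    cases = nextCases;
  }
  return cases;
}

const allCases = enumerateCases(testDimensions);
// 2 * 2 * 2 * 3 * 4 = 96 combinations in total.
```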
