Define which points in JS should have a source map mapping #38

Open
littledan opened this issue Apr 25, 2023 · 10 comments

Comments

@littledan
Member

Different tools make different decisions here, which sometimes leads to lossiness when multiple levels of a build pipeline are applied. If we make a definition here, it would lend itself to better testing.

@robpalme

Here's the explainer for this issue.

Background

The most common API for transitively combining a chain of two sourcemaps to produce a single sourcemap is applySourceMap().

The Problem

Reliably and unambiguously mapping a coordinate in generated code all the way back to its origin depends on that point being present in each sourcemap in the chain. applySourceMap produces a map that loses fidelity if any points are missing.

Imagine a low-resolution identity sourcemap A that only has mappings for the start and end of the range, and a higher-resolution sourcemap B that includes an intermediate point.

?    ?
 +---------+
A|1        |
 +----+----+
B|1   |2   |
 +----+----+
      ^
      |

Currently applySourceMap will map the intermediate point below B to the first question mark.
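
The loss described above can be reproduced in a simplified one-dimensional model. This is only a sketch, not the real applySourceMap implementation: the `Mapping` shape and the `lookup` and `compose` helpers are hypothetical, positions are plain column offsets on one line, and each map is assumed to be sorted by generated column. The lookup returns the nearest mapping at or before the requested column, with no interpolation.

```typescript
// Simplified model: positions are column offsets on a single line.
interface Mapping {
  generated: number; // column in the output of this transform
  original: number; // column in the input of this transform
}

// Greatest-lower-bound lookup: the last mapping at or before `column`.
// Assumes `map` is sorted by `generated`.
function lookup(map: Mapping[], column: number): Mapping | undefined {
  let found: Mapping | undefined;
  for (const m of map) {
    if (m.generated <= column) found = m;
    else break;
  }
  return found;
}

// Compose `outer` (final -> intermediate) with `inner` (intermediate -> source).
// Each outer point is redirected to whatever inner mapping the lookup finds.
function compose(outer: Mapping[], inner: Mapping[]): Mapping[] {
  const result: Mapping[] = [];
  for (const m of outer) {
    const hit = lookup(inner, m.original);
    if (hit !== undefined) {
      result.push({ generated: m.generated, original: hit.original });
    }
  }
  return result;
}

// A: low-resolution identity map with only the start and end of the range.
const mapA: Mapping[] = [
  { generated: 0, original: 0 },
  { generated: 10, original: 10 },
];

// B: higher-resolution map with an intermediate point at column 5.
const mapB: Mapping[] = [
  { generated: 0, original: 0 },
  { generated: 5, original: 5 },
  { generated: 10, original: 10 },
];

// B's intermediate point has no counterpart in A, so it falls back to A's start.
const composed = compose(mapB, mapA);
```

In this model the point at generated column 5 ends up with original column 0 instead of 5, which corresponds to the intermediate point collapsing onto the first question mark in the diagram.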

Credit to @mariusGundersen for explaining this

Workarounds

For our internal toolchain we mitigate this by ensuring each source transform produces an overly high-resolution sourcemap (close to per-token mappings) to maximize the number of mapping points that can be accurately followed all the way through. The final result is then validated for accuracy.
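
One way to picture the validation step is a check that every point the later map refers to also exists in the earlier map. The sketch below is an illustration only and does not describe the internal toolchain mentioned above; the `Point` shape and `unmatchedPoints` helper are hypothetical, and positions are again reduced to single-line column offsets.

```typescript
// Simplified model: positions are column offsets on a single line.
interface Point {
  generated: number; // column in the output of this transform
  original: number; // column in the input of this transform
}

// Returns the intermediate columns that `outer` points at but for which
// `inner` has no exact mapping. A non-empty result means composition
// would have to fall back to a neighbouring mapping and lose accuracy.
function unmatchedPoints(outer: Point[], inner: Point[]): number[] {
  const known = new Set(inner.map((p) => p.generated));
  return outer.map((p) => p.original).filter((col) => !known.has(col));
}

const lowRes: Point[] = [
  { generated: 0, original: 0 },
  { generated: 10, original: 10 },
];
const highRes: Point[] = [
  { generated: 0, original: 0 },
  { generated: 5, original: 5 },
  { generated: 10, original: 10 },
];

// Column 5 is referenced by the later map but absent from the earlier one.
const gaps = unmatchedPoints(highRes, lowRes);
```

Producing close to per-token mappings in every transform is one way to keep this list empty.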

Potential Solution

If there were a specification of the important boundaries in the JS and TS grammars for which mapping points ought to be generated, per-transform sourcemap producers could comply with this to guarantee the composability of those sourcemaps whilst preserving accuracy and minimizing redundant mappings.

Additionally, we could consider specifying a new canonical sourcemap composition algorithm if we judge applySourceMap to be non-optimal.

@jridgewell
Member

Copying a comment I left on Babel last year:

The easiest sourcemaps to generate just mark the beginnings of identifiers, and completely ignore any syntax. The debugging experience will be better with } and now the ( marked. But really, nothing else is strictly necessary. So there's no guarantee that the whitespace/syntax/X directly following an identifier name will be marked.

@sjrd

sjrd commented Apr 26, 2023

To provide an opposite experience, Scala.js emits precise source maps. Every AST node from the initial parsing maintains a position. Positions are maintained through transformations, and are eventually attached to every node of the JS AST. Every node gives its position to the range of JS characters that it produces. So an addition like a + b maintains three positions: a, + and b.
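
As a rough illustration of this approach (not Scala.js's actual implementation), the sketch below prints a tiny expression tree and records one mapping per node as it emits code. The `Expr`, `SourcePos`, and `emit` names are hypothetical, and the original positions assume a source line such as `val x = a + b` with zero-based columns.

```typescript
interface SourcePos {
  line: number;
  column: number;
}

// Every node carries the position it had in the original source.
type Expr =
  | { kind: "ident"; name: string; pos: SourcePos }
  | { kind: "binary"; op: string; opPos: SourcePos; left: Expr; right: Expr };

interface EmittedMapping {
  generatedColumn: number;
  original: SourcePos;
}

interface Emitted {
  code: string;
  mappings: EmittedMapping[];
}

// Prints `expr` into `out.code`, recording a mapping at the start of each
// identifier and each operator.
function emit(expr: Expr, out: Emitted): void {
  if (expr.kind === "ident") {
    out.mappings.push({ generatedColumn: out.code.length, original: expr.pos });
    out.code += expr.name;
  } else {
    emit(expr.left, out);
    out.code += " ";
    out.mappings.push({ generatedColumn: out.code.length, original: expr.opPos });
    out.code += expr.op;
    out.code += " ";
    emit(expr.right, out);
  }
}

const sum: Expr = {
  kind: "binary",
  op: "+",
  opPos: { line: 1, column: 10 },
  left: { kind: "ident", name: "a", pos: { line: 1, column: 8 } },
  right: { kind: "ident", name: "b", pos: { line: 1, column: 12 } },
};

const emitted: Emitted = { code: "", mappings: [] };
emit(sum, emitted);
```

Here `emitted.code` is `a + b` and three mappings are recorded, one each for `a`, `+` and `b`. Because the printer already visits every node, skipping mappings would need extra logic, which fits the remark below about less accurate maps requiring more work.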

Ironically, I think emitting less accurate source maps would require more work for us.

@mitsuhiko
Contributor

I would already love to just have a hint in the source map itself about what level of mapping detail to expect. That would at least let tools better tell the user what is going on.

@mariusGundersen

What are the downsides to having positions for each token? It increases the size of the sourcemap and maybe the processing time? Not all tokens need to be individually represented in the sourcemap, I guess.

@mariusGundersen

BTW, this is not just a problem in JS; it's also an issue when compiling Less into CSS. Because of how Less and PostCSS produce sourcemaps, nested declarations end up being mapped to the outer declaration. PostCSS treats the entire rule as one position, while Less maps each part of the rule to different positions. For a nested rule, the first token maps to a different line than the second token, but PostCSS uses the location of the first token. That means that most nested declarations point to line 1, column 1 of a Less file.

@robpalme

> What are the downsides to having positions for each token?

I should clarify. Whilst per-token positions might be inefficient, the real goal is to simply agree on a definition of the required mapping points to ensure end-to-end accuracy. If we agree that per-token is the way to go, that would achieve the goal.

@jaro-sevcik
Contributor

Webpack also offers the possibility of "cheap" source maps (via the devtool config option). Those only include line mappings, but not token mappings. As far as I know, this is the default in create-react-app. In Chrome Devtools, we had bugs related to this config (e.g., crbug.com/1422883).

As a result, we should consider specifying line-by-line source maps, and describing how tools should detect and handle those.
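
As one possible starting point for detection, the heuristic below inspects already-decoded mappings and reports whether a map looks line-only. This is a sketch under stated assumptions, not behavior taken from any spec or tool: it assumes mappings are decoded into one array of `[generatedColumn, sourceIndex, originalLine, originalColumn]` segments per generated line, and that a line-only map has at most one segment per line, with both columns at 0. The `looksLineOnly` name is hypothetical.

```typescript
// One decoded segment: [generatedColumn, sourceIndex, originalLine, originalColumn].
type Segment = [number, number, number, number];
// One array of segments per generated line.
type DecodedMappings = Segment[][];

// Heuristic: a map "looks line-only" if every mapped generated line has
// exactly one segment and that segment sits at column 0 on both sides.
function looksLineOnly(mappings: DecodedMappings): boolean {
  let mappedLines = 0;
  for (const line of mappings) {
    if (line.length === 0) continue; // unmapped generated line
    if (line.length > 1) return false; // more than one point on this line
    for (const [generatedColumn, , , originalColumn] of line) {
      if (generatedColumn !== 0 || originalColumn !== 0) return false;
    }
    mappedLines++;
  }
  // A map with no mappings at all is not classified as line-only.
  return mappedLines > 0;
}

const cheapStyle: DecodedMappings = [[[0, 0, 0, 0]], [[0, 0, 1, 0]], []];
const tokenStyle: DecodedMappings = [[[0, 0, 0, 0], [4, 0, 0, 6]]];

const cheapDetected = looksLineOnly(cheapStyle);
const tokenDetected = looksLineOnly(tokenStyle);
```

A heuristic like this can produce false positives, for example for a detailed map of a file whose lines each contain a single token at column 0, which is part of why an explicit marker or specified behavior would be more reliable.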

@jkup
Collaborator

jkup commented Oct 11, 2023

I know this is a correctness issue in the sense that it's unspecified in the spec, but I'm curious if we should re-categorize this as a feature as it will probably take a good amount of work to implement?

@jkup
Collaborator

jkup commented Jan 10, 2024

Some options mentioned in an earlier call:

  1. Every line
  2. Every token
  3. Every breakable position
