Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module: Add SourceMap.findOrigin #47790

Merged
merged 1 commit into from
Jun 23, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
68 changes: 56 additions & 12 deletions doc/api/module.md
Original file line number Diff line number Diff line change
Expand Up @@ -198,23 +198,67 @@ Creates a new `sourceMap` instance.

Getter for the payload used to construct the [`SourceMap`][] instance.

#### `sourceMap.findEntry(lineNumber, columnNumber)`
#### `sourceMap.findEntry(lineOffset, columnOffset)`

* `lineNumber` {number}
* `columnNumber` {number}
* `lineOffset` {number} The zero-indexed line number offset in
the generated source
* `columnOffset` {number} The zero-indexed column number offset
in the generated source
* Returns: {Object}

Given a line number and column number in the generated source file, returns
an object representing the position in the original file. The object returned
consists of the following keys:

* generatedLine: {number}
* generatedColumn: {number}
* originalSource: {string}
* originalLine: {number}
* originalColumn: {number}
Given a line offset and column offset in the generated source
file, returns an object representing the SourceMap range in the
original file if found, or an empty object if not.

The object returned contains the following keys:

* generatedLine: {number} The line offset of the start of the
range in the generated source
* generatedColumn: {number} The column offset of start of the
range in the generated source
* originalSource: {string} The file name of the original source,
as reported in the SourceMap
* originalLine: {number} The line offset of the start of the
range in the original source
* originalColumn: {number} The column offset of start of the
range in the original source
Comment on lines +215 to +224
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The column limit is 130 now, I think. You could use more space to make this more readable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes the most sense to follow the conventions in the file when adding new content. It's a bit weird to have some of the docs wrapped at 130, and others at 80, no? And even worse to have a commit that touches the whole file when only a single section was added/edited.

Better to reformat markdown as a subsequent PR, or better yet, reformat all the markdown in the project in one atomic commit that does nothing else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it’s more readable when wrapped at 80 chars, at least it is on the GitHub web UI because they wrap suggestions at 80 chars 🤷‍♂️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearly, what we need is more input on this important matter. 😅

* name: {string}

The returned value represents the raw range as it appears in the
SourceMap, based on zero-indexed offsets, _not_ 1-indexed line and
column numbers as they appear in Error messages and CallSite
objects.

To get the corresponding 1-indexed line and column numbers from a
lineNumber and columnNumber as they are reported by Error stacks
and CallSite objects, use `sourceMap.findOrigin(lineNumber,
columnNumber)`

#### `sourceMap.findOrigin(lineNumber, columnNumber)`

* `lineNumber` {number} The 1-indexed line number of the call
site in the generated source
* `columnOffset` {number} The 1-indexed column number
of the call site in the generated source
* Returns: {Object}

Given a 1-indexed lineNumber and columnNumber from a call site in
the generated source, find the corresponding call site location
in the original source.

If the lineNumber and columnNumber provided are not found in any
source map, then an empty object is returned. Otherwise, the
returned object contains the following keys:

* name: {string | undefined} The name of the range in the
source map, if one was provided
* fileName: {string} The file name of the original source, as
reported in the SourceMap
* lineNumber: {number} The 1-indexed lineNumber of the
corresponding call site in the original source
* columnNumber: {number} The 1-indexed columnNumber of the
corresponding call site in the original source

[CommonJS]: modules.md
[ES Modules]: esm.md
[Source map v3 format]: https://sourcemaps.info/spec.html#h.mofvlxcwqzej
Expand Down
42 changes: 34 additions & 8 deletions lib/internal/source_map/source_map.js
Original file line number Diff line number Diff line change
Expand Up @@ -169,28 +169,28 @@ class SourceMap {
};

/**
* @param {number} lineNumber in compiled resource
* @param {number} columnNumber in compiled resource
* @return {?Array}
* @param {number} lineOffset 0-indexed line offset in compiled resource
* @param {number} columnOffset 0-indexed column offset in compiled resource
* @return {object} representing start of range if found, or empty object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The object type doesn't say much so I think it's better to declare an object with the properties that will/might be returned.

For example:

/**
 * @returns {{
 *   foo: string;
 *   bar?: number;
 * }}
 */

*/
findEntry(lineNumber, columnNumber) {
findEntry(lineOffset, columnOffset) {
let first = 0;
let count = this.#mappings.length;
while (count > 1) {
const step = count >> 1;
const middle = first + step;
const mapping = this.#mappings[middle];
if (lineNumber < mapping[0] ||
(lineNumber === mapping[0] && columnNumber < mapping[1])) {
if (lineOffset < mapping[0] ||
(lineOffset === mapping[0] && columnOffset < mapping[1])) {
count = step;
} else {
first = middle;
count -= step;
}
}
const entry = this.#mappings[first];
if (!first && entry && (lineNumber < entry[0] ||
(lineNumber === entry[0] && columnNumber < entry[1]))) {
if (!first && entry && (lineOffset < entry[0] ||
(lineOffset === entry[0] && columnOffset < entry[1]))) {
return {};
} else if (!entry) {
return {};
Expand All @@ -205,6 +205,32 @@ class SourceMap {
};
}

/**
* @param {number} lineNumber 1-indexed line number in compiled resource call site
* @param {number} columnNumber 1-indexed column number in compiled resource call site
* @return {object} representing origin call site if found, or empty object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

*/
findOrigin(lineNumber, columnNumber) {
const range = this.findEntry(lineNumber - 1, columnNumber - 1);
if (
range.originalSource === undefined ||
range.originalLine === undefined ||
range.originalColumn === undefined ||
range.generatedLine === undefined ||
range.generatedColumn === undefined
) {
return {};
}
const lineOffset = lineNumber - range.generatedLine;
const columnOffset = columnNumber - range.generatedColumn;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this PR out locally and noticed that if you create a source map from an existing coverage report some inputs return negative column numbers. Is this function expected to work with the source map data generated via NODE_V8_COVERAGE?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's weird. No, the column number should be at minimum 1. Can you share the steps you used to get to a negative column number? Maybe you're passing it offsets instead of column numbers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But yes, I'm using this now in tap on coverage map data produced by V8, and it seems to work fine, so if you found a weird edge case, I'm very interested.

Copy link
Contributor

@cjihrig cjihrig May 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isaacs - I'll have to see if I still have the code on a branch, but the gist of it was:

Given a line in the generated file, I was trying to find all of the corresponding lines in the original source file. I created a SourceMap from the V8 coverage data and called map.findOrigin(row, 1) and map.findOrigin(row, Infinity). Note - I'm not sure if Infinity should be supported here, but I did try valid finite numbers. Some of the returned values had negative column offsets. Using zero-based rows and columns, this did seem to work correctly with map.findEntry(), including passing Infinity.

I think this is related to the fact that V8 ranges can start with whitespace on a non-zero column (for example around the { in if (foo) {).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would assume that findOrigin(row, Infinity) would produce a very strange value indeed, but I'd expect a column of Infinity rather than a negative value.

What it's doing is calculating the row and column deltas from the start of the generated line/column offsets to the 1-indexed line/column numbers you give it, then applying that same delta to the source line/column values from the map entry. So if you told it your origin is line: 100, column: Infinity then findEntry should give you a generatedLine of some number less than 100 (whatever line that range starts on), and a generatedColumn of the 0-index start of that range. The line offset then would be some number less than 100, and the column offset would be Infinity - (some finite n), so Infinity. The source line and column would be some arbitrary number corresponding to the start of that range, and then it would add the line delta (some number less than 100) to the entry's sourceLine, and the column delta (Infinity) to the entry sourceColumn.

I'm not sure just looking for line, 0 and line, Infinity would give you all ranges that affect that line, though, since more than 2 ranges can exist on a given line, and they might be arbitrarily intermixed.

As I describe this, I'm realizing that this strategy might only work because TypeScript doesn't change token lengths, though.

Like, let's say you had a source like

const someLongName = otherLongName.x()

And that got minified to

const _l=_o.x()

And let's say the error being thrown is that otherLongName/_o is undefined, so _o.x() throws an error.

Lines are the same, so no issue there.

But the range will start at {generatedColumn:0,sourceColumn:0}, and the error origin callsite will be column 13:

can't call method x() of undefined blah blah blah
const _l=_o.x()
            ^

If we add 13 to the sourceColumn, that isn't going to point at the right spot, and you'll get an output like:

const someLongName = otherLongName.x()
            ^

It'll work if the minifier is smart enough to make a new range where the two start matching, then it's fine, but I think the heuristic might have to be a little bit smarter, take into account the end of the range as well, and idk, say "It's probably somewhere in the general vicinity of these characters, good luck".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so the problem is with findEntry. There's no entry in the sourcemap that covers offset 10, 0. So, findEntry returns the entry at 9,30, since it uses the highest start of the range that is less than the provided values.

The delta from the range 9,30 to the call site location 11,1 is +2,-29. But really, that's invalid, because 10, 0 isn't covered by the range that starts at offset 9,30. There's no way to ever have a call site that references line 11 column 1, and even if it did, that doesn't map to anything in the source file. It's an invalid input, so you get an invalid result.

Similarly, the range offset 10, Infinity (incorrectly) returns the range at 10, 1 (because Infinity is greater than the last range in line 9, and 9 is less than the first offset in line 10, so you get that one). This maps to the source offset of 9,1. The delta from 10, 1 to 11, Infinity is +1, +Infinity, so you get a resulting origin call site of 10, Infinity.

I think that findEntry should return an empty object (like it does when it can't find a range) if the provided offsets are outside the end of the offset it finds. The code seems to assume that there's no un-mapped ranges in the generated file, and that's not the case.

Also, just to un-XY this problem, if your goal is "find all the lines in the origin source that map to a given line in the generated source", this strategy of checking column 1 and column Infinity is not ever going to work reliably. Any line in the generated source may be composed of an arbitrary number of entries from an arbitrary set of locations within the origin source, so if you only test the first and last in the line, you'll miss everything in the middle, and there's no guarantee that they line up. Consider, for example, a munger that shuffles functions around, or a minifier that puts every line in the origin into a single line in the output.

What you need to do is get the set of offset ranges from the sourcemap entries that have a line offset of line - 1, and then add 1 to each of the origin lines of those entries, and that'd give you the result. However, since sourceMap.#mappings is private, and only exposed via the very limited findEntry method, that's not possible without parsing the source map in userland, unless we also add a way to get at the raw entries themselves.

Copy link
Contributor Author

@isaacs isaacs May 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ie, suggesting this change to findEntry, to have it return an empty range if the supplied offsets are not within the entry it finds (haven't tested, this might break other stuff, idk)

diff --git a/lib/internal/source_map/source_map.js b/lib/internal/source_map/source_map.js
index 3112a026b64..08ad65b919a 100644
--- a/lib/internal/source_map/source_map.js
+++ b/lib/internal/source_map/source_map.js
@@ -189,10 +189,7 @@ class SourceMap {
       }
     }
     const entry = this.#mappings[first];
-    if (!first && entry && (lineOffset < entry[0] ||
-        (lineOffset === entry[0] && columnOffset < entry[1]))) {
-      return {};
-    } else if (!entry) {
+    if (!entry || lineOffset !== entry[0] || columnOffset < entry[1]) {
       return {};
     }
     return {

With that, it's much more clear that the approach of testing column 1 and column Infinity is invalid, because you walked off the source map.

Copy link
Contributor Author

@isaacs isaacs May 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, that's only half of the fix, because we're not tracking the end of the range in this.#mappings, only the start, so the test program you shared will find 2 obviously missing results, and 2 that are confusingly infinite.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, just to un-XY this problem, if your goal is "find all the lines in the origin source that map to a given line in the generated source", this strategy of checking column 1 and column Infinity is not ever going to work reliably.

So this was just me doing some random testing, not an actual goal. I agree that it won't work for anything useful - a generated file can be a single line, which would then map to every line in the original source, making it pretty useless for something like code coverage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can definitely see some potential for something like "from l1, c1 to l2, c2, find all the ranges in the origin that are affected", for which "find all the origin lines that went into line X in generated source" is sort of a simplified case. But we can't do that without parsing out the ends of the ranges from the sourcemap payload.

return {
name: range.name,
fileName: range.originalSource,
lineNumber: range.originalLine + lineOffset,
columnNumber: range.originalColumn + columnOffset,
};
}

/**
* @override
*/
Expand Down
27 changes: 25 additions & 2 deletions test/parallel/test-source-map-api.js
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,14 @@ const { readFileSync } = require('fs');
assert.strictEqual(originalLine, 2);
assert.strictEqual(originalColumn, 4);
assert(originalSource.endsWith('disk.js'));
const {
fileName,
lineNumber,
columnNumber,
} = sourceMap.findOrigin(1, 30);
assert.strictEqual(fileName, originalSource);
assert.strictEqual(lineNumber, 3);
assert.strictEqual(columnNumber, 6);
}

// findSourceMap() can be used in Error.prepareStackTrace() to lookup
Expand Down Expand Up @@ -89,6 +97,18 @@ const { readFileSync } = require('fs');
assert.strictEqual(originalLine, 17);
assert.strictEqual(originalColumn, 10);
assert(originalSource.endsWith('typescript-throw.ts'));

const {
fileName,
lineNumber,
columnNumber,
} = sourceMap.findOrigin(
callSite.getLineNumber(),
callSite.getColumnNumber()
);
assert.strictEqual(fileName, originalSource);
assert.strictEqual(lineNumber, 18);
assert.strictEqual(columnNumber, 11);
}

// SourceMap can be instantiated with Source Map V3 object as payload.
Expand All @@ -112,8 +132,8 @@ const { readFileSync } = require('fs');
assert.notStrictEqual(payload.sources, sourceMap.payload.sources);
}

// findEntry() must return empty object instead error when
// receive a malformed mappings.
// findEntry() and findOrigin() must return empty object instead of
// error when receiving a malformed mappings.
{
const payload = JSON.parse(readFileSync(
require.resolve('../fixtures/source-map/disk.map'), 'utf8'
Expand All @@ -124,6 +144,9 @@ const { readFileSync } = require('fs');
const result = sourceMap.findEntry(0, 5);
assert.strictEqual(typeof result, 'object');
assert.strictEqual(Object.keys(result).length, 0);
const origin = sourceMap.findOrigin(0, 5);
assert.strictEqual(typeof origin, 'object');
assert.strictEqual(Object.keys(origin).length, 0);
}

// SourceMap can be instantiated with Index Source Map V3 object as payload.
Expand Down