New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(resolution): Fix resolution of querystringed imports in ESM files #336
Changes from 12 commits
aebf66c
fb93620
e0bfe0d
77c83bd
a8c294d
142658c
8c3ec63
54431db
cc660e2
bd5feee
a7c4822
b76cfeb
9eeed58
44f3d3f
253e60c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,3 +4,4 @@ out | |
coverage | ||
test/**/dist | ||
test/**/actual.js | ||
test/unit/cjs-querystring/who?what?idk!.js |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -1,17 +1,42 @@ | ||||||
import { NodeFileTraceOptions, NodeFileTraceResult, NodeFileTraceReasons, Stats, NodeFileTraceReasonType } from './types'; | ||||||
import { basename, dirname, extname, relative, resolve, sep } from 'path'; | ||||||
import { basename, dirname, extname, relative, resolve, sep, join } from 'path'; | ||||||
import * as url from 'url'; | ||||||
import fs from 'graceful-fs'; | ||||||
import analyze, { AnalyzeResult } from './analyze'; | ||||||
import resolveDependency, { NotFoundError } from './resolve-dependency'; | ||||||
import { isMatch } from 'micromatch'; | ||||||
import { sharedLibEmit } from './utils/sharedlib-emit'; | ||||||
import { join } from 'path'; | ||||||
import { Sema } from 'async-sema'; | ||||||
|
||||||
const fsReadFile = fs.promises.readFile; | ||||||
const fsReadlink = fs.promises.readlink; | ||||||
const fsStat = fs.promises.stat; | ||||||
|
||||||
type ParseSpecifierResult = { | ||||||
path: string; | ||||||
queryString: string | null | ||||||
} | ||||||
|
||||||
// Splits an ESM specifier into path and querystring (including the leading `?`). (If the specifier is CJS, | ||||||
// it is passed through untouched.) | ||||||
export function parseSpecifier(specifier: string, cjsResolve: boolean = true): ParseSpecifierResult { | ||||||
let path = specifier; | ||||||
let queryString = null; | ||||||
|
||||||
if (!cjsResolve) { | ||||||
const specifierUrl = url.parse(specifier); | ||||||
queryString = specifierUrl.search; | ||||||
|
||||||
if (specifierUrl.search) { | ||||||
path = specifier.replace(specifierUrl.search, ''); | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This could replace the wrong part of the URL if it had repeating content in the string. How about using the parsed result instead?
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I didn't use I will rename the function to something like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As for the repeating content, do you have examples of where this might go wrong? Afaik, as per URL standard, the first Another reason not to use new URL("../../src/mymodule", "file://").pathname === "/src/mymodule" // instead of "../../mymodule" There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Did you add tests for those cases? How does Node.js handle those cases?
Ah yes, you're right.
Oh good catch! CI is failing though so something is still wrong. |
||||||
} else { | ||||||
path = specifier; | ||||||
} | ||||||
} | ||||||
|
||||||
return {path, queryString}; | ||||||
} | ||||||
|
||||||
function inPath (path: string, parent: string) { | ||||||
const pathWithSep = join(parent, sep); | ||||||
return path.startsWith(pathWithSep) && path !== pathWithSep; | ||||||
|
@@ -215,19 +240,21 @@ export class Job { | |||||
} | ||||||
|
||||||
private maybeEmitDep = async (dep: string, path: string, cjsResolve: boolean) => { | ||||||
// Only affects ESM dependencies | ||||||
const { path: strippedDep, queryString = '' } = parseSpecifier(dep, cjsResolve) | ||||||
let resolved: string | string[] = ''; | ||||||
let error: Error | undefined; | ||||||
try { | ||||||
resolved = await this.resolve(dep, path, this, cjsResolve); | ||||||
} catch (e1: any) { | ||||||
error = e1; | ||||||
try { | ||||||
if (this.ts && dep.endsWith('.js') && e1 instanceof NotFoundError) { | ||||||
if (this.ts && strippedDep.endsWith('.js') && e1 instanceof NotFoundError) { | ||||||
// TS with ESM relative import paths need full extensions | ||||||
// (we have to write import "./foo.js" instead of import "./foo") | ||||||
// See https://www.typescriptlang.org/docs/handbook/esm-node.html | ||||||
const depTS = dep.slice(0, -3) + '.ts'; | ||||||
resolved = await this.resolve(depTS, path, this, cjsResolve); | ||||||
const strippedDepTS = strippedDep.slice(0, -3) + '.ts'; | ||||||
resolved = await this.resolve(strippedDepTS + queryString, path, this, cjsResolve); | ||||||
error = undefined; | ||||||
} | ||||||
} catch (e2: any) { | ||||||
|
@@ -240,16 +267,19 @@ export class Job { | |||||
return; | ||||||
} | ||||||
|
||||||
if (Array.isArray(resolved)) { | ||||||
for (const item of resolved) { | ||||||
// ignore builtins | ||||||
if (item.startsWith('node:')) return; | ||||||
await this.emitDependency(item, path); | ||||||
// For simplicity, force `resolved` to be an array | ||||||
resolved = Array.isArray(resolved) ? resolved : [resolved]; | ||||||
for (let item of resolved) { | ||||||
// ignore builtins for the purposes of both tracing and querystring handling (neither Node | ||||||
// nor Webpack can handle querystrings on `node:xxx` imports). | ||||||
if (item.startsWith('node:')) return; | ||||||
|
||||||
// If querystring was stripped during resolution, restore it | ||||||
if (queryString && !item.endsWith(queryString)) { | ||||||
item += queryString; | ||||||
} | ||||||
} else { | ||||||
// ignore builtins | ||||||
if (resolved.startsWith('node:')) return; | ||||||
await this.emitDependency(resolved, path); | ||||||
|
||||||
await this.analyzeAndEmitDependency(item, path, cjsResolve); | ||||||
} | ||||||
} | ||||||
|
||||||
|
@@ -343,13 +373,23 @@ export class Job { | |||||
} | ||||||
|
||||||
async emitDependency (path: string, parent?: string) { | ||||||
if (this.processed.has(path)) { | ||||||
return this.analyzeAndEmitDependency(path, parent) | ||||||
} | ||||||
|
||||||
private async analyzeAndEmitDependency(rawPath: string, parent?: string, cjsResolve?: boolean) { | ||||||
|
||||||
// Strip the querystring, if any. (Only affects ESM dependencies.) | ||||||
const { path } = parseSpecifier(rawPath, cjsResolve) | ||||||
|
||||||
// Since different querystrings may lead to different results, include the full path | ||||||
// when noting whether or not we've already seen this path | ||||||
if (this.processed.has(rawPath)) { | ||||||
if (parent) { | ||||||
await this.emitFile(path, 'dependency', parent) | ||||||
} | ||||||
return | ||||||
}; | ||||||
this.processed.add(path); | ||||||
this.processed.add(rawPath); | ||||||
|
||||||
const emitted = await this.emitFile(path, 'dependency', parent); | ||||||
if (!emitted) return; | ||||||
|
@@ -368,18 +408,34 @@ export class Job { | |||||
|
||||||
let analyzeResult: AnalyzeResult; | ||||||
|
||||||
const cachedAnalysis = this.analysisCache.get(path); | ||||||
// Since different querystrings may lead to analyses, include the full path when caching | ||||||
const cachedAnalysis = this.analysisCache.get(rawPath); | ||||||
if (cachedAnalysis) { | ||||||
analyzeResult = cachedAnalysis; | ||||||
} | ||||||
else { | ||||||
const source = await this.readFile(path); | ||||||
if (source === null) throw new Error('File ' + path + ' does not exist.'); | ||||||
// By default, `this.readFile` is the regular `fs.readFile`, but a different implementation | ||||||
// can be specified via a user option in the main `nodeFileTrace` entrypoint. Depending on | ||||||
// that implementation, having a querystring on the end of the path may either a) be necessary | ||||||
// in order to specify the right module from which to read, or b) lead to an `ENOENT: no such | ||||||
// file or directory` error because it thinks the querystring is part of the filename. We | ||||||
// therefore try it with the querystring first, but have the non-querystringed version as a | ||||||
// fallback. (If there's no `?` in the given path, `rawPath` will equal `path`, so order is a | ||||||
// moot point.) | ||||||
const source = await this.readFile(rawPath) || await this.readFile(path) ; | ||||||
|
||||||
if (source === null) { | ||||||
const errorMessage = path === rawPath | ||||||
? 'File ' + path + ' does not exist.' | ||||||
: 'Neither ' + path + ' nor ' + rawPath + ' exists.' | ||||||
throw new Error(errorMessage); | ||||||
} | ||||||
|
||||||
// analyze should not have any side-effects e.g. calling `job.emitFile` | ||||||
// directly as this will not be included in the cachedAnalysis and won't | ||||||
// be emit for successive runs that leverage the cache | ||||||
analyzeResult = await analyze(path, source.toString(), this); | ||||||
this.analysisCache.set(path, analyzeResult); | ||||||
this.analysisCache.set(rawPath, analyzeResult); | ||||||
} | ||||||
|
||||||
const { deps, imports, assets, isESM } = analyzeResult; | ||||||
|
@@ -393,12 +449,12 @@ export class Job { | |||||
const ext = extname(asset); | ||||||
if (ext === '.js' || ext === '.mjs' || ext === '.node' || ext === '' || | ||||||
this.ts && (ext === '.ts' || ext === '.tsx') && asset.startsWith(this.base) && asset.slice(this.base.length).indexOf(sep + 'node_modules' + sep) === -1) | ||||||
await this.emitDependency(asset, path); | ||||||
await this.analyzeAndEmitDependency(asset, path, !isESM); | ||||||
else | ||||||
await this.emitFile(asset, 'asset', path); | ||||||
}), | ||||||
...[...deps].map(async dep => this.maybeEmitDep(dep, path, !isESM)), | ||||||
...[...imports].map(async dep => this.maybeEmitDep(dep, path, false)), | ||||||
...[...deps].map(async dep => this.maybeEmitDep(dep, rawPath, !isESM)), | ||||||
...[...imports].map(async dep => this.maybeEmitDep(dep, rawPath, false)), | ||||||
]); | ||||||
} | ||||||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
// Test that CJS files treat question marks in filenames as any other character, | ||
// matching Node behavior | ||
|
||
// https://www.youtube.com/watch?v=2ve20PVNZ18 | ||
|
||
const baseball = require('./who?what?idk!'); | ||
console.log(baseball.players); |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
module.exports = { | ||
players: { | ||
first: 'Who', | ||
second: 'What', | ||
third: "I Don't Know", | ||
left: 'Why', | ||
center: 'Because', | ||
pitcher: 'Tomorrow', | ||
catcher: 'Today', | ||
shortstop: "I Don't Give a Damn!", | ||
}, | ||
}; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
[ | ||
"package.json", | ||
"test/unit/cjs-querystring/input.js", | ||
"test/unit/cjs-querystring/who?what?idk!.js" | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
import { numSpecies } from "./bear.mjs?beaver?bison"; | ||
console.log(`There are ${numSpecies} species of bears.`); | ||
|
||
export const food = "termites"; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
import * as cheetah from "./cheetah.mjs?cow=chipmunk"; | ||
console.log(`Cheetahs can run ${cheetah.topSpeed} mph.`); | ||
|
||
export const numSpecies = 8; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
export const topSpeed = 65; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
// Test that querystrings of various forms get stripped from esm imports when those | ||
// imports contain the `.mjs` file extension | ||
|
||
import * as aardvark from "./animalFacts/aardvark.mjs?anteater"; | ||
|
||
console.log(`Aardvarks eat ${aardvark.food}.`); |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
[ | ||
"test/unit/esm-querystring-mjs/animalFacts/aardvark.mjs", | ||
"test/unit/esm-querystring-mjs/animalFacts/bear.mjs", | ||
"test/unit/esm-querystring-mjs/animalFacts/cheetah.mjs", | ||
"test/unit/esm-querystring-mjs/input.js", | ||
"test/unit/esm-querystring-mjs/package.json" | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"private": true, | ||
"type": "module" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
import { numSpecies } from './bear?beaver?bison'; | ||
console.log(`There are ${numSpecies} species of bears.`); | ||
|
||
export const food = 'termites'; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
import * as cheetah from './cheetah?cow=chipmunk'; | ||
console.log(`Cheetahs can run ${cheetah.topSpeed} mph.`); | ||
|
||
export const numSpecies = 8; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
export const topSpeed = 65; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
// Test that querystrings of various forms get stripped from esm imports | ||
|
||
import * as aardvark from './animalFacts/aardvark?anteater'; | ||
|
||
console.log(`Aardvarks eat ${aardvark.food}.`); |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
[ | ||
"test/unit/esm-querystring/animalFacts/aardvark.js", | ||
"test/unit/esm-querystring/animalFacts/bear.js", | ||
"test/unit/esm-querystring/animalFacts/cheetah.js", | ||
"test/unit/esm-querystring/input.js", | ||
"test/unit/esm-querystring/package.json" | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"private": true, | ||
"type": "module" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
import * as dep from './dep'; | ||
|
||
export const dogs = ['Charlie', 'Maisey']; | ||
export const cats = ['Piper']; | ||
export const rats = dep.rats; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
export const rats = ['Debra']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just learned
url.parse()
was deprecated (again) so lets usenew URL()
insteadThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestion on how to deal with URL not being able to handle relative paths? Would it be ok for us to use some random base url (eg
http://example.com
,file://
, orplaceholder-protocol://
)?Implementation based on best intentions: 44f3d3f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a job for
pathToFileURL()
or maybefileURLToPath()