Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

esm: detect ESM syntax in ambiguous JavaScript #50096

Merged
merged 8 commits into from Oct 20, 2023

Conversation

GeoffreyBooth
Copy link
Member

@GeoffreyBooth GeoffreyBooth commented Oct 9, 2023

Implements #50064. This PR aims to allow Node.js to allow ES module syntax input by default, without the user needing to opt in by using a non-.js extension, a package.json "type" field or a command-line flag. The goal is to make this enabled by default, unflagged, once we consider it stable.

For the following “ambiguous” inputs:

  • Files with a .js extension or no extension; and either no controlling package.json or one that lacks a type field; and --experimental-default-type is not specified
  • String input (--eval or STDIN) when neither --input-type nor --experimental-default-type are specified

Node will do the following:

  • For ambiguous main entry point, either file or string input, the new containsModuleSyntax function from esm: improve check for ESM syntax #50127 is used to determine whether the CommonJS or ESM loader should handle the main entry. (The ESM loader is used if the main entry contains module syntax.)
  • When the ESM loader is handling modules, within the load hook, once the source is loaded it will be passed to containsModuleSyntax to determine whether the format should be module or commonjs (again, only for the ambiguous inputs specified above).

This behavior applies to ambiguous files within node_modules. This is to prevent the hazard where a package author writes a library relying on this detection behavior but consumers of the package would fail to load it if they didn’t also have the detection behavior. (Such a hazard will still exist for consumers running old versions of Node, but at least it’s avoided moving forward.)

This behavior applies to the main entry point and to files referenced via import. It would not apply to files referenced via require, because require cannot load ES modules.

This behavior does not apply to modules with any marker of explicitness: an .mjs or .cjs extension, a package.json with a type field, specified --input-type or --experimental-default-type flags.

This behavior does not apply to --print input, as --print does not support ESM syntax.

When enabled, detection should have no impact on all-CommonJS apps (either written that way or transpiled into CommonJS) since detection isn’t run on files that are referenced via require; such apps would incur only a single double parse for the entry point (and therefore never opt into the ESM loader). The largest impact would be for ESM apps with CommonJS dependencies that lack type fields.

Before unflagging this we need to confirm that when enabled, this doesn’t affect any apps that run without erroring today. Or inversely it should have no effect on anything that already runs successfully, and therefore is not a breaking change.

cc @nodejs/performance @nodejs/loaders @nodejs/tsc

Notable change

The new flag --experimental-detect-module can be used to automatically run ES modules when their syntax can be detected. For “ambiguous” files, which are .js or extensionless files with no package.json with a type field, Node.js will parse the file to detect ES module syntax; if found, it will run the file as an ES module, otherwise it will run the file as a CommonJS module. The same applies to string input via --eval or STDIN.

We hope to make detection enabled by default in a future version of Node.js. Detection increases startup time, so we encourage everyone—especially package authors—to add a type field to package.json, even for the default "type": "commonjs". The presence of a type field, or explicit extensions such as .mjs or .cjs, will opt out of detection.

@GeoffreyBooth GeoffreyBooth added module Issues and PRs related to the module subsystem. cli Issues and PRs related to the Node.js command line interface. esm Issues and PRs related to the ECMAScript Modules implementation. labels Oct 9, 2023
@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/loaders

@nodejs-github-bot nodejs-github-bot added lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Oct 9, 2023
lib/internal/modules/esm/translators.js Outdated Show resolved Hide resolved
lib/internal/modules/esm/translators.js Outdated Show resolved Hide resolved
lib/internal/modules/helpers.js Outdated Show resolved Hide resolved
doc/api/esm.md Outdated Show resolved Hide resolved
@GeoffreyBooth
Copy link
Member Author

There are tests failing because of the change in this PR where I move the attempted CommonJS module load out of the ModuleWrap callback to just before that callback is defined, because I need it earlier so that we can retry as ESM. (https://github.com/nodejs/node/pull/50096/files#diff-b1de5e9ce1e9b411001c63d11a5092dbba32e1586269a20dc11408375eb2f16cR309-R318.) The failing tests are ones that compare error messages and stacks, and I suspect that some of these are legitimate errors because of some special handling that happens to errors thrown within the ModuleWrap callback. If someone who knows C++ and/or ModuleWrap can help me figure out how to preserve the prior error handling behavior, I could use the assistance.

@mcollina
Copy link
Member

mcollina commented Oct 9, 2023

I think @jasnell did some analysis on related topics using perf_hooks.

@GeoffreyBooth
Copy link
Member Author

Looking at the first broken test, it’s for the debugger pausing on caught exceptions. The PR as currently written puts the “module load” within a try/catch, where inside the catch I check if we should do the “try again as ESM” logic, else rethrow the error. This new try/catch changes the error that the “debugger catches caught exceptions” test is asserting against.

To fix the test, I can move the “should we retry” condition outside of the try/catch, so it’s like “if we’re in detect/retry mode, try/catch and retry; else just module load.” But the issue there is that the test will pass now while there’s a flag, but once the flag is removed and we’re in detect/retry mode by default, I’ll be back to the broken test so long as the test case remains ambiguous code.

Which leads to the bigger question, is that should Node adding a try/catch around module loading be considered a breaking change? If not, then I can just update the test accordingly and move on. I have a feeling that many of the other broken tests will be things like this, where the tests are checking very specific things in Node’s internals (like particular stack traces, or async resources created) that change in this branch, even for CommonJS code that executes successfully. We can always mark this as semver-major, and go into detail in the release notes of the edge cases that are affected; most real-world code should run unaffected. Or if the things that change as a result of this are internals that we don’t consider part of the public API, then the changes shouldn’t be considered breaking.

lib/internal/modules/esm/translators.js Outdated Show resolved Hide resolved
lib/internal/modules/run_main.js Outdated Show resolved Hide resolved
@GeoffreyBooth GeoffreyBooth force-pushed the ambiguous-detection branch 3 times, most recently from 34f7a35 to 6c1c2f5 Compare October 10, 2023 06:07
@targos
Copy link
Member

targos commented Oct 10, 2023

I'd like to see some tests that verify we don't catch runtime exceptions. For example, this should throw in CommonJS and not trigger the ESM re-evaluation.

eval('import "something";');

@GeoffreyBooth
Copy link
Member Author

I’m going to redo this to use #50127 once that lands.

@GeoffreyBooth
Copy link
Member Author

I’ve updated this to build on #50127. You can see the differences that this PR adds at https://github.com/GeoffreyBooth/node/compare/better-esm-check…GeoffreyBooth:node:ambiguous-detection

@GeoffreyBooth GeoffreyBooth marked this pull request as ready for review October 15, 2023 02:36
@GeoffreyBooth GeoffreyBooth force-pushed the ambiguous-detection branch 3 times, most recently from c086c61 to c3ed61c Compare October 17, 2023 03:08
@karlhorky
Copy link

Thanks for this @GeoffreyBooth and everyone else involved! 🙌 Amazing to see these pieces being developed enabling the move to ESM default 👍

@karlhorky
Copy link

karlhorky commented Nov 1, 2023

If anyone wants to play with this without installing Node.js v21.1.0 on your machine, I created a sandbox here that demonstrates the detection of the entry point:

CodeSandbox Demo: https://codesandbox.io/p/sandbox/restless-leaf-rhrxhp?file=%2F.devcontainer%2FDockerfile%3A2%2C1

Screenshot 2023-11-01 at 14 44 03

alexfernandez pushed a commit to alexfernandez/node that referenced this pull request Nov 1, 2023
PR-URL: nodejs#50096
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Guy Bedford <guybedford@gmail.com>
alexfernandez pushed a commit to alexfernandez/node that referenced this pull request Nov 1, 2023
Notable changes:

doc:
  * add H4ad to collaborators (Vinícius Lourenço) nodejs#50217
esm:
  * (SEMVER-MINOR) detect ESM syntax in ambiguous JavaScript (Geoffrey Booth) nodejs#50096
fs:
  * (SEMVER-MINOR) add flush option to appendFile() functions (Colin Ihrig) nodejs#50095
lib:
  * (SEMVER-MINOR) add `navigator.userAgent` (Yagiz Nizipli) nodejs#50200
stream:
  * (SEMVER-MINOR) allow pass stream class to `stream.compose` (Alex Yang) nodejs#50187
  * call helper function from push and unshift (Raz Luvaton) nodejs#50173

PR-URL: nodejs#50335
@privatenumber
Copy link
Contributor

Love this feature!

I attempted to leverage this in tsx to replace the CommonJS loader. Unfortunately, I wasn't successful because this feature requires strict ESM (no references to require, __filename, etc.), and some outdated packages may use a mix of ESM/CJS.

This behavior and constraint makes sense to me, especially for Node—just wanted to share a limitation I observed.

targos pushed a commit that referenced this pull request Nov 11, 2023
PR-URL: #50096
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Guy Bedford <guybedford@gmail.com>
targos added a commit that referenced this pull request Nov 12, 2023
Notable changes:

deps:
  * (SEMVER-MINOR) update uvwasi to 0.0.19 (Node.js GitHub Bot) #49908
doc:
  * add H4ad to collaborators (Vinícius Lourenço) #50217
esm:
  * (SEMVER-MINOR) detect ESM syntax in ambiguous JavaScript (Geoffrey Booth) #50096
  * use import attributes instead of import assertions (Antoine du Hamel) #50140
  * --experimental-default-type flag to flip module defaults (Geoffrey Booth) #49869
fs:
  * (SEMVER-MINOR) add flush option to appendFile() functions (Colin Ihrig) #50095
  * add flush option to writeFile() functions (Colin Ihrig) #50009
lib:
  * (SEMVER-MINOR) add WebSocket client (Matthew Aitken) #49830
stream:
  * (SEMVER-MINOR) allow pass stream class to `stream.compose` (Alex Yang) #50187
  * call helper function from push and unshift (Raz Luvaton) #50173
  * optimize Writable (Robert Nagy) #50012
test_runner, cli:
  * (SEMVER-MINOR) add --test-concurrency flag (Colin Ihrig) #49996
vm:
  * (SEMVER-MINOR) use import attributes instead of import assertions (Antoine du Hamel) #50141
  * use default HDO when importModuleDynamically is not set (Joyee Cheung) #49950
wasi:

PR-URL: #50682
targos added a commit that referenced this pull request Nov 12, 2023
Notable changes:

deps:
  * (SEMVER-MINOR) update uvwasi to 0.0.19 (Node.js GitHub Bot) #49908
doc:
  * add H4ad to collaborators (Vinícius Lourenço) #50217
esm:
  * (SEMVER-MINOR) detect ESM syntax in ambiguous JavaScript (Geoffrey Booth) #50096
  * use import attributes instead of import assertions (Antoine du Hamel) #50140
  * --experimental-default-type flag to flip module defaults (Geoffrey Booth) #49869
fs:
  * (SEMVER-MINOR) add flush option to appendFile() functions (Colin Ihrig) #50095
  * add flush option to writeFile() functions (Colin Ihrig) #50009
lib:
  * (SEMVER-MINOR) add WebSocket client (Matthew Aitken) #49830
stream:
  * (SEMVER-MINOR) allow pass stream class to `stream.compose` (Alex Yang) #50187
  * call helper function from push and unshift (Raz Luvaton) #50173
  * optimize Writable (Robert Nagy) #50012
test_runner, cli:
  * (SEMVER-MINOR) add --test-concurrency flag (Colin Ihrig) #49996
vm:
  * (SEMVER-MINOR) use import attributes instead of import assertions (Antoine du Hamel) #50141
  * use default HDO when importModuleDynamically is not set (Joyee Cheung) #49950
wasi:

PR-URL: #50682
@wraithgar
Copy link

wraithgar commented Nov 17, 2023

What do we do with all of the packages that exist today? They're idempotent and can't change.

I would humbly suggest this is a mistake. If I were to pick the number one lesson learned in the last 14 years of npm's history it would be: "Do not try to guess user intent". You have a clear way for folks to indicate module type, and over a decade of history for what that type is if this new feature is not being indicated. Why are we slowing down every package that exists for an implementation of guessing what the user wants?

@GeoffreyBooth
Copy link
Member Author

@wraithgar That was considered; see the referenced issues. There are use cases that just can't be solved, such as ESM syntax in loose extensionless files, without either a breaking change or detection. Adding detection is the least bad option. We spent months considering other options.

It's been a strong recommendation for years in the Node docs that every package author include the type field. Once this feature becomes enabled by default, such packages will still work, they'll just increase the overall app startup time slightly. Runtime performance is unaffected. We felt that this is an acceptable trade-off for finally supporting the remaining use cases and to support ESM syntax by default without requiring an opt in.

If you'd like to discuss further please open a new issue.

@ljharb
Copy link
Member

ljharb commented Nov 17, 2023

I've proposed a solution multiple times (including during the modules WG discussions) that doesn't require either one of those, but you keep ignoring it. An extensions map would be a superior nonbreaking replacement for type that would cover extensionless files.

targos added a commit that referenced this pull request Nov 21, 2023
Notable changes:

deps:
  * (SEMVER-MINOR) update uvwasi to 0.0.19 (Node.js GitHub Bot) #49908
doc:
  * add H4ad to collaborators (Vinícius Lourenço) #50217
esm:
  * (SEMVER-MINOR) detect ESM syntax in ambiguous JavaScript (Geoffrey Booth) #50096
  * use import attributes instead of import assertions (Antoine du Hamel) #50140
  * --experimental-default-type flag to flip module defaults (Geoffrey Booth) #49869
fs:
  * (SEMVER-MINOR) add flush option to appendFile() functions (Colin Ihrig) #50095
  * add flush option to writeFile() functions (Colin Ihrig) #50009
lib:
  * (SEMVER-MINOR) add WebSocket client (Matthew Aitken) #49830
stream:
  * (SEMVER-MINOR) allow pass stream class to `stream.compose` (Alex Yang) #50187
  * call helper function from push and unshift (Raz Luvaton) #50173
  * optimize Writable (Robert Nagy) #50012
test_runner, cli:
  * (SEMVER-MINOR) add --test-concurrency flag (Colin Ihrig) #49996
vm:
  * (SEMVER-MINOR) use import attributes instead of import assertions (Antoine du Hamel) #50141
  * use default HDO when importModuleDynamically is not set (Joyee Cheung) #49950
wasi:

PR-URL: #50682
targos added a commit that referenced this pull request Nov 22, 2023
Notable changes:

deps:
  * (SEMVER-MINOR) update uvwasi to 0.0.19 (Node.js GitHub Bot) #49908
doc:
  * add H4ad to collaborators (Vinícius Lourenço) #50217
esm:
  * (SEMVER-MINOR) detect ESM syntax in ambiguous JavaScript (Geoffrey Booth) #50096
  * use import attributes instead of import assertions (Antoine du Hamel) #50140
  * --experimental-default-type flag to flip module defaults (Geoffrey Booth) #49869
fs:
  * (SEMVER-MINOR) add flush option to appendFile() functions (Colin Ihrig) #50095
  * add flush option to writeFile() functions (Colin Ihrig) #50009
lib:
  * (SEMVER-MINOR) add WebSocket client (Matthew Aitken) #49830
stream:
  * (SEMVER-MINOR) allow pass stream class to `stream.compose` (Alex Yang) #50187
  * call helper function from push and unshift (Raz Luvaton) #50173
  * optimize Writable (Robert Nagy) #50012
test_runner, cli:
  * (SEMVER-MINOR) add --test-concurrency flag (Colin Ihrig) #49996
vm:
  * (SEMVER-MINOR) use import attributes instead of import assertions (Antoine du Hamel) #50141
  * use default HDO when importModuleDynamically is not set (Joyee Cheung) #49950
wasi:

PR-URL: #50682
martenrichter pushed a commit to martenrichter/node that referenced this pull request Nov 26, 2023
Notable changes:

deps:
  * (SEMVER-MINOR) update uvwasi to 0.0.19 (Node.js GitHub Bot) nodejs#49908
doc:
  * add H4ad to collaborators (Vinícius Lourenço) nodejs#50217
esm:
  * (SEMVER-MINOR) detect ESM syntax in ambiguous JavaScript (Geoffrey Booth) nodejs#50096
  * use import attributes instead of import assertions (Antoine du Hamel) nodejs#50140
  * --experimental-default-type flag to flip module defaults (Geoffrey Booth) nodejs#49869
fs:
  * (SEMVER-MINOR) add flush option to appendFile() functions (Colin Ihrig) nodejs#50095
  * add flush option to writeFile() functions (Colin Ihrig) nodejs#50009
lib:
  * (SEMVER-MINOR) add WebSocket client (Matthew Aitken) nodejs#49830
stream:
  * (SEMVER-MINOR) allow pass stream class to `stream.compose` (Alex Yang) nodejs#50187
  * call helper function from push and unshift (Raz Luvaton) nodejs#50173
  * optimize Writable (Robert Nagy) nodejs#50012
test_runner, cli:
  * (SEMVER-MINOR) add --test-concurrency flag (Colin Ihrig) nodejs#49996
vm:
  * (SEMVER-MINOR) use import attributes instead of import assertions (Antoine du Hamel) nodejs#50141
  * use default HDO when importModuleDynamically is not set (Joyee Cheung) nodejs#49950
wasi:

PR-URL: nodejs#50682
lucshi pushed a commit to lucshi/node that referenced this pull request Nov 27, 2023
Notable changes:

deps:
  * (SEMVER-MINOR) update uvwasi to 0.0.19 (Node.js GitHub Bot) nodejs#49908
doc:
  * add H4ad to collaborators (Vinícius Lourenço) nodejs#50217
esm:
  * (SEMVER-MINOR) detect ESM syntax in ambiguous JavaScript (Geoffrey Booth) nodejs#50096
  * use import attributes instead of import assertions (Antoine du Hamel) nodejs#50140
  * --experimental-default-type flag to flip module defaults (Geoffrey Booth) nodejs#49869
fs:
  * (SEMVER-MINOR) add flush option to appendFile() functions (Colin Ihrig) nodejs#50095
  * add flush option to writeFile() functions (Colin Ihrig) nodejs#50009
lib:
  * (SEMVER-MINOR) add WebSocket client (Matthew Aitken) nodejs#49830
stream:
  * (SEMVER-MINOR) allow pass stream class to `stream.compose` (Alex Yang) nodejs#50187
  * call helper function from push and unshift (Raz Luvaton) nodejs#50173
  * optimize Writable (Robert Nagy) nodejs#50012
test_runner, cli:
  * (SEMVER-MINOR) add --test-concurrency flag (Colin Ihrig) nodejs#49996
vm:
  * (SEMVER-MINOR) use import attributes instead of import assertions (Antoine du Hamel) nodejs#50141
  * use default HDO when importModuleDynamically is not set (Joyee Cheung) nodejs#49950
wasi:

PR-URL: nodejs#50682
@andrewbranch
Copy link

This will have some unfortunate implications for TypeScript if it becomes enabled by default in the future. I wrote up an explanation on our issue tracker: microsoft/TypeScript#56678

ckerr added a commit to electron/electron that referenced this pull request Dec 7, 2023
Xref: nodejs/node#50096
Xref: nodejs/node#50314
in lib/internal/modules/esm/load.js, update the code that checks for
`format === 'electron'`. I'd like 👀 on this

Xref: nodejs/node#49657
add braces in lib/internal/modules/esm/translators.js to sync with upstream
codebytere pushed a commit to electron/electron that referenced this pull request Dec 8, 2023
Xref: nodejs/node#50096
Xref: nodejs/node#50314
in lib/internal/modules/esm/load.js, update the code that checks for
`format === 'electron'`. I'd like 👀 on this

Xref: nodejs/node#49657
add braces in lib/internal/modules/esm/translators.js to sync with upstream
codebytere added a commit to electron/electron that referenced this pull request Dec 8, 2023
codebytere added a commit to electron/electron that referenced this pull request Dec 10, 2023
codebytere added a commit to electron/electron that referenced this pull request Dec 11, 2023
* chore: bump node in DEPS to v20.10.0

* chore: update feat_initialize_asar_support.patch

no code changes; patch just needed an update due to nearby upstream changes

Xref: nodejs/node#49986

* chore: update pass_all_globals_through_require.patch

no manual changes; patch applied with fuzz

Xref: nodejs/node#49657

* chore: update refactor_allow_embedder_overriding_of_internal_fs_calls

Xref: nodejs/node#49912

no code changes; patch just needed an update due to nearby upstream changes

* chore: update chore_allow_the_node_entrypoint_to_be_a_builtin_module.patch

Xref: nodejs/node#49986

minor manual changes needed to sync with upstream change

* update fix_expose_the_built-in_electron_module_via_the_esm_loader.patch

Xref: nodejs/node#50096
Xref: nodejs/node#50314
in lib/internal/modules/esm/load.js, update the code that checks for
`format === 'electron'`. I'd like 👀 on this

Xref: nodejs/node#49657
add braces in lib/internal/modules/esm/translators.js to sync with upstream

* fix: lazyload fs in esm loaders to apply asar patches

* nodejs/node#50127
* nodejs/node#50096

* esm: jsdoc for modules code

nodejs/node#49523

* test: set test-cli-node-options as flaky

nodejs/node#50296

* deps: update c-ares to 1.20.1

nodejs/node#50082

* esm: bypass CommonJS loader under --default-type=module

nodejs/node#49986

* deps: update uvwasi to 0.0.19

nodejs/node#49908

* lib,test: do not hardcode Buffer.kMaxLength

nodejs/node#49876

* crypto: account for disabled SharedArrayBuffer

nodejs/node#50034

* test: fix edge snapshot stack traces

nodejs/node#49659

* src: generate snapshot with --predictable

nodejs/node#48749

* chore: fixup patch indices

* fs: throw errors from sync branches instead of separate implementations

nodejs/node#49913

* crypto: ensure valid point on elliptic curve in SubtleCrypto.importKey

nodejs/node#50234

* esm: detect ESM syntax in ambiguous JavaScrip

nodejs/node#50096

* fixup! test: fix edge snapshot stack traces

* esm: unflag extensionless ES module JavaScript and Wasm in module scope

nodejs/node#49974

* [tagged-ptr] Arrowify objects

https://chromium-review.googlesource.com/c/v8/v8/+/4705331

---------

Co-authored-by: electron-roller[bot] <84116207+electron-roller[bot]@users.noreply.github.com>
Co-authored-by: Charles Kerr <charles@charleskerr.com>
Co-authored-by: Shelley Vohr <shelley.vohr@gmail.com>
@elovin elovin mentioned this pull request Mar 29, 2024
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. cli Issues and PRs related to the Node.js command line interface. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. esm Issues and PRs related to the ECMAScript Modules implementation. lib / src Issues and PRs related to general changes in the lib or src directory. module Issues and PRs related to the module subsystem. needs-ci PRs that need a full CI run. notable-change PRs with changes that should be highlighted in changelogs. semver-minor PRs that contain new features and should be released in the next minor version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet