Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

esm: do not call getSource when format is commonjs #50465

Merged
merged 6 commits into from Nov 29, 2023

Conversation

fasttime
Copy link
Contributor

@fasttime fasttime commented Oct 29, 2023

This PR ensures that defaultLoad does not access the file system to read the source of modules that are known in advance to be in CommonJS format. This is not necessary, since the CommonJS loader can't use the source property and must instead read the file once again.

For context, this was noticed while investigating a test failure in ESLint on Node.js 21.1.0. Our logic (now fixed) was incorrectly assuming that some asynchronous operations would complete in a specific order, and this order has changed from Node.js 21.0.0 to 21.1.0.
The simplest repro I've found is this one:

// index.mjs
await import('./file1.cjs');
process.nextTick(() => console.log('Tick'));
await import('./file2.cjs');
console.log('Done');

with file1.cjs and file2.cjs both empty modules.

This is the output of running index.mjs in Node.js 21.0.0:

Done
Tick

And this is the output in Node.js 21.1.0:

Tick
Done

Here is another similar repro in a single file:

import { mkdtemp, writeFile }   from 'node:fs/promises';
import { join }                 from 'node:path';
import { pathToFileURL }        from 'node:url';

const dir = await mkdtemp('test');
const file1 = join(dir, 'file1.cjs');
const file2 = join(dir, 'file2.cjs');
await Promise.all([writeFile(file1, ''), await writeFile(file2, '')]);

await import(pathToFileURL(file1));
process.nextTick(() => console.log('Tick'));
await import(pathToFileURL(file2));
console.log('Done');

The reason for the inverted order is this commit and a later change, which is causing getSource to be called even when the format of a module is already known to be commonjs, for example because of the .cjs extension.
getSource accesses the file system and that consistently requires one or more I/O cycles to complete. This allows callbacks scheduled with process.nextTick to run in the meantime.

Clearly, importing a CommonJS module is an asynchronous operation which souldn't be assumed to resolve in a specific order or number of phases, but calling getSource for known CommonJS modules isn't needed and should be avoided to not occupy the event loop unnecessarily.

Refs: eslint/eslint#17683

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/loaders

@nodejs-github-bot nodejs-github-bot added esm Issues and PRs related to the ECMAScript Modules implementation. needs-ci PRs that need a full CI run. labels Oct 29, 2023
@fasttime fasttime marked this pull request as ready for review October 29, 2023 10:27
@ljharb
Copy link
Member

ljharb commented Oct 29, 2023

If the observable output is changing then that’s a breaking change, so it’s pretty important to fix.

Copy link
Contributor

@aduh95 aduh95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! A few comments regarding the tests:

test/es-module/test-esm-dynamic-import-commonjs.js Outdated Show resolved Hide resolved
test/es-module/test-esm-dynamic-import-commonjs.js Outdated Show resolved Hide resolved
test/es-module/test-esm-dynamic-import-commonjs.mjs Outdated Show resolved Hide resolved
test/es-module/test-esm-dynamic-import-commonjs.mjs Outdated Show resolved Hide resolved
@GeoffreyBooth
Copy link
Member

I feel like the bug will just resurface whenever custom hooks are present and returning a source for a CommonJS file. At the very least this PR should add a test to prove that my worry is unfounded; but perhaps a better solution is to fetch the source within the ESM loader like we do for all other module types and the CJS loader somehow uses the already-retrieved source rather than attempt to load it again.

@fasttime
Copy link
Contributor Author

fasttime commented Nov 1, 2023

perhaps a better solution is to fetch the source within the ESM loader like we do for all other module types and the CJS loader somehow uses the already-retrieved source rather than attempt to load it again.

Thanks @GeoffreyBooth, I've created a new branch with the change you suggested. I had to use a new internal property realSource to store the source fetched by the ESM loader in the result. If this property is set by a custom loader, it will be discarded along with other unknown properties, meaning that the public API isn't altered.

I like this change because it replaces a synchronous read operation with an asynchronous one. The downside is that there would be a discrepancy between defaultLoad and (existing) custom loader implementations. There seems to be also a concern that the observable behavior [in versions prior to 21.1.0] shouldn't change, so I can't decide which solution is better.

@aduh95
Copy link
Contributor

aduh95 commented Nov 1, 2023

We cannot use the source from the ESM loader, it would short cut all the monkey patching of the CJS loader.

@fasttime
Copy link
Contributor Author

fasttime commented Nov 1, 2023

We cannot use the source from the ESM loader, it would short cut all the monkey patching of the CJS loader.

It doesn't if source stays null, that's why I had to use a new property.

@aduh95
Copy link
Contributor

aduh95 commented Nov 1, 2023

I think the actual fix would be to ask the main thread CJS loader for the source rather than calling getSource from the loader thread.

@aduh95
Copy link
Contributor

aduh95 commented Nov 1, 2023

That's a bummer because we are looking for ways to remove dependency on the monkey patchable CJS loader, but oh well. Or we accept the breaking change, and ask folks who rely on the monkey patchability of the CJS loader to switch to the new API (which mean we can't unflag until the ecosystem is ready)

@GeoffreyBooth
Copy link
Member

ask folks who rely on the monkey patchability of the CJS loader

Because monkey patching is unsupported, there's no defined API surface. We don't really know what would or wouldn't break people.

In this case, I think the only impact would be people who monkey patch the CJS loader in order to affect import of CJS? Which is no one? I think we should just refactor as is best for our codebase and not worry about theoretical breaks.

@aduh95
Copy link
Contributor

aduh95 commented Nov 1, 2023

I think the only impact would be people who monkey patch the CJS loader in order to affect import of CJS? Which is no one?

Which is everyone you mean? I think Yarn PnP relies on that for example, or any project that uses pirates hooks I think.

@GeoffreyBooth
Copy link
Member

No, that's relying on monkey patching to affect require, not import.

@aduh95
Copy link
Contributor

aduh95 commented Nov 1, 2023

No, that's relying on monkey patching to affect require, not import.

I don’t know what makes you think that, I think that’s not correct.

@GeoffreyBooth
Copy link
Member

I don’t know what makes you think that, I think that’s not correct.

Can you provide an example where someone is monkey-patching the CommonJS loader to affect import of CommonJS?

Regardless, I don’t think it’s a supported use case. I don’t think monkey-patching at all is a supported use case, and monkey-patching the CommonJS loader to affect only CommonJS files in a mixed CommonJS-ESM codebase is even less supported: we provide the module hooks for that.

@aduh95
Copy link
Contributor

aduh95 commented Nov 1, 2023

Can you provide an example where someone is monkey-patching the CommonJS loader to affect import of CommonJS?

Do you mean examples other than the one I gave in #50465 (comment)? Could you clarify why you think those are not good examples?

we provide the module hooks for that

not really, if you provide a source with module hooks for commonjs, you have to use faux CJS, which not everyone can switch to overnight (e.g. a large portion of the ecosystem relies on require.cache to be a thing).

@ljharb
Copy link
Member

ljharb commented Nov 1, 2023

(nb, also require.extensions)

@fasttime
Copy link
Contributor Author

fasttime commented Nov 6, 2023

Since this PR is all about avoiding to read the same file twice with the ESM loader, maybe it would be better to continue the discussion about the CommonJS monkey-patching legacy in a separate issue to give it more visibility and solicit other perspectives. It looks like there is more need to speak about that.

Meanwhile, I'm happy to update and extend the unit tests as suggested so we can move on with this issue.

@fasttime
Copy link
Contributor Author

fasttime commented Nov 7, 2023

I feel like the bug will just resurface whenever custom hooks are present and returning a source for a CommonJS file. At the very least this PR should add a test to prove that my worry is unfounded

@GeoffreyBooth I've added a unit test to ensure that the file specified in the URL is not used when a source is already present. I'm saying used because the actual read access is wrapped in a try-catch statement, so even if a read attempt occurred, it would not be possible to detect that situation by simulating a read error. Still, the ??= operator guarantees that the when fs.readFile is called, the result is stored in source, so checking the source should be enough to exclude a read operation. Let me know if you have a better idea. Here is the new test: c93377f. Please, have a look again.

@fasttime fasttime requested a review from aduh95 November 7, 2023 13:14
import assert from 'assert';

const { default: existingFileSource } = await import(fixtures.fileURL('es-modules', 'cjs-file.cjs'));
const { default: noSuchFileSource } = await import(fixtures.fileURL('no-such-file.cjs'));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since this does not in fact reference a fixtures, let's simplify.

Suggested change
const { default: noSuchFileSource } = await import(fixtures.fileURL('no-such-file.cjs'));
const { default: noSuchFileSource } = await import('./no-such-file.cjs');

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Unfortunately, './no-such-file.cjs' wouldn't work because it's not a valid URL, and the resolve hook doesn't actually resolve relative paths because of the short-circuit. Anyway, a file URL like 'file:///no-such-file.cjs' works fine and is not relative (at least on Unix), so I switched to use that.

Copy link
Contributor

@aduh95 aduh95 Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's incorrect to say it's not a valid URL, it is a valid (relative) URL.

Anyway, a file URL like 'file:///no-such-file.cjs' works fine

IMO it would make more sense to use a relative URL in the JS, and have the hook return a specific URL. That doesn't really matter anyway, for the purpose of this test the current version works, so good enough for me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind, I had to change the test again because Windows doesn't like the absolute file URL without a drive letter: 4dcbec5. Let's see how the CI build goes this time.

@aduh95 aduh95 added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. request-ci Add this label to start a Jenkins CI on a PR. labels Nov 8, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Nov 8, 2023
@nodejs-github-bot
Copy link
Collaborator

@aduh95 aduh95 added the commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. label Nov 29, 2023
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Nov 29, 2023
@nodejs-github-bot nodejs-github-bot merged commit 2602c46 into nodejs:main Nov 29, 2023
56 checks passed
@nodejs-github-bot
Copy link
Collaborator

Landed in 2602c46

RafaelGSS pushed a commit that referenced this pull request Nov 29, 2023
Ensure that `defaultLoad` does not uselessly access the file system to
get the source of modules that are known to be in CommonJS format.

This allows CommonJS imports to resolve in the current phase of the
event loop.

Refs: eslint/eslint#17683
PR-URL: #50465
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
@RafaelGSS RafaelGSS mentioned this pull request Nov 29, 2023
UlisesGascon pushed a commit that referenced this pull request Dec 11, 2023
Ensure that `defaultLoad` does not uselessly access the file system to
get the source of modules that are known to be in CommonJS format.

This allows CommonJS imports to resolve in the current phase of the
event loop.

Refs: eslint/eslint#17683
PR-URL: #50465
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
@UlisesGascon UlisesGascon mentioned this pull request Dec 12, 2023
UlisesGascon pushed a commit that referenced this pull request Dec 13, 2023
Ensure that `defaultLoad` does not uselessly access the file system to
get the source of modules that are known to be in CommonJS format.

This allows CommonJS imports to resolve in the current phase of the
event loop.

Refs: eslint/eslint#17683
PR-URL: #50465
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
UlisesGascon pushed a commit that referenced this pull request Dec 15, 2023
Ensure that `defaultLoad` does not uselessly access the file system to
get the source of modules that are known to be in CommonJS format.

This allows CommonJS imports to resolve in the current phase of the
event loop.

Refs: eslint/eslint#17683
PR-URL: #50465
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
UlisesGascon pushed a commit that referenced this pull request Dec 19, 2023
Ensure that `defaultLoad` does not uselessly access the file system to
get the source of modules that are known to be in CommonJS format.

This allows CommonJS imports to resolve in the current phase of the
event loop.

Refs: eslint/eslint#17683
PR-URL: #50465
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
codebytere added a commit to electron/electron that referenced this pull request Jan 10, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 10, 2024
zcbenz pushed a commit to electron/electron that referenced this pull request Jan 12, 2024
zcbenz pushed a commit to electron/electron that referenced this pull request Jan 12, 2024
zcbenz pushed a commit to electron/electron that referenced this pull request Jan 12, 2024
zcbenz pushed a commit to electron/electron that referenced this pull request Jan 12, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 15, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 15, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 15, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 15, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 16, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 16, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 18, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 18, 2024
jkleinsc pushed a commit to electron/electron that referenced this pull request Jan 18, 2024
jkleinsc pushed a commit to electron/electron that referenced this pull request Jan 18, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 18, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 18, 2024
jkleinsc pushed a commit to electron/electron that referenced this pull request Jan 18, 2024
* chore: bump node in DEPS to v20.11.0

* module: bootstrap module loaders in shadow realm

nodejs/node#48655

* src: add commit hash shorthand in zlib version

nodejs/node#50158

* v8,tools: expose necessary V8 defines

nodejs/node#50820

* esm: do not call getSource when format is commonjs

nodejs/node#50465

* esm: fallback to readFileSync when source is nullish

nodejs/node#50825

* vm: allow dynamic import with a referrer realm

nodejs/node#50360

* test: skip test-diagnostics-channel-memory-leak.js

nodejs/node#50327

* esm: do not call getSource when format is commonjs

nodejs/node#50465

* lib: fix assert throwing different error messages in ESM and CJS

nodejs/node#50634

* src: fix compatility with upcoming V8 12.1 APIs

nodejs/node#50709

* deps: update base64 to 0.5.1

nodejs/node#50629

* src: avoid silent coercion to signed/unsigned int

nodejs/node#50663

* src: fix compatility with upcoming V8 12.1 APIs

nodejs/node#50709

* chore: fix patch indices

* chore: update patches

* test: disable TLS cipher test

This can't be enabled owing to BoringSSL incompatibilities.

nodejs/node#50186

* fix: check for Buffer and global definition in shadow realm

nodejs/node#51239

* test: disable parallel/test-shadow-realm-custom-loader

Incompatible with our asar logic, resulting in the following failure:

> Failed to CompileAndCall electron script: electron/js2c/asar_bundle

* chore: remove deleted parallel/test-crypto-modp1-error test

* test: make test-node-output-v8-warning generic

nodejs/node#50421

* chore: fixup ModuleWrap patch

* test: match wpt/streams/transferable/transform-stream-members.any.js to upstream

* fix: sandbox is not enabled on arm

* chore: disable v8 sandbox on ia32/arm

---------

Co-authored-by: electron-roller[bot] <84116207+electron-roller[bot]@users.noreply.github.com>
Co-authored-by: Shelley Vohr <shelley.vohr@gmail.com>
Co-authored-by: Cheng Zhao <zcbenz@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. esm Issues and PRs related to the ECMAScript Modules implementation. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants