Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CJS exports parsing reads directly from disk #50649

Open
privatenumber opened this issue Nov 10, 2023 · 11 comments
Open

CJS exports parsing reads directly from disk #50649

privatenumber opened this issue Nov 10, 2023 · 11 comments

Comments

@privatenumber
Copy link
Contributor

privatenumber commented Nov 10, 2023

Version

21.1.0

Platform

All

Subsystem

All

What steps will reproduce the bug?

I'm utilizing a CommonJS require hook to dynamically compile ESM syntax into CommonJS at runtime using esbuild.

Esbuild adds annotations to the end of the compiled code for Node to recognize named exports. Example.

When importing this "ESM in CJS context" file from a legitimate ESM file (.mjs), Node doesn't detect the named exports because it circumvents the loader and reads directly from disk. Reproduction using tsx

(I wasn't able to assemble a minimal reproduction since I think it will be pretty complicated. But happy to try to work on one if necessary.)

How often does it reproduce? Is there a required condition?

Utilizing a CommonJS require hook to compile ESM syntax in a .js file in a CommonJS context, then importing the file from a .mjs file

What is the expected behavior? Why is that the expected behavior?

For Node to parse the named exports from the source retrieved by the loader

What do you see instead?

Node should parse the named exports from the source retrieved by the loader.

Additional information

This limitation is acknowledged in the source code as a TODO here:
https://github.com/nodejs/node/blob/v21.2.0/lib/internal/modules/esm/translators.js#L421-L424

While it might be considered an edge case, I'm filing this issue to highlight its impact on tsx users:

@benjamingr
Copy link
Member

Any chance you can make a smaller reproduction that uses a loader/require hook directly instead of the tsx npm package? It would make this easier to diagnose

@privatenumber
Copy link
Contributor Author

Sure, I'll give it a shot over the weekend.

@privatenumber
Copy link
Contributor Author

Managed to reproduce it minimally:
https://stackblitz.com/edit/stackblitz-starters-j9tcmc?file=index.mjs

Although StackBlitz uses Node v18.18, you can download the project and still reproduce it with Node v21.1.0


The reproduction calls node --require ./require-hook.cjs index.mjs.

For the sake of the reproduction, require-hook.cjs completely overwrites the code of file.js. (In practice, tsx transforms the ESM export syntax to CJS exports)

But despite that, when importing file.js from index.mjs, the removed export is still present.

This is happening because Node's CJS exports parser bypasses the CJS loader and directly analyzes the disk file. Seems this bug is acknowledged here.

@privatenumber
Copy link
Contributor Author

@benjamingr Pinging you just in case you missed the weekend notification.

@aduh95
Copy link
Contributor

aduh95 commented Nov 19, 2023

For a lack of better source, Node.js uses the FS which is not always correct; but this feature was added as best effort, and it is documented that using the default export is the only reliable thing for CJS modules.

Ideally, you should be using the hook API instead of monkey-patching the CJS loader.

@arcanis
Copy link
Contributor

arcanis commented Nov 19, 2023

It however ties into nodejs/loaders#172, which postulates that Node.js shouldn't rely on an unextendable FS for such operations.

@privatenumber
Copy link
Contributor Author

@aduh95

this feature was added as best effort, and it is documented that using the default export is the only reliable thing for CJS modules.

I understand, thanks for clarifying. Nonetheless, the resulting ESM exports are incorrect (in the reproduction) so I think it's definitely worth acknowledging as a bug.

I originally saw this TODO over a year ago and I thought it would be addressed but it hadn't. So I figured it was time to submit an issue so the demand for fixing it could be tracked.

Ideally, you should be using the hook API instead of monkey-patching the CJS loader.

Can you link me to the hook API? Not sure if I'm familiar. And will it solve this problem?

FYI I'm under the impression Node.js supports monkey-patching the CJS loader as a real use-case: https://github.com/nodejs/node/blob/v21.2.0/lib/internal/modules/cjs/loader.js#L1202

@aduh95
Copy link
Contributor

aduh95 commented Nov 20, 2023

I'm not saying it's not a bug, but also no one is working on fixing it, so 🤷‍♂️ Trying to set the correct expectations, it’s the kind of stuff that can be worked around, a fix is unlikely.

Can you link me to the hook API? Not sure if I'm familiar.

https://nodejs.org/api/module.html#hooks

And will it solve this problem?

Maybe? At least that's the goal of that API

@privatenumber
Copy link
Contributor Author

Ah, I thought that was referred to as the Loader API (is there a difference I'm missing?).

In any case, I'm not sure if that advice is actionable. The Hooks API currently doesn't seem to handle CJS loading. And the problem here is that Node is bypassing both ESM hooks and the CJS loader by directly reading from disk.

But I do agree that handling CJS exports seems to be one of the goals of that API, which is why I proposed moving the issue over to the Loaders repo: nodejs/loaders#175

@aduh95
Copy link
Contributor

aduh95 commented Nov 20, 2023

Ah, I thought that was referred to as the Loader API (is there a difference I'm missing?).

Yeah sorry, it’s going through a rename as the term “loader” was a bit too loaded.

In any case, I'm not sure if that advice is actionable. The Hooks API currently doesn't seem to handle CJS loading.

The Hooks API does support CJS loading though, with a few limitations but maybe they won’t be a problem for you.

And the problem here is that Node is bypassing both ESM hooks and the CJS loader by directly reading from disk.

So the bug is that the ESM loader needs to “guess” what are the exports of the imported CJS file before evaluating it. The only source it has is the FS, there’s no way for it to guess that you are going to call Module._compile with a different source in the next tick.
Note that this issue is with the ESM loader, but is not related to the hooks API, so I guess it’s good we’re using different names for those two related yet different two things.

But I do agree that handling CJS exports seems to be one of the goals of that API, which is why I proposed moving the issue over to the Loaders repo: nodejs/loaders#175

The issue you linked to is a duplicate of this one, and doesn’t show a problem with the Hooks API (to my understanding, this other repo is really about designing the Hooks API, despite using the old name, not for tracking issue with the current ESM implementation). If you try to use the Hooks API and find a limitation that’s blocking you, it would make perfect sense to open an issue about it over there (although this repo would also be valid if you’re reporting a bug rather than an API design problem).

@privatenumber
Copy link
Contributor Author

Gotcha! Thanks for the careful explanation and all the info across these issues @aduh95

I just tested this out with the Hooks API now that it supports CJS loading, and it works great!:
https://stackblitz.com/edit/stackblitz-starters-tzjekf?file=hooks.mjs
(StackBlitz doesn't have latest Node so download to run)

Will leave this issue open though since it's still a bug in the ESM loader and for major versions without an alternative approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants