Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: In an ESM-first mode, how should a package.json file with no type field be handled? #49494

Open
GeoffreyBooth opened this issue Sep 4, 2023 · 13 comments
Labels
esm Issues and PRs related to the ECMAScript Modules implementation. module Issues and PRs related to the module subsystem.

Comments

@GeoffreyBooth
Copy link
Member

GeoffreyBooth commented Sep 4, 2023

Building off of #49432, #49295 (comment) and #31415, we’re considering a new mode, probably enabled by flag, where all of the current places where Node defaults to CommonJS would instead default to ESM. One of the trickiest questions to answer for defining such a new mode is how to handle package.json files that lack a type field.

A package.json file, whether or not it contains a type field, defines a “package scope”: the folder that the package.json file is in, and all subfolders that don’t themselves contain a package.json file. Within this package scope, currently a package.json containing "type": "module" will cause .js files to be interpreted as ES modules; a package.json file containing "type": "commonjs" or no "type" field will cause .js files to be interpreted as CommonJS modules.

In a naïve “just flip all the defaults” implementation, where one literally goes through the Node codebase and symmetrically reverses everywhere that we default to CommonJS to instead default to ESM, a package.json lacking a type field would cause all the files in that scope to be treated as ES modules. The problem with this is that there are lots of popular dependencies that people install that have no type field. Most real-world apps would fail to run in an ESM-first mode that behaved this naïve way, because it would be rare for every dependency of an app to contain a package.json with a type field.

To make an ESM-first mode that’s actually usable, we need to find a solution for this problem. As I see things, the solutions fall into two categories: one where we preserve the pure symmetrical “no type = ESM” behavior, and one where we don’t. Here’s a running list that I’ll update if people suggest additional ideas:

1. Preserving symmetry: no type field is interpreted as ESM

  • a. After installing packages, users would run a script that patched any dependencies’ package.json files to add "type": "commonjs" wherever the type field wasn’t specified.

    • This might upset package authors, as their packages are getting modified by the user which might cause bugs. This also goes against a lot of the work the npm team has done in recent years around reproducible builds and immutable packages. Ideally a patch script such as this would be part of the npm install command and its equivalents in other managers, but it’s probably unlikely that we would get support for such an approach from many (or any?) of the popular package managers.
  • b. After installing packages, users would run a script to warn them if any of their packages lack a type field. (Or as part of the package installation command, the command would error on attempting to install any type-less package. This would presumably be an option that users would enable.) Then users would presumably uninstall that package in favor of some alternative (or choose to patch it).

    • This would result in user-driven pressure on package authors to republish their packages with the type field added, even if just "type": "commonjs", which it’s probably safe to assume would rankle many package authors. Besides those authors complaining that we’ve pushed a requirement onto them, they may reasonably argue that a type field makes no sense for packages intended for non-Node environments such as browsers.
    • This only really works when adding a particular package for the first time to an app. If a user is trying to migrate an old app to run in ESM-first mode, any type-less package would need to be patched or upgraded to the latest version (assuming the author has kindly published a new version with the field). This option creates a lot of friction for both users and package authors.
  • c. The npm registry starts requiring the type field in order for packages to be published, just as they already require the name and version fields.

    • I think it’s unlikely that the npm folks agree to this, as there are countless packages not intended for use in Node, and it would be unclear what type field value those packages should have; and npm surely doesn’t want to put themselves in the position of getting many package authors angry at them.
    • This still doesn’t solve the problem of what to do about all the existing packages published without a type field, as the npm registry strictly disallows old packages to be modified.

2. Different behavior in ESM-first mode

Assuming that no workable option for preserving symmetry is found, the question then becomes “okay, now what?” Here are some options, that I’ll update as people comment:

  • a. Keep current behavior. All type-less package scopes are still treated as CommonJS.

    • This means we aren’t really ESM-first, at least not fully, and we fail to achieve one of the primary goals of providing an ESM-first mode: that a new user can download Node and just start coding using ESM syntax without needing to opt into it somehow.
  • b. Error on type-less packages.

    • This would be the runtime equivalent to the option above where the package manager errored or warned on installing a package that lacked a type field. It presents the same problems: users would need to patch, or pressure the package authors to update their packages.
  • c. Under a node_modules folder, type-less packages are treated as CommonJS; but are treated as ESM otherwise. This follows the precedent that the ESM resolution algorithm already special-cases folders named node_modules.

    • This would come quite close to a “just works” experience: users could run the current npm init, assuming it never changes, and still write ESM code for their app without needing to enable it somehow; and the user could npm install any dependency which should “just work” like today.
    • Package managers other than npm might need to adjust to support this behavior, if they don’t save packages in subfolders under node_modules. Package managers that save packages in a cache folder might create a node_modules folder at the root of their cache and put the packages one level down inside that, for example. But some package managers might have trouble working with this behavior.
  • d. The “no type means ESM” behavior only applies to the package scope of the entry point. So in an app folder with app/package.json (that lacks a type field) and app/entry.js, running node --experimental-flag-for-esm-first-mode-name-tbd entry.js would interpret entry.js and any other files in that package scope as ESM, but all other package scopes anywhere on the disk would be interpreted as CommonJS.

    • Compared to the previous option this might fix some of the non-npm package managers, but at the cost of the app possibly not being statically analyzable by tools. It would be ambiguous whether a particular package scope will be the one that an entry point uses and would therefore be acquiring this new behavior, unlike “is it under a folder named node_modules“ which is easily determined by external tools.

Any other ideas? Or additional pros/cons to any of these suggestions. @LiviaMedeiros @nodejs/loaders @nodejs/wasi @nodejs/tsc

@GeoffreyBooth GeoffreyBooth added module Issues and PRs related to the module subsystem. esm Issues and PRs related to the ECMAScript Modules implementation. labels Sep 4, 2023
@ljharb
Copy link
Member

ljharb commented Sep 5, 2023

that a new user can download Node and just start coding using ESM syntax without needing to opt into it somehow.

they already can; the file just needs the .mjs extension. If the .js extension not being ESM is a problem for that, then so is the .cjs extension not being ESM (just to forestall that rebuttal)

@guybedford
Copy link
Contributor

guybedford commented Sep 5, 2023

Thanks @GeoffreyBooth for following these threads. I've had a couple of conversations recently where people unfamiliar with JS that weren't naturally guided to the step of just setting up a { "type": "module" } package.json. Given that npm isn't prioritizing making this easier, and given that for a beginner it's still not an obvious path to first-class ESM, I tend to think that this is becoming important for the project to show it provides a first-class ESM experience out of the box.

I also think the CLI / extensionless use case is important to finally crack, either as part of this or separately. When adopting a new top-level default it does give us the option to change the legacy compat paths to solve it.

My preference would be one of 2. (c) or 2. (d):

  1. c) Under a node_modules folder, type-less packages are treated as CommonJS; but are treated as ESM otherwise. This follows the precedent that the ESM resolution algorithm already special-cases folders named node_modules.

Or

  1. d) The “no type means ESM” behavior only applies to the package scope of the entry point.

With either, a node --type=module cli.js just working as ES modules, or being able to write a new app and use node --type=module app.js, where that flag has a future to be made the default in a future version (after subsequent deprecations, and leaving behind node --type=commonjs app for legacy entry points) is a very compelling case.

Even if such a process takes many years, setting up the path now with a flag would be very valuable. And even if we don't get it right first time, we can at least figure out what flag works best for users by iterating on this work given it will be entirely experimental and optional. Let most users ignore it for the however many years it takes to get it stabilized, but it would very much close the loop on the ESM integration IMO.

@mcollina
Copy link
Member

mcollina commented Sep 5, 2023

Thanks for the great writeup @GeoffreyBooth.

I think purposefully breaking modules that have been working untouched for years is against what our users expect from us.
It's going to break every application, npx workflow, and generic usage of Node.js. It's going to create a massive churn for maintainers, frustrating them further and pushing them away.

Here is an interesting data point: developers still use readable-stream v2.3.7 (60 millions weekly), v2.3.8 (35 millions downloads weekly), v3.6.2 (30 millions downloads weekly), while v4 is at 3 millions downloads.
Dependencies are pinned somewhere in the chain, and they do not update. The massive ecosystem in the registry is what made Node.js special: let's not burn it.

If we leave the default "type": "commonjs", shipping a ESM-first mode will require a change to npm init that should (ideally) get its default from the runtime itself.

I think they will be open to that case, given that we are protecting their users. Worst case scenario, we can easily we can patch it like we do with V8.
We are npm primary distribution channel after all and only few users update it outside of Node.js.

In other terms: let's not break everybody, please.

@GeoffreyBooth
Copy link
Member Author

My preference would be one of 2. (c) or 2. (d):

I think I’m leaning toward 2-c, where typeless package.json files under node_modules keep the legacy CommonJS behavior but package.json files elsewhere get the new behavior. This both preserves backward compatibility for dependencies while still achieving the “do nothing to opt in” goal. I think it’s slightly preferable to 2-d, only special-casing the entry point’s scope, because of the static analyzability. It might have slightly more performance impact to check for node_modules, but I think we’re pulling the full path of every package.json anyway and it’s trivial to see if that path contains a node_modules segment? Which we might already be checking for already as part of the existing resolution algorithm, I’m not sure.

There are two variations we could add onto this:

  1. In this mode, if a package.json file is present, type is required. So a new user starting a project and running npm init as it exists today and then creating app.js and running node --new-flag-tbd app.js would get an error that they needed to add a type field to their package.json. They do so and continue on. (Or ideally npm init finally gets updated, but I don’t think we should hang all our hopes on that not least because who knows how users might be initializing projects: they could run a command like npx create-foo-app or paste in an example package.json from a tutorial, etc.)

  2. In this mode, if the entry point triggers a SyntaxError, we print an error message like “If app.js is a CommonJS file, using require syntax, add "type": "commonjs" to ./package.json“. This is less forceful than mandating a type field, so the case of the new user writing ESM from scratch avoids any interruption; and it’s the same amount of friction for application or package authors upgrading CommonJS apps. We could even try to get a bit intelligent with our error, like if the error is undefined require then suggest "type": "commonjs", if it’s unrecognized keyword import then suggest "type": "module", etc., with a generic fallback when we can’t tell.

@ljharb
Copy link
Member

ljharb commented Sep 5, 2023

What happens in this proposal if a file, in a typeless project, is a symlink outside of node_modules, pointed to a file inside node_modules inside a typeless package dir?

@ljharb
Copy link
Member

ljharb commented Sep 5, 2023

Also, since type module doesn't make extensionless files ESM, would this mean that installed packages with an executable could never use extensionless ESM, only the top-level project? (i'd thought the primary motivating use case for this recent discussion was "extensionless files as ESM")

@mcollina
Copy link
Member

mcollina commented Sep 8, 2023

Thinking about this after a few night of sleep, a slightly better solution is to require type to be set (or at least warn to do so). Otherwise we could end up a situation that a maintainer author a module as ESM and then it's interpreted as commonjs.

@GeoffreyBooth
Copy link
Member Author

Otherwise we could end up a situation that a maintainer author a module as ESM and then it’s interpreted as commonjs.

It’s a question of who we want to inconvenience: the package author who forgets to test their package as imported by an app, or the new user trying to follow a tutorial and potentially not knowing what a package.json file is. Personally I think we should prioritize the new users, as package authors are by definition more experienced in the platform and should be more aware of what they need to do; and their package failing to parse is such an obvious failure that it would be noticed immediately: either by them, or the first person who tries to use their package after it’s published.

Ultimately either case could be completely addressed by a package manager. It’s package managers that edit package.json, after all, every time you run a command like npm install foo. When running npm install on a package lacking type, npm could prompt the user “Is the code in this folder written using import/export (ES module) syntax or require (CommonJS) syntax?” and add the appropriate type field. That would in practice mean that the field is almost always present, for both library authors and application developers.

@arcanis
Copy link
Contributor

arcanis commented Sep 8, 2023

It’s package managers that edit package.json, after all, every time you run a command like npm install foo. When running npm install on a package lacking type, npm could prompt the user “Is the code in this folder written using import/export (ES module) syntax or require (CommonJS) syntax?” and add the appropriate type field.

It's very unlikely we would implement something like that in Yarn. Packages are by default loaded directly from their cache archives, meaning we don't get a chance to "patch the package.json" (and it's in part by design, because we want users to use exactly the source files they've been given). Doing something like what you suggest would break this model, for what seems to be a very low gain imo.

@GeoffreyBooth
Copy link
Member Author

GeoffreyBooth commented Sep 8, 2023

Doing something like what you suggest would break this model, for what seems to be a very low gain imo.

Sure, and as I’ve said earlier, I think we can’t assume any cooperation from package managers (not out of ill intent, but because they have their own goals and they’re fragmented). My point is just that package managers have a role to play in ensuring good UX too; they could fix this problem very easily if they want to. For example, on publish a package manager could be like “hold up, you’re missing a type field, is that an accident?” That would be a much more targeted solution to the problem of accidentally publishing a typeless ESM package, and if npm were part of the Node project then I think we would be asking why enforce something at runtime that we could enforce at publish time.

Regardless, if we leave package managers out of it and focus just on Node, saving a package author from a bad release also feels like a very low gain, when the cost is making things harder for new users. (Which is a choice we need to make only if package managers do nothing.)

@ljharb
Copy link
Member

ljharb commented Sep 8, 2023

Both install and publish are often performed headlessly, so there's not much opportunity for prompting users there.

@arcanis
Copy link
Contributor

arcanis commented Sep 8, 2023

very easily

My point was: it would not be "very easily" 🙂

If you want to achieve the goal of allowing beginners to write .js scripts in ESM, I'd focus on scoping your research to this exact goal (ie only apply this logic if there isn't a package.json at all). Changing type semantics in any way is imo guaranteed make the migration to ESM more frustrating to people who actually work on Node.js projects, I don't imagine that as a good tradeoff.

For example, on publish a package manager could be like “hold up, you’re missing a type field, is that an accident?”

That's something I'd be more supportive about, since it doesn't break the existing ecosystem, only affecting future packages in a consistent fashion.

@GeoffreyBooth
Copy link
Member Author

That’s something I’d be more supportive about, since it doesn’t break the existing ecosystem, only affecting future packages in a consistent fashion.

I think part of this is that we’re just guessing how package managers respond; and how new users will act. But we don’t need to get it right on the first try: it’s an experimental flag. We can start with the “typeless = ESM” behavior at first and change it to the “type required” behavior later, before unflagging, once we see if (or how) package managers adapt to the existence of this new mode. I’m not strongly opposed to making Node require the type field, but it feels like a hammer to a solution that needs a mallet, and something that hopefully we won’t need at all if package managers find a way to ease this UX pain point. (And yes, they’re the ones best positioned to decide on that solution if there is one, however easy or hard it might be 😄 )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
esm Issues and PRs related to the ECMAScript Modules implementation. module Issues and PRs related to the module subsystem.
Projects
None yet
Development

No branches or pull requests

5 participants