The divergent specifier hazard and how to prevent or avoid it #2

Closed
wants to merge 3 commits into from
361 changes: 320 additions & 41 deletions doc/api/esm.md
@@ -219,46 +219,8 @@ The `"main"` field can point to exactly one file, regardless of whether the
package is referenced via `require` (in a CommonJS context) or `import` (in an
ES module context).

#### Compatibility with CommonJS-Only Versions of Node.js

Prior to the introduction of support for ES modules in Node.js, it was a common
pattern for package authors to include both CommonJS and ES module JavaScript
sources in their package, with `package.json` `"main"` specifying the CommonJS
entry point and `package.json` `"module"` specifying the ES module entry point.
This enabled Node.js to run the CommonJS entry point while build tools such as
bundlers used the ES module entry point, since Node.js ignored (and still
ignores) `"module"`.

Node.js can now run ES module entry points, but it remains impossible for a
package to define separate CommonJS and ES module entry points. This is for good
reason: the `pkg` variable created from `import pkg from 'pkg'` is not the same
singleton as the `pkg` variable created from `const pkg = require('pkg')`, so if
both are referenced within the same app (including dependencies), unexpected
behavior might occur.

There are two general approaches to addressing this limitation while still
publishing a package that contains both CommonJS and ES module sources:

1. Document a new ES module entry point that’s not the package `"main"`, e.g.
`import pkg from 'pkg/module.mjs'` (or `import 'pkg/esm'`, if using [package
exports][]). The package `"main"` would still point to a CommonJS file, and
thus the package would remain compatible with older versions of Node.js that
lack support for ES modules.

1. Switch the package `"main"` entry point to an ES module file as part of a
breaking change version bump. This version and above would only be usable on
ES module-supporting versions of Node.js. If the package still contains a
CommonJS version, it would be accessible via a path within the package, e.g.
`require('pkg/commonjs')`; this is essentially the inverse of the previous
approach. Package consumers who are using CommonJS-only versions of Node.js
would need to update their code from `require('pkg')` to e.g.
`require('pkg/commonjs')`.

Of course, a package could also include only CommonJS or only ES module sources.
An existing package could make a semver major bump to an ES module-only version,
that would only be supported in ES module-supporting versions of Node.js (and
other runtimes). New packages could be published containing only ES module
sources, and would be compatible only with ES module-supporting runtimes.
To define separate package entry points for use by `require` and by `import`,
see [Conditional Exports][].

### Package Exports

@@ -395,6 +357,322 @@ package in use in an application, which can cause a number of bugs.
Other conditions such as `"browser"`, `"electron"`, `"deno"`, `"react-native"`
etc. could be defined in other runtimes or tools.

### Dual CommonJS/ES Module Packages

Prior to the introduction of support for ES modules in Node.js, it was a common
pattern for package authors to include both CommonJS and ES module JavaScript
sources in their package, with `package.json` `"main"` specifying the CommonJS
entry point and `package.json` `"module"` specifying the ES module entry point.
This enabled Node.js to run the CommonJS entry point while build tools such as
bundlers used the ES module entry point, since Node.js ignored (and still
ignores) the top-level `"module"` field.

Node.js can now run ES module entry points, and using [conditional exports][] it
is possible to define separate package entry points for CommonJS and ES module
consumers. Unlike in the scenario where `"module"` is only used by bundlers, or
ES module files are transpiled into CommonJS on the fly before evaluation by
Node.js, the files referenced by the ES module entry point are evaluated as ES
modules.

#### Hazards

When a specifier such as `'pkg'` resolves to different files when referenced via
`require` and `import`, there is a risk of certain bugs that only occur under
these conditions. For example:

<!-- eslint-skip -->
```js
// ./node_modules/pkg/package.json
{
  "type": "module",
  "main": "./pkg.cjs",
  "exports": {
    ".": {
      "module": "./pkg.mjs",
      "node": "./pkg.cjs"
    }
  }
}
```

In this example, `require('pkg')` always resolves to `pkg.cjs`, including in
versions of Node.js where ES modules are unsupported. In versions of Node.js
where ES modules are supported, `import 'pkg'` resolves to `pkg.mjs`.
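
The lookup described above can be sketched as a first-match walk over the keys
of the `"exports"` entry. This is a simplified illustration, not the resolver
Node.js implements: `resolveConditional` is a hypothetical helper, and the
condition lists used for `import` (`module`, then `node`) and for `require`
(`node` only) are assumptions inferred from the example.

```js
// Hypothetical helper, not a Node.js API: walk the keys of a conditional
// exports entry in order and return the target of the first key that is
// among the active conditions.
function resolveConditional(entry, activeConditions) {
  if (typeof entry === 'string') return entry;
  for (const condition of Object.keys(entry)) {
    if (activeConditions.includes(condition)) {
      const resolved = resolveConditional(entry[condition], activeConditions);
      if (resolved !== undefined) return resolved;
    }
  }
  return undefined; // no condition matched
}

const exportsField = {
  '.': { module: './pkg.mjs', node: './pkg.cjs' }
};

// Assumed condition sets for the two module systems.
const importTarget = resolveConditional(exportsField['.'], ['module', 'node']);
const requireTarget = resolveConditional(exportsField['.'], ['node']);

console.log(importTarget);  // ./pkg.mjs
console.log(requireTarget); // ./pkg.cjs
```

Because the first matching key wins, the order of the keys in `package.json`
matters in this sketch.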

The potential for bugs comes from the fact that the `pkg` created by `const pkg
= require('pkg')` is not the same as the `pkg` created by `import pkg from
'pkg'`. This is the “divergent specifier hazard,” where one specifier (`'pkg'`)
resolves to separate files (`pkg.cjs` and `pkg.mjs`) in separate module systems,
yet both versions might get loaded within an application because Node.js
supports intermixing CommonJS and ES modules.

Looked at another way, `import pkg from 'pkg'` is a shorthand for `import pkg
from './node_modules/pkg/pkg.mjs'` and `const pkg = require('pkg')` is a
shorthand for `const pkg = require('./node_modules/pkg/pkg.cjs')`. Because the
file paths in the two statements are different, the two `pkg` singletons are
different.
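
The effect of loading the same source from two different paths can be
reproduced with a short script. This is a sketch under stated assumptions: the
file names `pkg-a.cjs` and `pkg-b.cjs` are hypothetical, and both files are
CommonJS so that the demonstration stays synchronous. In a real dual package
the second file would be the ES module entry point.

```js
const fs = require('fs');
const os = require('os');
const path = require('path');

// Save identical source under two paths in a temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'divergent-'));
const source = 'module.exports = class Pkg {};\n';
const pathA = path.join(dir, 'pkg-a.cjs');
const pathB = path.join(dir, 'pkg-b.cjs');
fs.writeFileSync(pathA, source);
fs.writeFileSync(pathB, source);

const first = require(pathA);
const second = require(pathA); // same path: served from the module cache
const other = require(pathB);  // different path: evaluated separately

const samePathIsSameObject = first === second;
const differentPathIsSameObject = first === other;

console.log(samePathIsSameObject);      // true
console.log(differentPathIsSameObject); // false
```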

A strict equality comparison (`===`) of the two returns `false`, an
`instanceof` check of an object created by one against a class exported by the
other returns `false`, and properties added to one (like `pkg.foo = 3`) are not
present on the other. One ES module file can have `import a from 'pkg'` and
another can have `import b from 'pkg'`, and `a === b` returns `true`. That
would not be the case when comparing the `a` from `import a from 'pkg'` with
the `b` from `const b = require('pkg')`. This differs from how `require` and
`import` statements work in all-CommonJS or all-ES module environments,
respectively, and therefore is surprising to users. It also differs from the
behavior users have grown familiar with over the last several years when using
transpilation via tools like Babel or [`esm`][].

It’s not enough to refer to `'pkg'` using only `require` or only `import` within
an application; all of that application’s dependencies also need to use the same
method or the hazard is still present. For example, if an application uses `pkg`
and `pkg-plugin`, and the application references `pkg` via `import` and
`pkg-plugin` references `pkg` via `require`, both versions of `pkg` are
therefore loaded.

#### Writing Dual Packages While Avoiding or Minimizing Hazards

First, the hazard described in the previous section only occurs when a package
contains both CommonJS and ES module sources and both sources are provided for
use in Node.js, either via separate main entry points or exported paths. A
package could instead be written where any version of Node.js receives only
CommonJS sources, and any separate ES module sources the package may contain
could be intended only for other environments such as browsers. Such a package
would be usable by any version of Node.js, since `import` can refer to CommonJS
files.

A disadvantage of such an approach is that the package would not provide named
exports for `import`, which means that instead of `import { name } from 'pkg'`
users would need to write `import pkg from 'pkg'; pkg.name`.

A package could also switch from CommonJS to ES module syntax in a breaking
change version bump. This has the obvious disadvantage that the newest version
of the package would only be usable in ES module-supporting versions of Node.js.

Every pattern has tradeoffs, but there are two broad approaches that satisfy the
following conditions:

1. The package is usable via both `require` and `import`.
1. The package is usable in both current Node.js and older versions of Node.js
that lack support for ES modules.
1. The package main entry point, e.g. `'pkg'`, can be used both by `require`,
resolving to a CommonJS file, and by `import`, resolving to an ES module file.
(And likewise for exported paths, e.g. `'pkg/feature'`.)
1. The package provides named exports, e.g. `import { name } from 'pkg'`.
1. The package is potentially usable in other ES module environments such as
browsers.
1. The hazards described in the previous section are avoided or minimized.

##### Approach #1: Use an ES Module Wrapper

Write the package in CommonJS or transpile ES module sources into CommonJS, and
create an ES module wrapper file that defines the named exports. Using
conditional exports, the ES module wrapper is used for `import` and the CommonJS
entry point for `require`. A separate export provides ES module sources for
users who know they don’t need to worry about the CommonJS version of the
package being used elsewhere in the application, such as by dependencies.

<!-- eslint-skip -->
```js
// ./node_modules/pkg/package.json
{
  "type": "module",
  "main": "./index.cjs",
  "exports": {
    ".": {
      "module": "./wrapper.mjs",
      "node": "./index.cjs"
    },
    "./module": "./index.mjs"
  }
}
```

```js
// ./node_modules/pkg/index.cjs
exports.name = 'value';
```

```js
// ./node_modules/pkg/wrapper.mjs
import cjsModule from './index.cjs';
export const name = cjsModule.name;
```

```js
// ./node_modules/pkg/index.mjs
export const name = 'value';
```

In this example, the `name` from `import { name } from 'pkg'` is the same value
as the `name` from `const { name } = require('pkg')`, because the wrapper
re-exports it from the CommonJS module instead of defining its own copy. If
`name` were an object or a class, an `===` or `instanceof` comparison across
the two module systems would behave as it does within a single module system,
and so the divergent specifier hazard is avoided.
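
For a package with many exports, the wrapper file could be generated at build
time instead of written by hand. The following is a minimal sketch of that
idea: `buildWrapperSource` is a hypothetical helper, not part of Node.js, and
it assumes every name passed in is a valid JavaScript identifier that exists on
the CommonJS module's exports.

```js
// Hypothetical build-time helper: produce the text of an ES module wrapper
// that re-exports the listed names from a CommonJS entry point.
function buildWrapperSource(cjsPath, exportNames) {
  const lines = [`import cjsModule from ${JSON.stringify(cjsPath)};`];
  for (const name of exportNames) {
    lines.push(`export const ${name} = cjsModule.${name};`);
  }
  return lines.join('\n') + '\n';
}

const wrapperSource = buildWrapperSource('./index.cjs', ['name']);
console.log(wrapperSource);
// import cjsModule from "./index.cjs";
// export const name = cjsModule.name;
```

The generated text would then be written to `wrapper.mjs` as part of the
package's build step.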

This approach is appropriate for any of the following use cases:
* The package is currently written in CommonJS and the author would prefer not
to refactor it into ES module syntax, but wishes to provide named exports for
ES module consumers.
* The package has other packages that depend on it, and the end user might
install both this package and those other packages. For example a `utilities`
package is used directly in an application, and a `utilities-plus` package
adds a few more functions to `utilities`. Because the wrapper exports
underlying CommonJS files, it doesn’t matter if `utilities-plus` is written in
CommonJS or ES module syntax; it will work either way.
* The package stores internal state, and the package author would prefer not to
refactor the package to isolate its state management. See the next section.

If the user is certain that the CommonJS version will not be loaded anywhere in
the application, such as by dependencies; or if the CommonJS version can be
loaded but doesn’t affect the ES module version (for example, because the
package is stateless); then the user could instead use `import { name } from
'pkg/module'` to load the ES module version directly as opposed to the wrapped
CommonJS version.

##### Approach #2: Isolate State

The most straightforward `package.json` would be one that defines the separate
CommonJS and ES module entry points directly:

<!-- eslint-skip -->
```js
// ./node_modules/pkg/package.json
{
  "type": "module",
  "main": "./index.cjs",
  "exports": {
    ".": {
      "module": "./index.mjs",
      "node": "./index.cjs"
    }
  }
}
```

This can be done if both the CommonJS and ES module versions of the package are
equivalent, for example because one is the transpiled output of the other; and
the package’s management of state is carefully isolated (or the package is
stateless).

State is an issue because both the CommonJS and ES module
versions of the package may get used within an application; for example, the
user’s application code could `import` the ES module version while a dependency
`require`s the CommonJS version. If that were to occur, two copies of the
package would be loaded in memory and therefore two separate states would be
present. This would likely cause hard-to-troubleshoot bugs.
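
The duplicated state can be simulated without installing a package. In this
sketch each call to the hypothetical `evaluatePkg` function stands in for
Node.js evaluating one copy of the package, and `setLocale` and `getLocale` are
invented names for functions that touch module-level state.

```js
// Simulation only: each call stands in for evaluating one copy of the
// package (index.cjs or index.mjs), each with its own module-level state.
function evaluatePkg() {
  const settings = { locale: 'en' }; // module-level state
  return {
    setLocale(locale) { settings.locale = locale; },
    getLocale() { return settings.locale; }
  };
}

const pkgSeenByApp = evaluatePkg();        // stands in for: import ... from 'pkg'
const pkgSeenByDependency = evaluatePkg(); // stands in for: require('pkg')

// The application configures the copy it imported.
pkgSeenByApp.setLocale('fr');

console.log(pkgSeenByApp.getLocale());        // fr
console.log(pkgSeenByDependency.getLocale()); // en
```

The dependency still sees the default because the change was applied to the
other copy's state.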

Aside from writing a stateless package (if JavaScript’s `Math` were a package,
for example, it would be stateless as all of its methods are static), there are
three ways to isolate state so that it’s shared between the potentially loaded
CommonJS and ES module instances of the package:

1. If possible, contain all state within an instantiated object. JavaScript’s
`Date`, for example, needs to be instantiated to contain state; if it were a
package, it would be used like this:

```js
import date from 'date';
const someDate = new date();
// someDate contains state; date does not
```

Since the state is contained within an object instantiated from the package
(`someDate` in this example) rather than the package itself, an application
using this package would pass around references to the instantiated object
when an object with that state is desired. In other words, this file would
`export` `someDate`, and other files in the application would `import` that
rather than the package `date`, unless those other files wanted to create new
objects with separate states. Note also that `new` isn’t required; a
package’s function can also return a new object, or modify a passed-in
object, to keep the state external to the package.

1. Isolate the state in one or more CommonJS files that are shared between the
CommonJS and ES module versions of the package. For example, if the CommonJS
and ES module entry points are `index.cjs` and `index.mjs`, respectively:

```js
// ./node_modules/pkg/state.cjs
module.exports = {
  cache: []
};
```

```js
// ./node_modules/pkg/index.cjs
const state = require('./state.cjs');
module.exports.state = state;
```

```js
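// ./node_modules/pkg/index.mjs
// The default import of a CommonJS file is its module.exports object, so this
// re-exports the same object that index.cjs exposes.
import state from './state.cjs';
export { state };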
```

Even if `pkg` is used via both `require` and `import` in an application (for
example, via `import` in application code and via `require` by a dependency)
each reference to `pkg` will contain the same state; and modifying that
state from either module system will apply to both.

A package utilizing this pattern would not be usable as is in browsers or
other environments that lack support for CommonJS.

1. Write a package where state is stored globally. This is similar to the
previous approach, but instead of isolating state within a shared CommonJS
file it is attached to the global object, e.g.
`globalThis[Symbol.for('pkg@1.2.3')]`. For example, if the CommonJS and ES
module entry points are `index.cjs` and `index.mjs`, respectively:

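```js
// ./node_modules/pkg/index.cjs
const key = Symbol.for('pkg@1.2.3');
// Create the shared state on first load; reuse it afterwards.
globalThis[key] = globalThis[key] || {};
module.exports.state = globalThis[key];
```

```js
// ./node_modules/pkg/index.mjs
const key = Symbol.for('pkg@1.2.3');
globalThis[key] = globalThis[key] || {};
export const state = globalThis[key];
```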

Like the previous approach, if `pkg` is used via both `require` and `import`
in an application (for example, via `import` in application code and via
`require` by a dependency) each reference to `pkg` will contain the same
state; and modifying that state from either module system will apply to both.

This has the disadvantage of polluting the global namespace, but it is
compatible with non-CommonJS environments such as browsers.
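
The three patterns above can be exercised together in one script. This is a
sketch under stated assumptions: `createCounter`, `loadCopyOfPkg`, the
`entry-one.cjs` and `entry-two.cjs` file names and the `pkg-demo@0.0.0` symbol
key are all hypothetical, and both entry points are CommonJS so that the script
stays synchronous (in a real dual package one of them would be `index.mjs`).

```js
const fs = require('fs');
const os = require('os');
const path = require('path');

// Pattern 1: state lives in objects the caller creates, not in the package.
function createCounter() {
  return { count: 0 };
}
const counterA = createCounter();
const counterB = createCounter();
counterA.count += 1;
console.log(counterA.count, counterB.count); // 1 0

// Pattern 2: two entry points share one CommonJS state file.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'shared-state-'));
fs.writeFileSync(path.join(dir, 'state.cjs'),
                 'module.exports = { cache: [] };\n');
const entrySource = "exports.state = require('./state.cjs');\n";
fs.writeFileSync(path.join(dir, 'entry-one.cjs'), entrySource);
fs.writeFileSync(path.join(dir, 'entry-two.cjs'), entrySource);
const entryOne = require(path.join(dir, 'entry-one.cjs'));
const entryTwo = require(path.join(dir, 'entry-two.cjs'));
entryOne.state.cache.push('written via entry-one');
console.log(entryTwo.state.cache); // [ 'written via entry-one' ]

// Pattern 3: state hangs off a well-known global symbol. Each call stands in
// for one copy of the package being evaluated.
function loadCopyOfPkg() {
  const key = Symbol.for('pkg-demo@0.0.0');
  globalThis[key] = globalThis[key] || { cache: [] };
  return { state: globalThis[key] };
}
const copyOne = loadCopyOfPkg();
const copyTwo = loadCopyOfPkg();
copyOne.state.cache.push('written via copyOne');
console.log(copyTwo.state.cache); // [ 'written via copyOne' ]
```

In patterns 2 and 3 a write made through one copy is visible through the
other, which is the property a dual package needs.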

For packages using this approach, regardless of which state management pattern
is chosen, an `instanceof` comparison would return `false` when comparing the
CommonJS and ES module versions of the package or of objects instantiated from
each version. End users of such packages need to be aware of this and avoid
comparing identity in mixed-module system environments, or check against both
versions:

```js
import { createRequire } from 'module';
const require = createRequire(import.meta.url);

import pkgEsModule from 'pkg';
const pkgCommonJs = require('pkg');

export const instanceofPkg = (instantiatedPkg) => {
  return instantiatedPkg instanceof pkgEsModule ||
    instantiatedPkg instanceof pkgCommonJs;
};
```

Any plugins that attach to the package’s singleton likewise would need to
separately attach to both the CommonJS and ES module singletons.
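
Both points can be tried out in a simulation. In this sketch the hypothetical
`evaluatePkg` function stands in for loading one copy of a class-exporting
package, and `attachPlugin` is an invented plugin that adds a method to the
class it is given.

```js
// Simulation only: each call stands in for one loaded copy of the package.
function evaluatePkg() {
  return class Pkg {};
}
const pkgEsModule = evaluatePkg(); // stands in for: import pkgEsModule from 'pkg'
const pkgCommonJs = evaluatePkg(); // stands in for: require('pkg')

// Identity check that accepts instances created by either copy.
const instanceofPkg = (value) =>
  value instanceof pkgEsModule || value instanceof pkgCommonJs;

// A hypothetical plugin has to be attached to each copy separately.
function attachPlugin(Pkg) {
  Pkg.prototype.describe = function describe() {
    return 'plugin attached';
  };
}
[pkgEsModule, pkgCommonJs].forEach(attachPlugin);

const fromCjs = new pkgCommonJs();
console.log(fromCjs instanceof pkgEsModule); // false
console.log(instanceofPkg(fromCjs));         // true
console.log(fromCjs.describe());             // plugin attached
```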

This approach is appropriate for any of the following use cases:
* The package is currently written in ES module syntax and the package author
wants that version to be used wherever such syntax is supported.
* The package is stateless or its state can be isolated without too much
difficulty.
* The package is unlikely to have other public packages that depend on it, or if
it does, the package is stateless or has state that need not be shared between
dependencies or with the overall application.

## <code>import</code> Specifiers

### Terminology
@@ -1068,13 +1346,14 @@ success!
[Terminology]: #esm_terminology
[WHATWG JSON modules specification]: https://html.spec.whatwg.org/#creating-a-json-module-script
[`data:` URLs]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URIs
[`esm`]: https://github.com/standard-things/esm#readme
[`export`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/export
[`import()`]: #esm_import-expressions
[`import.meta.url`]: #esm_import_meta
[`import`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import
[`module.createRequire()`]: modules.html#modules_module_createrequire_filename
[`module.syncBuiltinESMExports()`]: modules.html#modules_module_syncbuiltinesmexports
[package exports]: #esm_package_exports
[conditional exports]: #esm_conditional_exports
[dynamic instantiate hook]: #esm_dynamic_instantiate_hook
[special scheme]: https://url.spec.whatwg.org/#special-scheme
[the official standard format]: https://tc39.github.io/ecma262/#sec-modules