Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Env flag for npm packages to conditionally load WASM binaries instead of native binaries #286

Open
EricSimons opened this issue Aug 4, 2021 · 25 comments

Comments

@EricSimons
Copy link
Member

EricSimons commented Aug 4, 2021

UPDATE

This proposal has changed and instead relies on import/export maps, and has officially landed in Node.js core.

I have kept the original (now outdated) proposal below for future reference:

view original (outdated) proposal

The problem

Web development toolchains are increasingly including binaries for compilers/bundlers as they provide substantially faster performance than JS equivalents. The ecosystem is also increasingly adopting WebAssembly for distributing & executing binaries as it provides far greater security guarantees, are directly inspectable, and "build once and run everywhere".

Today esbuild and swc are the most popular of these new tools, and both release a WASM version in a separate package in lockstep with their native binaries (esbuild-wasm and @swc/wasm). As such, consumers of these packages need to determine at runtime whether they should use the native binaries or the WASM binary.

Parcel v2 currently uses a custom flag process.env.PARCEL_SWC_WASM that indicates it should require the WASM binary and not the platform specific variant. Other toolchains that have adopted swc/esbuild (Vite, Next.js, Snowpack, etc) need a similar way of determining this as well, and custom flags for each package is not ideal.

Proposed Solution: process.env.REQUIRES_WASM

I propose we standardize on a new flag, process.env.REQUIRES_WASM, that instructs npm packages to use WASM binaries instead of unsandboxed platform specific binaries.

This is important for environments that require high security guarantees (i.e. targets of supply chain attacks like enterprises and OSS devs), WebAssembly based containers like WebContainer or kubelet, in-browser playgrounds, and other cases where native binaries cannot/should not be executed.

Indicates WASM is required, not just preferred.

Importantly, this flag does not indicate that the host environment prefers WASM. Instead it is an instruction that the host requires WASM binaries and cannot (or does not) allow execution of unsandboxed native binaries.

Intended functionality

If process.env.REQUIRES_WASM is true at runtime, npm packages should import WASM binaries (or a JS file generated via emscripten, etc) instead of corresponding platform specific binaries.

Should a package not have a WASM binary available (and/or can't polyfill the functionality otherwise), the program should throw an error if process.env.REQUIRES_WASM is present.

@EricSimons EricSimons changed the title Proposal: Env flag for npm packages to conditionally load WASM binaries instead of native binaries Spec Proposal: Env flag for npm packages to conditionally load WASM binaries instead of native binaries Aug 4, 2021
@EricSimons EricSimons changed the title Spec Proposal: Env flag for npm packages to conditionally load WASM binaries instead of native binaries RFC: Env flag for npm packages to conditionally load WASM binaries instead of native binaries Aug 4, 2021
@sokra
Copy link

sokra commented Aug 4, 2021

I would propose to use an exports/imports field condition instead of a (runtime) environment variable.

An exports field condition has the advantage that it's statically known instead of at runtime. This allows to avoid using import() and instead import ... from "..." can be used.

see also

@matthewp
Copy link

matthewp commented Aug 4, 2021

Great idea!

Another angle to this problem is: many of these tools rely on the postinstall hook of npm. One common problem they encounter is that if you run npm install --ignore-scripts, they postinstall hook won't be called and the native binaries won't be installed. Some times users have this on without knowing it (such as in an .npmrc) and file issues about it.

If we could convince npm to make this option the default it would push these types of tools to prioritize wasm and make it the primary way to use their tools.

@evanw
Copy link

evanw commented Aug 5, 2021

(I'm commenting here because I was pinged on Twitter as the author of esbuild, but I don't have any stake in this PR)

If process.env.REQUIRES_WASM is true at runtime, npm packages should import WASM binaries (or a JS file generated via emscripten, etc) instead of corresponding platform specific binaries.

Each binary executable is many megabytes. Solutions which require shipping binary executables for multiple platforms in the same package are very undesirable because then the weight of the package (installation time and disk size) would be huge. The reason esbuild's WebAssembly package is separate from the native package is to reduce the weight of the package. Having that be a run-time switch instead of an install-time switch would increase the weight of the package by a lot, and is likely not something I would ever consider doing for esbuild.

If we could convince npm to make this option the default it would push these types of tools to prioritize wasm and make it the primary way to use their tools.

This isn't ever going to happen for esbuild. WASM has a number of major drawbacks and I would never want it to be the primary way to interact with esbuild. If there were no drawbacks, there wouldn't be a reason to ship native executables. Some of the drawbacks:

  • AFAIK there's still no way to get node to cache the translated WASM assembly code so every time you run it, it's recompiled from scratch. This can be a huge performance penalty over native.

  • Node has a pretty bad bug where the WASM compilation must finish before node is allowed to exit: tiered WebAssembly compilation broken nodejs/node#36616. This causes short runs of large WASM programs (e.g. CLI scenarios) to take way, way longer than they should (up to 40x longer).

  • Depending on the scenario, WASM's 2gb address size limit can be prohibitive and lead to crashes or large slowdowns as available memory gets low. Native binaries do not have this restriction and generally have a better chance of working correctly.

In esbuild's case WASM can easily be 10x slower than native. WASM will never be the default implementation.

Another angle to this problem is: many of these tools rely on the postinstall hook of npm. One common problem they encounter is that if you run npm install --ignore-scripts, they postinstall hook won't be called and the native binaries won't be installed. Some times users have this on without knowing it (such as in an .npmrc) and file issues about it.

I'm currently experimenting with an alternative installation strategy for esbuild that uses optionalDependencies instead of a postinstall script. If it works, that would solve this particular problem without needing any new environment flags and without relying on WASM. See this post for more information: evanw/esbuild#789 (comment). The only drawback is that optionalDependencies doesn't provide a way to have a default "fallback" dependency if none of the optional dependencies apply. There's another proposal from Yarn that attempts to address that need: yarnpkg/berry#2751.

@matthewp
Copy link

matthewp commented Aug 5, 2021

Thanks @evanw, the optionalDependencies trick sounds interesting. Am I reading this right that it would install the wasm version for webcontainer (which needs to run in the browser)?

@matthewp
Copy link

matthewp commented Aug 5, 2021

Another idea, probably for the Node.js security team is, if there was a way to disable 3rd party packages from launching subprocesses that would help discourage the install-binary patterns. Deno has this with --allow-run (this applies to all code, not just 3rd party), and the result is that you see mostly wasm used there.

@matthewp
Copy link

matthewp commented Aug 5, 2021

@EricSimons Given the discussion here and elsewhere, I think standardizing on an export map condition could be the best immediate-impact solution here. stackblitz could run install with --condition=requires_wasm (bikeshed on name) and everything should just work.

@EricSimons
Copy link
Member Author

@sokra @matthewp I think export maps definitely could be a good solution here. It does diverge from how folks are currently handling this issue atm, as both Next and Parcel load the appropriate binary using runtime conditions. I'd be curious to see if these teams would have any reservations on this approach (I imagine Next would be on board as Tobias works on it, but also cc'ing @devongovett for the Parcel team's take).

@evanw
Copy link

evanw commented Aug 5, 2021

Thanks @evanw, the optionalDependencies trick sounds interesting. Am I reading this right that it would install the wasm version for webcontainer (which needs to run in the browser)?

No, it has nothing to do with WASM. I just mentioned it since it's a way of avoiding problems with --ignore-scripts and postinstall scripts. If you need to run esbuild in the browser, then you'll need to explicitly install the WASM version of esbuild instead of the native version.

@EricSimons
Copy link
Member Author

@evanw thanks for taking the time to respond here- some extra perspective/info that might be relevant:

Solutions which require shipping binary executables for multiple platforms in the same package are very undesirable because then the weight of the package (installation time and disk size) would be huge

AFAIK many npm packages are starting to ship included binaries because it removes the CPU time (and often errors) that happen on install and during CI jobs. It also eliminates the need for postinstall, which is ideal from a security POV. I do understand why you don't want to go this route for esbuild though.

Depending on the scenario, WASM's 2gb address size limit can be prohibitive and lead to crashes or large slowdowns as available memory gets low. Native binaries do not have this restriction and generally have a better chance of working correctly.

FYI that this is now 4gb as of May 2020, and wasm64 just hit stage 3 in June which will remove this limit. v8 has already begun work on implementation.

In esbuild's case WASM can easily be 10x slower than native. WASM will never be the default implementation.

As you pointed out in a previous issue, this is in part because esbuild-wasm can currently only run single threaded whereas the native variant has full multithreading. WebAssembly Threads are now available in Chrome/Edge/Firefox and enables high performance multithreading. We use this for WebContainers and the performance has been able to exceed local execution speeds. This might be worth exploring for esbuild-wasm.

Big fan of esbuild and would love to see/help it adopt the latest improvements for the webassembly version!

@devongovett
Copy link

devongovett commented Aug 5, 2021

Isn't this what the "browser" field in package.json is for? It's widely supported by pretty much every bundler, and allows substituting the whole module or specific files for the browser. We already use it for some of our native packages, and I think that was the plan for the rest as well. The environment variable we currently use for other packages is just a temporary hack I believe.

@EricSimons
Copy link
Member Author

@devongovett for bundling use cases definitely, but for environments that provide full Node runtimes requiring strong security guarantees (i.e. targets of supply chain attacks like enterprises and OSS devs, WebAssembly based containers like WebContainer or kubelet, etc) having the browser field affect Node's real resolution algorithm could be problematic for a few reasons:

  • Since the browser field is mostly used for bundling use cases, they often specify overrides that patch Node.js code in a way where it would run in a browser but not in Node
  • Ignoring the above, the host env would have to modify Node's resolution algorithm to account for this, which isn't ideal

So IMO given the advent and growth of WASM adoption in npm packages for local envs, It seems there's a bit of a middle ground we need to accomodate that the browser doesn't seem well suited for.

@devongovett
Copy link

Hmm interesting. Unfortunately, I think WASM has a couple drawbacks over native packages, which is why we distribute native binaries by default.

  • Like @evanw said, WASM is significantly slower in our experience. For our case, this is not due to multi-threading. In Parcel, threading is handled in JS using Node's worker_threads, and the native module runs within those workers (so, single threaded). We've seen WASM run up to 30x slower than native in some cases. In particular, the inability to pass buffers between JS and WASM without copying is a huge problem for us.
  • Some features currently don't work in WASM, e.g. anything involving the filesystem. For our browser REPL, we've reimplemented some of these in JS using a Memory FS layer, but that wouldn't really be the expected way to use Parcel in Node.

For those reasons, I don't think WASM would be the default way to use Parcel, but perhaps could be opt-in. The first issue is fine if you accept the perf hit, but the second I'm not sure how to solve.

@EricSimons
Copy link
Member Author

EricSimons commented Aug 5, 2021

I think opt-in would definitely be the right way to handle this. It seems like there's growing consensus that @sokra's idea of standardizing on an export map condition is the right path for a standardized opt-in, so if that sounds sensible I think I'm going to modify the original issue to reflect an export map condition proposal. Also happy to continue discussing alternative ideas otherwise though 👍

@devongovett
Copy link

Where do export conditions get standardized? In Node core? Looks like node only supports "import", "require", "node", and "default". One of the reasons Parcel has not yet implemented package exports support is that it's unclear to us how additional conditions get standardized so that they actually work the same way across tools.

@IgorMinar
Copy link

+1 to using standardized conditional exports, but that is just a part of the solution.

It would also be great to address the package payload size issue that @evanw brought up. We've struggled with this a bit and ended up using optionalDependencies as the workaround.

Ideally, we could codify/standardize this "non-js extension" part of the package system as well (and update clients not to waste bandwidth, and create noise in the terminal) by honoring the os value specified in package.json (or consider moving the os specifier info to be part of parent's package.json's optionalDependencies, or specified via a new package.json field like platformDependencies (and there are likely other solutions to consider).

The last missing bit is to standardize a wasm-specific os specifier in package.json, for example "os": "wasm".

This combo would ensure wide-applicability of the design (including for wasm environments), while solve the current hurdles that npm packages with native/non-js extensions face.

So to recap, an ideal solution would not only ensure that the (preferably static) module resolution occurs in a platform-aware way but that also the package installation is performed in platform-aware way to improve the efficiency and DX of the package delivery system.

@matthewp
Copy link

matthewp commented Aug 5, 2021

@devongovett Node.js wouldn't be involved in standardizing an export condition unless it planned on using it itself. I don't think this is one they would use. I think this would be a community standard, which like browser is just an informal agreement among tool makers.

@matthewp
Copy link

matthewp commented Aug 5, 2021

Actually I said that and then @bmeck said they might be interested: https://twitter.com/bradleymeck/status/1423375532503207942. Bradley, is nodejs/node the right place to propose a recognized condition?

@bmeck
Copy link

bmeck commented Aug 5, 2021

yes nodejs/node is the right place, just make an issue and/or PR. This seems like it wouldn't be too controversial except getting the right names for condition and how to enable the mode (flag vs config file vs API)

@d3lm
Copy link

d3lm commented Aug 9, 2021

I personally agree that using exports and imports can solve this in an elegant yet flexible way. Ideally there's a standard condition baked into Node.js that we can use for this which would also disable dlopen as @bmeck mentioned already on Twitter.

More specifically we could try to standardize a --no-addons or --no-native-addons (similar to --no-warnings or --no-force-async-hooks-checks) which disables process.dlopen and enables a no-addons or no-native-addons condition.

Open source maintainers could then provide a different entry point given the no-addons condition:

{
  "exports": {
    ".": {
      "no-addons": "./wasm-entry-point.js",
      "default": "./native-entry-point.js"
    }
  }
}

This would also address the performance concerns in a neat way because by default it uses the "native entry point" and only falls back to the WASM version if the condition is provided, either directly -C=no-addons or through a CLI option --no-addons which does a bit more than just providing a condition.

@d3lm
Copy link

d3lm commented Aug 9, 2021

It thought a bit more about this and wanted to propose another option which is a combination of an imports mapping and optionalDependencies.

If a package depends for example on esbuild then a maintainer could conditionally use the WASM versions with an imports mapping:

{
  "imports": {
    "#esbuild": {
      "no-addons": "esbuild-wasm",
      "default": "esbuild"
    }
  }
}

The library could then import #esbuild and it conditionally imports either the WASM or native package. This requires esbuild-wasm to be specified in optionalDependencies. The downside here is that the initial install gets a little slower because all optional dependencies will be installed.

@arcanis
Copy link

arcanis commented Aug 12, 2021

You might be interested in the following RFC for Yarn: yarnpkg/berry#2751. Each approach attempts to solve different goals, but some of them seem shared.

@padmaia
Copy link

padmaia commented Aug 12, 2021

I like the idea of this issue being solved at the package manager level to avoid downloading unnecessary binaries. I'm still digesting the RFC that @arcanis shared, but will comment there if I have any thoughts.

@d3lm
Copy link

d3lm commented Aug 20, 2021

I agree that if this is handled on the package manager level that'd be ideal so we don't have to download packages which won't be used. I think this with a condition makes up a good solution which should cover most use cases.

@Jamesernator
Copy link

I agree that if this is handled on the package manager level that'd be ideal so we don't have to download packages which won't be used. I think this with a condition makes up a good solution which should cover most use cases.

One option would be for Node to allow bare specifiers in "exports", at current you can only refer to files in a package but if you could refer to bare specifiers you could refer to other packages i.e.:

// package.json
{
    "name": "esbuild",
    "exports": {
        "node": {
            "no-addons": "esbuild-wasm"
        },
        "browser": "esbuild-wasm",
        "default": "esbuild-node-native"
    },
    "peerExportedDependencies": {
        "esbuild": "99.x",
        "esbuild-wasm": "99.x"
    }
}

This would require npm to learn about --conditions like Node has though and add some way to specify versions (which I've done as "peerExportedDependencies" in this example).

@Jamesernator
Copy link

Jamesernator commented Aug 26, 2021

An alternative that might be simpler for package managers and still ensures package-lock.json is consistent across environments would be to have "peer dependency groups" i.e. give a list of options, and I need at least one.

In this sort've scheme the onus of specifying the multiple versions is on direct consumers of esbuild not on esbuild itself. i.e. something like snowpack would add into their package.json something like this:

// snowpack/package.json
{
    "peerDependencies": {
        "esbuild": "99.x",
        "esbuild-wasm": "99.x"
    },
    "peerDependenciesMeta": {
        "esbuild": { "optional": true },
        "esbuild-wasm": { "optional": true }
    },
    "peerDependencyGroups": [
        // Declare that at least one of these is required, if none is given
        // then npm would just download the first by default
        { "packages": ["esbuild", "esbuild-wasm"], "required": true }
    ]
}

The one thing is in environments like StackBlitz someone using snowpack would need to add "esbuild-wasm" as a peer dependency, but that might be doable automatically by reading the package.json for certain packages.

EDIT: I've opened an npm rfc for this idea

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests