Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Draft] Proposal for Node.js Version Management #593

Closed
wesleytodd opened this issue Apr 3, 2024 · 2 comments
Closed

[Draft] Proposal for Node.js Version Management #593

wesleytodd opened this issue Apr 3, 2024 · 2 comments

Comments

@wesleytodd
Copy link
Member

The History

Version Management is something the Node.js project has discussed a few times. Starting in < INSERT HISTORY OF WG HERE >. And more recently with the introduction of corepack for managing versions of package managers. While there has been a volume of discussions and code examples I do not believe there has been a cohesive technical document to bring it all together. This is my attempt at such a doc.

Node.js and the toolchain it ships with is used throughout the SDLC (software development lifecycle) and has different needs at different stages of work. For example, it is highly
uncommon for someone to use a different versions of anything within the same production deployment while is is highly common to do so for local development or CI.

This has led to a the Node.js project focusing on the well understood (but also highly complicated) aspect of simply supporting one locked version of the toolchain across the many
supported architectures. This is in fact a required area of investment for the project as no matter the local dev toolchain a team or developer chooses, when in prod it is always
necessary to support a secure and stable single version workflow.

This focus has left it to the ecosystem to solve for the local and CI workflows where many versions are commonly needed for a variety of tools. And when the ecosystem is in charge
a wonderful variety of solutions emerge.

The Problem

A realistic and productive Node.js development environment cannot be setup with only tools shipped within the current executables. A version manager for both the node
executable and a package manager is required.

The Stakeholders

This proposal requires members of the community to work together. To help enable that, I propose we formalize the Package Maintenance Working Group (PMWG) to own this domain and be a
shared space for Node.js contributors, tooling authors, and interested community members to work. That would help ensure that all stakeholders in the ecosystem had a well
documented and clear idea of who to work with and how that process proceeds.

Once we have chartered the PMWG to take ownership of this domain we can expand on this proposal and other related document and policies around how the Node.js Project intends to
move forward with regards to Package Mangers, version managers, and associated tooling which are vendored or included in some from with the Node.js distribution.

Use Case Scope

We cannot build all things for all people. To help keep focus within this proposal we want to focus on solving for one specific use case:

Users installing Node.js on their local development machine or in CICD process for the purpose of building.

This means we are not intending for this tool to be used to install on production servers. That is not only an already well covered use case, it is also not a place where
dynamically installing many runtime or tool versions is a good or recommended practice.

Goals

  1. A new solution should attempt to reduce redundancy in both concepts and metadata in package.json were possible
  2. Reduce developer toil (as @GeoffreyBooth said: The most important part of this is that it defines the end goals: no lockfile mismatches, no failed builds.)

Prior Art

There is alot here, adding a place holder to remember but not get distracted.

@todo list out more about the current version managers and their approaches.

A Proposal

Node.js ships or endorses (either builds or adopts an existing project) a version manager with the following qualities:

  1. Manage versions of both node and a small subset of ecosystem tooling for local and CI environments
  2. The tool should enhance the user experience for all supported tools in an equitable way (note on this below)
  3. There is a clear and concise process for tools to be included (but as we are not trying to achieve equality, some tools are grandfathered in)
  4. The technical solution (and all others considered) are well documented and clear to stakeholders
  5. The technical solution should be reasonably approachable for our consumers without a lot of new education required

As there is already a lot of prior art in the ecosystem. I will venture to make a clear proposal for an end user experience we hope to achieve without picking an existing tool or
set of implementation details (hence the name nver as a placeholder):

$ curl -o- https://nodejs.org/dist/install.sh | bash
$ node --version # the latest lts version
$ npm --version # the npm version shipped with node
$ mkdir project && cd project
$ npm init -y
$ nver install npm@latest node@20.x
$ node --version # the local node version 20.x
$ npm --version # the local npm version newer than what was shipped with node at 20.x

While this is the "bootstrap from scratch UX" there are many other important workflows which led to the development of corepack which we should also well support. One such is
cloning down an existing project and ensuring that you have a properly setup development environment for working with it. Here is an example for that UX:

$ git clone git@github.com:nodejs/project.git && cd project
$ npm install
This project requires packages to be installed with pnpm@8.15.4 and node@^20.0.0.
We can help you set that up by answering yes to the prompt below, or after selecting
no you can run:

$ nver install

Continue? (yes)
...
$ pnpm install

You may notice in the above that the npm executable noticed that the project was not meant for use with npm and displayed a message to users. This proposal would include some
collaboration by our esteemed colleagues at the various supported packages managers to implement a common stub which would be maintained by the Node.js project which implements a
standard check. The inclusion of this stub would be a requirement for being included in the tooling. To achieve this we have been discussing a proposal in the Package Metadata Interop Collab
Space to add a devEngines field. More on that below.

And lastly, most developers work on more than one project at a time. It is important for their productivity that it is easy to switch between projects and their associated tooling.
Here is an example of working on multiple projects at once:

$ cd project1
$ npm it # install and test (assuming project is already configured or using defaults)
$ node --version
v20.11.1
$ cd ../project2
$ npm it
This project requires packages to be installed with yarn@1.22.21 and node@^18.0.0.
We can help you set that up by answering yes to the prompt below, or after selecting
no you can run:

$ nver install

Continue? (yes)
...
$ yarn install && yarn test
$ node --version
v18.19.1
$ cd ../project1
$ npm test # seamless switch back to node version

Declaring development engines

Currently the only widely used way to declare development tooling is the packageManager field. This only covers one aspect of the problem (the package manager) and is not
designed in a way to allow for a more holistic approach more similar to the pre-existing engines field. In reality many projects have more broad requirements for development
tooling which they document in their project readme files or docs. Additionally, all projects which have native dependencies require an implicit version of python based on their
node-gyp version. These things are not covered under the current implementation of packageManager and they typically do not cause as much user issues when discrepancies arise
because they are less common and not "the first thing people run" but they are equally important to get correct.

To this point, I propose we work toward a slimmed down version of the proposed devEngines field which covers explicitly declaring these required development tools. I say slimmed
down simply to differentiate it from the version currently documented in the issue
(as of April 1 2024). The reason I propose slimming it down is to prevent the initial specification from defining ways tools should fetch these dependencies (aka dropping the
download section). Here is what I think we should adopt for a first pass:

// package.json
{
  "devEngines": {
    // os: '', as specified in original proposal
    // cpu: '', as specified in original proposal
    // runtime: '', as specified in original proposal
    packageManager: {
      name: 'npm', // names which we would specify and map to individual known projects
      version: '', // this would be semantically left open from the spec, but should be considered the equivalent of npm package specifiers
      // see: https://github.com/zkat/ssri/blob/latest/README.md#--ssriparsesri-opts%E2%80%94integrity 
      resolved: {
        version: '1.0.0',
        uri: 'https://registry.url/path/to/tarball.tgz',
        // this would be the output of ssri
        digest: '...'
      }
    }
  }
}

With this slimmed down version, we can test out approaches with different tooling implementations for how we resolve, fetch, and install the dependency without forcing the hand of
tool authors into one single direction. If they wanted to limit that version cannot be a git repository, that would be acceptable even though npm defines that as valid for a
package specifier.

Technical Approachs

There are three primary approaches which version managers can (and have, see examples) take to solving this problem:

  1. PATH manipulation
  2. In-process indirection
  3. Jumper Binaries

We will describe each in the section below and attempt to well document the pros and cons of each approach.

PATH manipulation

This approach is the most commonly currently deployed approach to binary version management. It is how nvm and other popular tools work.

Pros:

  • Separation of concerns: keeps the version detection and indirection out of the main executable code

Cons:

  • Support for the wide range of OSs and shells is a challenge

In-process indirection

This approach uses a stub at the start of the program to load in the concrete implementation of the executable. This approach is the current corepack
design
and also has prior art in coffeescript (see example).

Pros:

  • Separation of concerns: keeps the version detection and indirection out of the main executable code

Cons:

  • It is slower than path manipulation on the most common path but still faster than jumper binaries
  • Being in the main executable means even in situations where there is no need (production) the code is still run

Links

Jumper Binaries

Jumper binaries are specially crafted executable entry points which find the correct binary to run as a child process. This was the original corepack design.

Pros:

  • Separation of concerns: keeps the version detection and indirection out of the main executable

Cons:

  • It is slow (comparatively)
  • Breaks tooling and user expectations of where to find the binary/source being run

Considerations Needing to be Addressed

  1. Node Binary install locations
  2. Bootstrap phase cannot require node, needs to be cross platform
  3. How do package managers do global installs? How do they perform npm link across node versions, etc.
  4. Similar to the above, how does it handle installing npm/package managers across node verisons.
  5. Opinions on installing versions. We should not have any hard ones (aka install any node version they ask for, even insecure ones with a flag)

NOTES:

equal vs equitable: Equality means each individual or group of people is given the same resources or opportunities. Equity recognizes that each person has different circumstances and allocates the exact resources and opportunities needed to reach an equal outcome.

@darcy: I need to fit this all in here and I am not sure how and it was easier to start just copying it wholesale to start.

Notably, I have not been in favor of: "jumper" binaries, net-new fields (like packageManager OR even devEngines - although I'm still participating with those discussions) or even the idea of a new path manip tool to manage node (as npx already does this). All of these are misguided imo (ie. why I have historically & continue to pushback on distributing corepack & don't think a new "node version manager" which uses similar concepts would be any better).
Things everyone should understand &/or align on:
There is value codifying your project's package manager & node version (this is widely agreed on)

  1. Developers have put that information in engines historically (this is debated by a vocal minority & I do not agree with them - here's 1.3M examples)
  2. Package managers have supported checking the package manager version & node version from engines since it was first introduced by npm more then a decade ago (this is a fact)
  3. If we all agreed on the above then we would likely avoid the pursuit of adding new fields. I have & continued to debate these points with individuals even though they are basically all facts (imo). Could the state of prior art improve? 100%. If anything is unclear about the status quo, that will not be improved by creating a new place to store the same information.
    With that in mind, lets distill the core problem Corepack aims to resolve (relevant I swear):

Developers have many projects which use many different package managers & versions on their machines & run into problems when they use the wrong package manager or version for their projects.
Lets distill the core problem a new "node version manager" would aim to resolve:
Developers have many projects which use many different node.js versions on their machines & run into problems when they use the wrong node.js version for their projects.
Kind-of specific, so lets take a step back because we know node.js isn't the only "runtime". What would a "runtime manager" aim to resolve:
Developers have many projects which use many different runtimes & versions on their machines & run into problems when they use the wrong runtime or version for their projects.
Feeling like there's some similarities here...
Let's distill the core problem npm aims to resolve:
Developers have many projects which use many different dependencies & versions on their machines & run into problems when they use the wrong dependencies or version for their projects.
Conjecture(s):

  1. A runtime is a dependency
  2. A package manager is a dependency
  3. Developers want to manage their dependencies the same way
    Question(s):
  4. How do we add value creating a tool to manage a subset of developer's dependencies? (ie. does that not compete with conjecture Schedule First Meeting #3)
  5. How is this different then the existing, default, dependency manager? (ie. npm)

@styfle:

  1. User defines expected package manager and version in a canonical location such as package.json
  2. Any user/CI/CD/etc that runs the package manager automatically installs the selected version prior to running install
  3. When cloning a repo, you should never see a mismatch lockfile because the same version was used each time
  4. When running in CI/CD, regardless of platform, you should never have a failed build due to mismatch package manager
@GeoffreyBooth
Copy link
Member

(as @GeoffreyBooth said: The most important part of this is that it defines the end goals: no lockfile mismatches, no failed builds.)

To be clear, I don’t think it should be Node’s responsibility to fix bugs in package managers. If pnpm changing lockfile schema in patch versions annoys pnpm users, that should be on pnpm to fix. Reproducible builds are ultimately the responsibility of a package manager: we can pin a package manager version, for example, but if the package manager itself doesn’t support lockfiles or doesn’t use one or the user doesn’t provide one, then there won’t be a reproducible build. npm install updates the lockfile, for example; it’s not a way to get a reproducible build. But we’re not trying to force that to change somehow; it’s not our job to control how package managers work, or fix the issues that users may have with them.

That isn’t to say that we can’t or shouldn’t ship a version manager for package managers, but the reason to do so is for the more holistic reason of giving the user control over their project’s dependencies, like they already have for the software defined in package.json dependencies and devDependencies. And maybe that control is to avoid bugs in particular versions of those software, and to achieve reproducible builds if the package manager is used correctly, but that’s more the user’s concern than a goal of ours. We would need to ship and control our own package manager if we wanted to ensure that the goals of well-behaved lockfiles and reproducible builds are met; but unless we want to build and ship our own package manager or limit which package managers and versions can be chosen, I think all we can really do is provide the tools for the user to achieve these goals if they choose a package manager capable of achieving them and use that tool correctly.

@wesleytodd
Copy link
Member Author

Closing in favor of this as a PR in #594. Sorry for the last minute thing, I just didn't get a chance until now to clone down the repo on this laptop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants