Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide reproducible build by default #399

Closed
mcollina opened this issue Feb 26, 2024 · 21 comments · Fixed by #413
Closed

Provide reproducible build by default #399

mcollina opened this issue Feb 26, 2024 · 21 comments · Fixed by #413

Comments

@mcollina
Copy link
Member

Corepack does not allow for reproducible builds by default. Consider a scenario where there is corepack enabled by default, i.e., the jump files are in place.

I run:

  1. npm init -y
  2. yarn

The result does not include a packageManager field in my package.json.

This makes the build non-reproducible for newbies as they do not know that they should run corepack use yarn@4.

This is mostly a problem in Dockerfiles, because most newbies completely ignore the corepack use step.

@styfle
Copy link
Member

styfle commented Feb 26, 2024

I'm in favor of this change 👍

However, I think it was originally suggested by node core to have package manager global autoupdate so that node could avoid handling CVEs for package managers.

Maybe @aduh95 or @arcanis or @mhdawson remembers the context.

@aduh95
Copy link
Contributor

aduh95 commented Feb 26, 2024

To be fair, running corepack use yarn@4 would not create reproducible builds either, because it would pick whichever is the latest version on the 4.x line from Yarn registry.

@styfle
Copy link
Member

styfle commented Feb 26, 2024

To be fair, running corepack use yarn@4 would not create reproducible builds either, because it would pick whichever is the latest version on the 4.x line from Yarn registry.

@aduh95 Thats true if you run the command in the Dockerfile but I think what @mcollina is pointing out is that the "newbie" should run that command locally and thus the package.json would be updated and make a reproducible build once COPY package.json . line is reached in the Dockerfile.

@mcollina
Copy link
Member Author

However, I think it was originally suggested by node core to have package manager global autoupdate so that node could avoid handling CVEs for package managers.

How does this relates? The problem with node is to have in-tree dependencies (that causes cve chores) as well as constant updates of the latest version of said package manager.

What I’m saying is that corepack should write down the packageManager field on first usage of a package manager loaded via corepack. This allows for future reproducible builds. (And possibly more chores for dependabot to keep this up to date).

@styfle
Copy link
Member

styfle commented Feb 26, 2024

How does this relates?

I thought you were referring to this feature, #364, since it would prevent "reproducibly"

@styfle
Copy link
Member

styfle commented Feb 26, 2024

What I’m saying is that corepack should write down the packageManager field on first usage of a package manager loaded via corepack.

What if there is no package.json in the directory?
For example, my first usage might be in /tmp when I run pnpm --version?

@mhdawson
Copy link
Member

However, I think it was originally suggested by node core to have package manager global autoupdate so that node could avoid handling CVEs for package managers.

The main concern is that Node.js should never, never, ever need an update for the end user to address a security vulnerability in one of the package managers installed by corepack. So having it install a locked version would not be acceptable, but having it record what it installed and re-using that would be ok assuming there was a way and it's well documented how the end user can force the version updatred to a newer version without having to update their Node.js version.

@aduh95
Copy link
Contributor

aduh95 commented Feb 27, 2024

Any chance you’d have an example of such Dockerfile that would be broken? To be sure we’re on the same page, as I’m not sure I understand what is the issue.

@arcanis
Copy link
Contributor

arcanis commented Feb 27, 2024

I'm a little unsure whether this is Corepack's concern or individual package managers ... I don't know how I feel with potentially mutative operations on the package.json when the user didn't ask for them. For instance, with this feature, just running yarn --version would add a "packageManager": "yarn@x.y.z" when the user didn't ask for it. By contrast, package managers would know what is an "install command" (which are typically expected to mutate the project files) and could add the packageManager field themselves if missing.

Additionally, there's a risk of breakage for people using multiple package managers together (for instance someone running yarn install then npm run), as Corepack would add the packageManager check and would then block the subsequent npm run calls. That's the reason why we went with Corepack being all-tolerant on projects which don't include a packageManager field - it feels the safest course of action to guarantee minimal breakage on the initial release.

Overall I'm not opposed to this, but I feel like it shouldn't be a blocker for the initial release. It comes with risks, and I'd feel safer to have those risks to be spread over a separate major. Not everything needs to be lumped together - once we have the tool in place we're free to make other changes as needed, following traditional Node.js minor/major considerations.

@mcollina
Copy link
Member Author

@aduh95 the example in the pnpm guide does not guarantee reproducible builds: https://pnpm.io/next/docker#example-1-build-a-bundle-in-a-docker-container.

When Corepack will be enabled by default, the default usage of Node.js will create non-reproducible builds. I think this is critical to fix before enabling by default.

FROM node:20-slim AS base
ENV PNPM_HOME="/pnpm"
ENV PATH="$PNPM_HOME:$PATH"
RUN corepack enable
COPY . /app
WORKDIR /app

FROM base AS prod-deps
RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm install --prod --frozen-lockfile

FROM base AS build
RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm install --frozen-lockfile
RUN pnpm run build

FROM base
COPY --from=prod-deps /app/node_modules /app/node_modules
COPY --from=build /app/dist /app/dist
EXPOSE 8000
CMD [ "pnpm", "start" ]

Overall I'm not opposed to this, but I feel like it shouldn't be a blocker for the initial release.

Agreed. Corepack is part of Node.js already. Marking it stable/enabling it by default is another matter, we are not talking about an initial release.

It comes with risks, and I'd feel safer to have those risks to be spread over a separate major. Not everything needs to be lumped together - once we have the tool in place we're free to make other changes as needed, following traditional Node.js minor/major considerations.

My 2 cents is that the risk of not reproducible builds by default is higher. If Corepack needs more time to figure it out, let's wait until it's ready.

@aduh95
Copy link
Contributor

aduh95 commented Feb 27, 2024

@mcollina the issue would manifest itself if pnpm were to release a new version between two execution of the Dockerfile, correct? Also, is my understanding correct that you are not suggesting that Corepack inside the Dockerfile to mutate the package.json, you’d like the packageManager field to be present before the container is started, right?

@mcollina
Copy link
Member Author

the issue would manifest itself if pnpm were to release a new version between two execution of the Dockerfile, correct?

Yes. Specifically, if the packageManager field is not set, it can even use different majors.

you’d like the packageManager field to be present before the container is started, right?

No. The packageManager field must be present before the image is built (but I guess this is what you meant).

@mcollina
Copy link
Member Author

By contrast, package managers would know what is an "install command" (which are typically expected to mutate the project files) and could add the packageManager field themselves if missing.

@arcanis a solution could be to have it as a requirement for a package manager to be handled by corepack. I'm not tied to a specific solution. I'm mostly concerned that a new major version of a package manager is not slipped into a docker image without me knowing, breaking my build.

@styfle
Copy link
Member

styfle commented Feb 27, 2024

I like the idea of Corepack tracking <pm> i or <pm> install command calls and then writing the packageManager field (if package.json exists) before invoking the package manager.

This would be very similar to how a lockfile is written the first time you install and then subsequent installs read the lockfile.

I do think we need to consider if there are cases where someone doesn't want this to happen. Probably the same people who don't like lockfiles will be the same people who don't want packageManger pinned in their package.json file.

@pi0
Copy link

pi0 commented Feb 27, 2024

I share this pain very much. Not for newbies but also for any automation that it is critical for it to use the right command.

Writing packageManager field on first use and making it essential really makes sense 💯 💯 💯


(off topic but only mentioning as I saw this tweet about stabilize requirements)

Probably Other than auto, is there any hope that corepack <install|run|x> become a thing to automatically run the configured command. We made nypm and ni to overcome but if corepack could come to aid all users that would be a miracle for whole ecosystem.

@merceyz
Copy link
Member

merceyz commented Feb 27, 2024

Corepack does not allow for reproducible builds by default

Isn't that what you requested in #93 @mcollina?

@mcollina
Copy link
Member Author

Corepack does not allow for reproducible builds by default

Isn't that what you requested in #93 @mcollina?

The two requirements are orthogonal.

@subesokun
Copy link

I'm still new to corepack but I would expect the following to happen if packageManager is not present in the package.json

  1. It will use the latest major version of the package manager
  2. Pins the major version in the packageManager property within the package.json
  3. Pins the exact version in the packageManager property within the lockfile

On the next install it should then use the version as defined in the lockfile. And tools like Dependabot shall update the package manager version in the lockfile according to the allowed version range in the package.json.

@GeoffreyBooth
Copy link
Member

3. Pins the exact version in the packageManager property within the lockfile

Corepack is writing data inside lockfiles? How does that even work with the various different lockfile formats out there?

Before we have Corepack adding packageManager fields we might want to wait and see how the new field design plays out, per #402 (comment) and openjs-foundation/package-metadata-interoperability-collab-space#15.

@subesokun
Copy link

Yes, probably the other package managers won't like it if somebody else is modifying the lockfile content or is adding any kind of additional metadata. If you delete the package manager lockfile, you might also not expect that you just deleted the pinned package manager version too.

Just wanted to say that, as a naive user, I'd expect that any version definition within packageManager behaves like dependencies in the package.json. And that if I use a semver version range, the package manager version will be pinned for me (if not pinned already) and that subsequent installs will use that version too. Where to store this lock information I can't tell, but maybe a dedicated lockfile might be an idea.

@arcanis
Copy link
Contributor

arcanis commented Feb 28, 2024

There's no need for a lockfile here, the packageManager field intentionally only supports exact versions, not ranges (you can however run corepack use yarn@^2 if you wish, and it'll lock the field ... in packageManager itself).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants