Only symlink the folder instead of hardlinking every single file #3109

Open
aminya opened this issue Jan 28, 2021 · 14 comments

aminya commented Jan 28, 2021

I have noticed that pnpm hardlinks every file individually. This results in lower performance when the package is required, because the realpath function has to be called for every single file.

I am proposing to only symlink the top-level folder to fix this issue.

pnpm version: all

Expected/Actual behavior:

These should be real files, not hardlinks. Only the top-level folder should be symlinked:
image

Additional information:

  • node -v prints: 15
  • Windows, macOS, or Linux?: win 10

An example of the lower performance is Parcel builds: using yarn results in ~3 s faster build times.

Build time:

pnpm      yarn
10.5 s    7.4 s

zkochan (Member) commented Jan 29, 2021

I don't understand this. You probably mean symlinks, not hard links. It is not possible to create a hard link of a directory. And pnpm only creates symlinks for directories, so I don't understand how you got symlinked files as on the screenshot.

aminya changed the title from "Only hardlink the folder instead of hardlinking every single file" to "Only symlink the folder instead of hardlinking every single file" on Jan 29, 2021

aminya (Author) commented Jan 29, 2021

Yes. I was loose with my wording. I am proposing to symlink the top-level folder to the actual files. Currently, the folders are symlinked and inside the folder, there are hardlinks again.

For example, entering this symlinked folder
image

Gives the hardlinked files
image

Instead of the real files.

zkochan (Member) commented Jan 29, 2021

But hard links are real files. They are just pointing to the same location on the disk. I don't understand why your file explorer marks those files with some arrows. To any program, a hard link is just a file. A regular file.
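
The point that a hard link is just a regular file can be checked with a small Node.js sketch. The file names below are made up for illustration; the script only uses the standard fs, os and path modules and assumes a filesystem that supports hard links.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Scratch directory; realpathSync normalizes a temp dir that is itself
// reached through a symlink (as on some systems).
const dir = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'hardlink-demo-')));
const original = path.join(dir, 'original.js'); // hypothetical "store" copy
const linked = path.join(dir, 'linked.js');     // hypothetical project copy

fs.writeFileSync(original, 'module.exports = 1;\n');
fs.linkSync(original, linked); // create a hard link to the same data

const a = fs.lstatSync(original);
const b = fs.lstatSync(linked);

const hardLinkInfo = {
  isRegularFile: b.isFile(),         // a hard link is reported as a plain file
  isSymlink: b.isSymbolicLink(),     // and not as a symbolic link
  sameInode: a.ino === b.ino,        // both names refer to the same data on disk
  linkCount: b.nlink,                // number of names pointing at that data
  realpathIsItself: fs.realpathSync(linked) === linked, // nothing to "resolve"
};
console.log(hardLinkInfo);
```

Neither name is "the original" once the link exists; both are equally real entries for the same file.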

aminya (Author) commented Jan 29, 2021

Maybe I am not explaining myself correctly. My point is that there should be only "one indirection", and that is the folder symlink. Currently, even inside the symlinked folder there are hardlinks again.

When I symlink a folder, this is how it looks:
image

Inside the symlinked folder:
image

Inside the real folder:
image

zkochan (Member) commented Jan 29, 2021

Here's a symlinked directory created by pnpm on my Windows machine, looks the same:

image

I don't understand how you got something that looks like this:
image

zkochan (Member) commented Jan 29, 2021

Symlink above, real location below:

image

aminya (Author) commented Jan 29, 2021

I just use pnpm i.

Here is an example:

image

package.json inside the folder is hardlinked:

image

zkochan (Member) commented Jan 29, 2021

ok, I see, your explorer UI marks hardlinks. OK. This is correct.

But this is not an issue because there is "one indirection". Node.js will resolve the symlinked directory to its real location and will resolve the file. Node.js is not looking for other locations of a hard link.
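
A minimal sketch of that "one indirection", assuming a simplified layout (the directory names are hypothetical and much flatter than what pnpm actually creates): a directory symlink points at a real directory whose file is a hard link to a store copy. Resolving the path follows the directory symlink and then stops at the hard-linked file.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const root = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'one-indirection-')));

// Hypothetical stand-ins: "store" for the content store, "real/pkg" for the
// real package directory, "link-to-pkg" for the symlink a project sees.
const storeFile = path.join(root, 'store', 'index.js');
const realDir = path.join(root, 'real', 'pkg');
const linkDir = path.join(root, 'link-to-pkg');

fs.mkdirSync(path.dirname(storeFile), { recursive: true });
fs.mkdirSync(realDir, { recursive: true });
fs.writeFileSync(storeFile, 'module.exports = "pkg";\n');
fs.linkSync(storeFile, path.join(realDir, 'index.js')); // file-level hard link
// 'junction' only matters on Windows; other platforms ignore the type.
fs.symlinkSync(realDir, linkDir, 'junction');

const resolved = fs.realpathSync(path.join(linkDir, 'index.js'));
const resolvedToRealDir = resolved === path.join(realDir, 'index.js');
const resolvedToStore = resolved === storeFile;

console.log({ resolvedToRealDir, resolvedToStore });
```

The resolved path ends in the real package directory, not in the store, because a hard link carries no pointer to its other names.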

aminya (Author) commented Jan 29, 2021

Not sure about Node's fs.realpath.native, but at least the function that Parcel uses for resolving has overhead.

@mischnic might know better.


BTW, to make your explorer look like mine, install the Link Shell Extension:
https://schinagl.priv.at/nt/hardlinkshellext/hardlinkshellext.html#download

aminya (Author) commented Apr 13, 2021

Since pnpm has bumped its major version, I ran another benchmark of Parcel builds for solid-simple-table. A yarn-bootstrapped repository builds ~4 s faster than a pnpm-bootstrapped one.

Build time:

pnpm      yarn
13.6 s    9.18 s

Maybe this is solely a Parcel issue, as they don't seem to use fs.realpath.native, but maybe it is a pnpm issue.

@DanielRios549

I think this should be the default behavior. The folder node_modules/.pnpm should not exist, or if it does, all the folders inside it should be symlinks to the original folders in the global store at ~/.pnpm-store. This global store is where all the files should be stored.


jedwards1211 commented Jun 14, 2021

@aminya @DanielRios549 symlinking only the folders will not work, it would cause problems with deduping module instances.

For example, in a React project, all components need to use the same instance of the react package, otherwise you'll get errors.

Let's say you have two projects:

foo

  "react": "16.8.0",
  "my-react-comp": "1.0.0",

bar

  "react": "17.0.0",
  "my-react-comp": "1.0.0"

Inside my-react-comp, and inside both projects' own source code, there are modules that require('react').

With the behavior you want, foo/node_modules/my-react-comp and bar/node_modules/my-react-comp would have to be symlinks to the same directory within ~/.pnpm-store, hence they would both be using the same version of react. Assume that version is 16.8.0. Then the foo project would work fine, but in the bar project, my-react-comp would be loading react 16.8.0 -- different from the project's own version 17.0.0 that it uses elsewhere. bar's UI would fail to work at runtime.

The only way to avoid this is for pnpm to hardlink the files for my-react-comp to the store rather than the entire directory, so that the files can be shared between both projects, but my-react-comp can require a different version of react in foo than it does in bar.
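
The foo/bar scenario above can be reproduced with a rough Node.js sketch. Everything here is a stand-in: the "store" is just a temp folder, "react" is a one-line fake module, and the layout is far simpler than pnpm's real node_modules/.pnpm structure. It only illustrates how the location of a file decides which react it finds.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const root = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'dedupe-demo-')));

// One shared copy of my-react-comp's code in a stand-in "store".
const storeDir = path.join(root, 'store', 'my-react-comp');
fs.mkdirSync(storeDir, { recursive: true });
fs.writeFileSync(
  path.join(storeDir, 'index.js'),
  "module.exports = require('react').version;\n"
);

// Build a simplified project: its own fake react plus a real my-react-comp
// directory whose index.js is a hard link to the store copy.
function makeProject(name, reactVersion) {
  const modules = path.join(root, name, 'node_modules');
  fs.mkdirSync(path.join(modules, 'react'), { recursive: true });
  fs.mkdirSync(path.join(modules, 'my-react-comp'), { recursive: true });
  fs.writeFileSync(
    path.join(modules, 'react', 'index.js'),
    `exports.version = '${reactVersion}';\n`
  );
  fs.linkSync(
    path.join(storeDir, 'index.js'),
    path.join(modules, 'my-react-comp', 'index.js')
  );
  return path.join(modules, 'my-react-comp');
}

const fooComp = makeProject('foo', '16.8.0');
const barComp = makeProject('bar', '17.0.0');

// Same bytes on disk, but each hard link lives next to a different react.
const reactSeenInFoo = require(fooComp);
const reactSeenInBar = require(barComp);
console.log(reactSeenInFoo, reactSeenInBar); // 16.8.0 17.0.0

// With a folder symlink instead, the real path is the single store directory,
// so every project would look up react from that one place.
const folderLink = path.join(root, 'linked-my-react-comp');
fs.symlinkSync(storeDir, folderLink, 'junction'); // type is ignored outside Windows
const symlinkPointsAtStore = fs.realpathSync(folderLink) === storeDir;
console.log(symlinkPointsAtStore);
```

The hard-linked copies share their content with the store, yet each resolves react relative to its own project, which is the property a plain folder symlink to the store would lose.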

@Clashsoft

The current behaviour is actually problematic with Angular, in particular ngcc, which writes new files into the packages. These new files don't end up in the global store and thus need to be generated anew all the time, especially on CI.

@everflux

Avoiding symlinks would allow using pnpm in a container-based build with a global cache mounted into the build container.
Symlinks can cross devices while hard links cannot.
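
For reference, the cross-device limitation shows up in Node.js as an EXDEV error from fs.linkSync. The helper below is a hypothetical sketch, not pnpm's API or its actual strategy: it tries a hard link first and falls back to a plain copy when source and destination sit on different devices.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: hard link when possible, copy otherwise.
function linkOrCopy(src, dest) {
  try {
    fs.linkSync(src, dest);
    return 'hardlink';
  } catch (err) {
    // EXDEV means src and dest are on different devices or mounts,
    // where a hard link cannot be created. Anything else is rethrown.
    if (err.code !== 'EXDEV') throw err;
    fs.copyFileSync(src, dest);
    return 'copy';
  }
}

// Usage on a single device, where the hard link is expected to succeed.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'exdev-demo-'));
const src = path.join(dir, 'src.txt');
const dest = path.join(dir, 'dest.txt');
fs.writeFileSync(src, 'cached package file\n');

const method = linkOrCopy(src, dest);
const destContent = fs.readFileSync(dest, 'utf8');
console.log(method);
```

With a cache mounted from another device into a container, the same call would take the copy branch, which works but gives up the disk-space sharing that hard links provide.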
