Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[meta] ship only production files #324

Closed
wants to merge 1 commit into from
Closed

Conversation

osher
Copy link

@osher osher commented May 5, 2024

this will give you ~90% reduction of your on-disk node_modules footprint (from 644K to 65K), and your package size from 35K to 9.3K.

Good work with the test, bud, but don't ship them...

Here's your current status:

 $ du -h node_modules/resolve
4.0K    node_modules/resolve/test/pathfilter/deep_ref
8.0K    node_modules/resolve/test/pathfilter
8.0K    node_modules/resolve/test/dotdot/abc
16.0K   node_modules/resolve/test/dotdot
12.0K   node_modules/resolve/test/precedence/aaa
8.0K    node_modules/resolve/test/precedence/bbb
32.0K   node_modules/resolve/test/precedence
12.0K   node_modules/resolve/test/resolver/baz
16.0K   node_modules/resolve/test/resolver/nested_symlinks/mylib
20.0K   node_modules/resolve/test/resolver/nested_symlinks
8.0K    node_modules/resolve/test/resolver/multirepo/packages/package-b
12.0K   node_modules/resolve/test/resolver/multirepo/packages/package-a
24.0K   node_modules/resolve/test/resolver/multirepo/packages
36.0K   node_modules/resolve/test/resolver/multirepo
12.0K   node_modules/resolve/test/resolver/dot_main
8.0K    node_modules/resolve/test/resolver/without_basedir
12.0K   node_modules/resolve/test/resolver/symlinked/package
4.0K    node_modules/resolve/test/resolver/symlinked/_/symlink_target
4.0K    node_modules/resolve/test/resolver/symlinked/_/node_modules
12.0K   node_modules/resolve/test/resolver/symlinked/_
28.0K   node_modules/resolve/test/resolver/symlinked
12.0K   node_modules/resolve/test/resolver/incorrect_main
8.0K    node_modules/resolve/test/resolver/false_main
8.0K    node_modules/resolve/test/resolver/browser_field
4.0K    node_modules/resolve/test/resolver/other_path/lib
8.0K    node_modules/resolve/test/resolver/other_path
8.0K    node_modules/resolve/test/resolver/same_names/foo
16.0K   node_modules/resolve/test/resolver/same_names
8.0K    node_modules/resolve/test/resolver/invalid_main
8.0K    node_modules/resolve/test/resolver/quux/foo
12.0K   node_modules/resolve/test/resolver/quux
12.0K   node_modules/resolve/test/resolver/dot_slash_main
212.0K  node_modules/resolve/test/resolver
8.0K    node_modules/resolve/test/module_dir/ymodules/aaa
12.0K   node_modules/resolve/test/module_dir/ymodules
12.0K   node_modules/resolve/test/module_dir/zmodules/bbb
16.0K   node_modules/resolve/test/module_dir/zmodules
8.0K    node_modules/resolve/test/module_dir/xmodules/aaa
12.0K   node_modules/resolve/test/module_dir/xmodules
44.0K   node_modules/resolve/test/module_dir
4.0K    node_modules/resolve/test/shadowed_core/node_modules/util
8.0K    node_modules/resolve/test/shadowed_core/node_modules
12.0K   node_modules/resolve/test/shadowed_core
8.0K    node_modules/resolve/test/node_path/x/ccc
8.0K    node_modules/resolve/test/node_path/x/aaa
20.0K   node_modules/resolve/test/node_path/x
8.0K    node_modules/resolve/test/node_path/y/ccc
8.0K    node_modules/resolve/test/node_path/y/bbb
20.0K   node_modules/resolve/test/node_path/y
44.0K   node_modules/resolve/test/node_path
512.0K  node_modules/resolve/test
8.0K    node_modules/resolve/.github
56.0K   node_modules/resolve/lib
8.0K    node_modules/resolve/bin
12.0K   node_modules/resolve/example
644.0K  node_modules/resolve

if you ship only index.js, lib and bin....

here's the impact on the package size:

$ ls -lah *.tgz
-rw-r--r-- 1 osher osher  25K May  5 20:05 resolve-2.0.0-next.5.before.tgz
-rw-r--r-- 1 osher osher 9.3K May  5 20:07 resolve-2.0.0-next.5.tgz

this will give you ~90% reduction of your  on-disk node_modules footprint (from 644K to 65K).

Good work with the test, bud, but don't ship them...
Copy link

New and removed dependencies detected. Learn more about Socket for GitHub ↗︎

Package New capabilities Transitives Size Publisher

🚮 Removed packages: npm/in-publish@2.0.1, npm/mv@2.1.1, npm/npmignore@0.3.1, npm/safe-publish-latest@2.0.0, npm/tap@0.4.13, npm/tape@5.7.5, npm/tmp@0.0.31

View full report↗︎

Copy link
Member

@ljharb ljharb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a) the "files" field is dangerous and should never be used; this is what npmignore is for
b) the tests are quite intentionally shipped and will forever be.

@ljharb
Copy link
Member

ljharb commented May 6, 2024

Duplicate of #308. Duplicate of #296. Duplicate of #281. Duplicate of #279. Duplicate of #239. Duplicate of #235. Duplicate of #58. Duplicate of #44. See #105 (comment).

@ljharb ljharb closed this May 6, 2024
@osher
Copy link
Author

osher commented May 7, 2024

@ljharb
Can you please explain why should a nodejs user of a package have its tests in production node_modules?

Mind that vanilla-js non-bundling users are a great part of the users.
Mind that also people who bundle often ship the node_modules of install --production

P.S:
Sir, I would also love to learn why you believe that files section is dangerous.
I did see what you wrote here: ljharb/qs#504, but I still don't fully see to your purpose...
main is a single file, which could require/import more... what am I missing?

Thanks in advance
:)

@osher osher deleted the patch-1 branch May 7, 2024 13:25
@ljharb
Copy link
Member

ljharb commented May 7, 2024

Because npm explore foo && npm install && npm test should always work.

files is dangerous because if you forget to include a file, the consequence is that users are broken; the consequence of forgetting a file in npmignore is that the package is a tiny bit larger.

@osher
Copy link
Author

osher commented May 7, 2024

Thanks for your answer - I learnt something new - npm explore (and I've been using nodejs since 0.6.0...)

I'm happy to meet a real person!

Would you mind to help me understand the use-case where a package consumer would want to run the packages tests from their production node_modules? donno, troubleshooting compatibilty? and how many really get to need to do that?

I'll tell you how I got here, LMK what you think.

I'm on a personal quest to reduce the size of node_modules - at least in production distributions.
(but I'm looking for a simpler way than getting everybody on a bundler... 😛)
The node_modules heavier than black-hole was funny for a start, but somebody has to do something about it.
I came to think it's the ocean plastic patch of the node community...

IMHO there quite a few different reasons. I just started my research, meeting the people and understand them.
Some cases it's neglect, some cases it's plastic-wrap culture, and in rare cases - I get to meet opinons - so yea - nice to meet! :)

Anyway, IMHO - Big footprint is not only disk, it's also network. And CPU. Not only on developer local -machines,
but it repeats itself over and over in recurring CI docker-imageas, npm-caches, and more. It's a rabbit-hole, really...

You see, in the end of the day it's money and CO2.

. . .

I got to resolve because it appears to land quite a few times in a backstage out-of-the-box project (that's how it all started... I saw that upgrading backstage from the old architecture to the new doubles it's node_modules size, and I started to peek in...). I got to your other project and saw the fingerprints, did not connect it's the same person, thoughts it's two people using the same toolchian... :)

Thanks for your answers, I'll appreciate having your honest opinion.
If you have already written posts about it - link me, I'll definately read :)

Thanks again.

@ljharb
Copy link
Member

ljharb commented May 8, 2024

The use case is one I have all the time - I'm offline, and I want to run the tests on an installed dependency, to determine if it's behaving correctly in my node version.

I'm on a personal quest to reduce the size of node_modules

I think this is not a valuable effort; disk space is effectively infinite and free. Additionally, the party who could most impact this (or CPU, or bandwith) is npm itself, and if they don't care about solving this problem, nobody else is going to be able to make a dent in it.

at least in production distributions.

For browsers, the solution is indeed to just use a bundler. For servers, while bundling often causes faster startup time and is worth exploring, npm prune --production is a first step, and using code coverage data to remove unused files seems like a perfectly reasonable additional step - this would clean up all the markdown files that always ship with every package but never are used in production, for example, and most importantly, it wouldn't require any effort whatsoever on the part of unpaid OSS maintainers.

node_modules should be largely treated as a black box, and if you're focusing on dependency count or disk space, you're wasting effort that would be better spent on more important problems for your business.

@osher
Copy link
Author

osher commented May 9, 2024

Jordan,

Thank you for your respectful answer, I was afraid for a moment, but had to do ask. You gain my respect 👍

npm prune --production is a first step

Well, ths startpoint of the discussion is a --production installation - which as far as I understand it - is already pruned :)
Please do correct me if I'm wrong - it will have implications on my CIs and the CI practices I recommend my clients...

I want to run the tests on an installed dependency, to determine if it's behaving correctly in my node version.

Yes!! I get to do that. I definitely been there, although not as often as you say, and definitely even less when I'm off-line.

I found that when using cd node_modules/<suspect-pkg> && npm i && npm test the npm i places shadowing dependencies in the node_modules of the consumer project, so the tests will pass like they did in CI without any help to detect the problem. I'll have to npm ls <suspect-mutual-dup> and fiddle with the resulting node_modules of suspect-pkg... 🙄

(At least, that's how it used to be before workspaces feature hit the market, maybe my old opinions need updating.
Here too - I'd love to hear if I'm wrong).

🤔 woups. when I say " like they did in CI" - I am assume the CI tests for all the declared supported node versions.
🤨 I do see the point here - there's lot of CPU in this test matrix. What's cheaper, pre-relesae testing or post-release troubleshooting.... clearly it gets to moral philosophy, and there are many ways to tackle a problem...

🤔 I'm still on the pre-release test side, but I'm keeping an open mind...

and if they don't care about solving this problem, nobody else is going to be able to make a dent in it

I agree that npm has a significant role in that.
But I'm afraid that I also agree in what you point - solving that elegantly would require changes in the npm client<->server protocol, or changes in the pack.tgz<->npm-client protocol.

The thing is, my personal development roadmap in past years took me through the ops perspective - so I'm seeing things from that perspective as well as developers. I have become aware of the bill, which made me aware of the waste. And since cloud bill is also a proxy for CO2 footprint - that made me ashamed 😞.

True, kick-starts would rather throw money at the problem and move fast. But I would love to believe that after this many years of node we'd have better practices to offer...

You know what, let me show you the worst case I met in my showcase of the node_modules of backstage, LMK what you think of it:
This package gets a 97% reduction if it did not include tests:
vadimg/js_bintrees#38

Do we really need this plastic wrap/plastic straw?

make a dent in it

Well, I have already made a dent, although it's hard to show with openly available data because I was working mainly with proprietary ecosystems. Thus, I acknowledged that I turned my efforts in the wrong direction.
I now turned my effort to the open source projects after a long neglect, and there's so much to do...

Jordan, clearly, I have a lot to gain from understanding your perspective better.
Where can I buy you a drink? I suspect we're communicating across a big time-zone gap... how mobile are you? Are you traveling to international conventions?

If you'd like to understand my perspective better - here are two IMHO fun reads:

Can you think of a better medium we can continue, or you like it just fine here?

Osher

P.S: - since you're interested in tests, consider that you might have created a set of base-assumptions that let you ship an untested package, instead of actually testing it.

Let me explain:

#file: ./automation/pack-test.sh
#!/bin/sh
set -e
set -- $(jq -r '.name + " " + .version' package.json)
rm -f $1-$2.tgz tmp
npm pack
$(mkdir -p tmp/node_modules && cd tmp && npm init -y && npm i -y && npm i $1-$2.tgz);

With a little more effort it can be done in a cross-platform manner, but I trust you get the spirit ;)

‼️ BTW, this will not cover packages that load modules dynamically. That requires a different treatment...

Obviously, this can be included it in the JS test suite, duping load from CI machines to developer machines.

const { exec } = require('node:child_process');

describe('project package', () => {
  it('should result with a working package', (done) {
     exec('./automation/pack-test.sh', done);
  });
});

woups. ah - I work in the assumption that only CI machines can publish... 🤔 which may not be true for kickstart teams... 🤔

🤔 One more thought:

Do you believe that npm test should run ALL tests, including e2e or is it OK for your perspective to have npm test that should run everywhere and npm test:e2e that does not have to run everywhere?

@ljharb
Copy link
Member

ljharb commented May 9, 2024

Publishing almost always happens on a local machine; it's not currently practical to publish from CI with 2FA (automation tokens are explicitly one factor).

All of my projects, this one included, run tests on the npm pack output in CI, so there's no coverage gap there.

Regarding your linked articles, the first one only applies if you have compiled dependencies; I do not. The second talks about lockfiles, which I very intentionally never use, avoiding the problems mentioned.

I'm happy to keep discussing in another medium, especially since most of this is off topic - find me on Slack, IRC, Discord, etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants