Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

citgm-smoker fails on test-ibm-rhel7-s390x-2 with "not found: npm" with npm 7.4.2 #2525

Closed
richardlau opened this issue Jan 20, 2021 · 13 comments · Fixed by npm/arborist#217
Closed

Comments

@richardlau
Copy link
Member

Since the npm 7.4.2 update landed in Node.js (nodejs/node#36953) the citgm-smoker job has been failing on test-ibm-rhel7-s390x-2 with error:

error: failure             | not found: npm

The citgm-smoker job:

  1. Builds Node.js
  2. Installs the built Node.js into a clean (i.e. removed beforehand) smoker directory
  3. Runs npm install --prefix=/home/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker --global --loglevel=error citgm to install citgm
  4. Runs citgm-all -J

Further investigation shows that although the symlinks for npm are in smoker/bin the targets are not there:

[iojs@test-ibm-rhel7-s390x-2 smoker]$ ls -al
total 40
drwxr-xr-x 6 iojs iojs  4096 Jan 19 09:52 .
drwxr-xr-x 5 iojs iojs  4096 Jan 19 09:52 ..
drwxr-xr-x 2 iojs iojs  4096 Jan 19 09:52 bin
drwxr-xr-x 3 iojs iojs  4096 Jan 19 09:52 include
drwxr-xr-x 3 iojs iojs  4096 Jan 19 09:52 lib
-rw-r--r-- 1 iojs iojs 14690 Jan 19 09:52 report.xml
drwxr-xr-x 5 iojs iojs  4096 Jan 19 09:52 share
[iojs@test-ibm-rhel7-s390x-2 smoker]$ ls -al bin/
total 76664
drwxr-xr-x 2 iojs iojs     4096 Jan 19 09:52 .
drwxr-xr-x 6 iojs iojs     4096 Jan 19 09:52 ..
lrwxrwxrwx 1 iojs iojs       38 Jan 19 09:52 citgm -> ../lib/node_modules/citgm/bin/citgm.js
lrwxrwxrwx 1 iojs iojs       42 Jan 19 09:52 citgm-all -> ../lib/node_modules/citgm/bin/citgm-all.js
-rwxr-xr-x 1 iojs iojs 78492160 Jan 19 09:52 node
lrwxrwxrwx 1 iojs iojs       38 Jan 19 09:52 npm -> ../lib/node_modules/npm/bin/npm-cli.js
lrwxrwxrwx 1 iojs iojs       38 Jan 19 09:52 npx -> ../lib/node_modules/npm/bin/npx-cli.js
[iojs@test-ibm-rhel7-s390x-2 smoker]$ ls -al lib/node_modules/
total 16
drwxr-xr-x 4 iojs iojs 4096 Jan 19 09:52 .
drwxr-xr-x 3 iojs iojs 4096 Jan 19 09:52 ..
drwxr-xr-x 2 iojs iojs 4096 Jan 19 09:52 .bin
drwxr-xr-x 5 iojs iojs 4096 Jan 19 09:52 citgm
[iojs@test-ibm-rhel7-s390x-2 smoker]$

In fact the entire npm directory is missing from smoker/lib/node_modules. A look at the logs suggests npm has unexpectedly removed a lot of packages at the install citgm step:
e.g. https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/2574/nodes=rhel7-s390x/console

14:52:40 + npm install --prefix=/home/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker --global --loglevel=error citgm
14:52:40 
14:52:48 added 209 packages, removed 251 packages, and audited 210 packages in 8s
14:52:48 
14:52:48 15 packages are looking for funding
14:52:48   run `npm fund` for details
14:52:48 
14:52:48 found 0 vulnerabilities

Further investigation has revealed:

Only occurring on test-ibm-rhel7-s390x-2 would suggest perhaps a machine specific issue but so far I've not been able to find an obvious difference between it and the other non-affected machines. As the problem only occurs on commits after the npm 7.4.2 update it could also be a regression in npm, but we're not seeing the problem elsewhere.

I'm going to take test-ibm-rhel7-s390x-2 offline temporarily.

cc @nodejs/releasers @nodejs/npm

@richardlau
Copy link
Member Author

FWIW with loglevel silly we can see:
With nodejs/node@1f287b0, https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/2583/nodes=rhel7-s390x/console

19:11:35 npm timing idealTree:buildDeps Completed in 9024ms
19:11:35 npm timing idealTree:fixDepFlags Completed in 3ms
19:11:35 npm timing idealTree Completed in 9034ms
19:11:35 npm timing reify:loadTrees Completed in 9156ms
19:11:35 npm sill reify mark deleted [
19:11:35 npm sill reify   '/data/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/npm',
19:11:35 npm sill reify   '/data/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/.bin/npm',
19:11:35 npm sill reify   '/data/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/.bin/npx'
19:11:35 npm sill reify ]
19:11:35 npm timing reify:diffTrees Completed in 8ms
19:11:35 npm sill reify mark retired [
19:11:35 npm sill reify   '/data/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/npm',
19:11:35 npm sill reify   '/data/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/.bin/npm',
19:11:35 npm sill reify   '/data/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/.bin/npx'
19:11:35 npm sill reify ]
19:11:35 npm sill reify moves {
19:11:35 npm sill reify   '/data/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/npm': '/data/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/.npm-XSSEkfPA',
19:11:35 npm sill reify   '/data/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/.bin/npm': '/data/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/.bin/.npm-by5McUm2',
19:11:35 npm sill reify   '/data/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/.bin/npx': '/data/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/.bin/.npx-80vfLtlW'
19:11:35 npm sill reify }
19:11:35 npm timing reify:retireShallow Completed in 4ms
19:11:35 npm timing reify:createSparse Completed in 23ms

whereas with nodejs/node@f8ab632 (commit before npm 7.4.2 update), https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/2582/nodes=rhel7-s390x/consoleFull:

18:10:56 npm timing idealTree:buildDeps Completed in 7652ms
18:10:56 npm timing idealTree:fixDepFlags Completed in 3ms
18:10:56 npm timing idealTree Completed in 7662ms
18:10:56 npm timing reify:loadTrees Completed in 7778ms
18:10:56 npm timing reify:diffTrees Completed in 2ms
18:10:56 npm sill reify moves {}
18:10:56 npm timing reify:retireShallow Completed in 0ms
18:10:56 npm timing reify:createSparse Completed in 19ms

@MylesBorins
Copy link
Member

/cc @nodejs/npm as a heads up!

@wraithgar
Copy link

Is /home/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker itself an actual folder or is it a symlink?

@richardlau
Copy link
Member Author

A folder.

@richardlau
Copy link
Member Author

FWIW I temporarily reenabled the host and kicked off a CI for nodejs/node#37018 (npm 7.4.3): https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/2587/nodes=rhel7-s390x/console
Problem persists.

@ruyadorno
Copy link
Member

@richardlau we can try to book some time to pair on it next week if you're available 😊

@richardlau
Copy link
Member Author

Is /home/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker itself an actual folder or is it a symlink?

A folder.

Actually digging further it looks like /home/iojs is a symlink:

[iojs@test-ibm-rhel7-s390x-2 /]$ ls -al home
total 24
drwxr-xr-x.  4 root   root    4096 Dec 13  2019 .
dr-xr-xr-x. 19 root   root    4096 Dec  9  2019 ..
lrwxrwxrwx   1 root   root      10 Dec 13  2019 iojs -> /data/iojs
drwx------.  3 linadm linadm  4096 Aug 23  2016 linadm
drwx------.  7 linux1 linux1 12288 Apr  7  2020 linux1
[iojs@test-ibm-rhel7-s390x-2 /]$

whereas it's a real directory on the other three LinuxONE hosts. I'm not sure why there's a discrepency (@AshCripps would you happen to know?).

I saw there were some fixes in this area in npm 7.5.0 (nodejs/node#37117) so I started another citgm-smoker run on test-ibm-rhel7-s390x-2 with npm 7.5.0: https://ci.nodejs.org/job/citgm-smoker/2605/nodes=rhel7-s390x/console
That run doesn't delete npm (🎉) but it looks like it's created the binaries for citgm in lib/node_modules/.bin instead of bin:

  • On test-ibm-rhel7-s390x-2 (bad):
[iojs@test-ibm-rhel7-s390x-2 smoker]$ ls -al /home/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/bin/
total 76736
drwxr-xr-x 2 iojs iojs     4096 Jan 29 12:33 .
drwxr-xr-x 6 iojs iojs     4096 Jan 29 12:33 ..
-rwxr-xr-x 1 iojs iojs 78568000 Jan 29 12:33 node
lrwxrwxrwx 1 iojs iojs       38 Jan 29 12:33 npm -> ../lib/node_modules/npm/bin/npm-cli.js
lrwxrwxrwx 1 iojs iojs       38 Jan 29 12:33 npx -> ../lib/node_modules/npm/bin/npx-cli.js
[iojs@test-ibm-rhel7-s390x-2 smoker]$ ls -al /home/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/.bin/
total 8
drwxr-xr-x 2 iojs iojs 4096 Jan 29 12:33 .
drwxr-xr-x 5 iojs iojs 4096 Jan 29 12:33 ..
lrwxrwxrwx 1 iojs iojs   21 Jan 29 12:33 citgm -> ../citgm/bin/citgm.js
lrwxrwxrwx 1 iojs iojs   25 Jan 29 12:33 citgm-all -> ../citgm/bin/citgm-all.js
[iojs@test-ibm-rhel7-s390x-2 smoker]$
  • On test-ibm-rhel7-s390x-1 (as expected):
-bash-4.2$ ls -al /home/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/bin/
total 73288
drwxr-xr-x 2 iojs iojs     4096 Jan 26 07:24 .
drwxr-xr-x 6 iojs iojs     4096 Jan 26 07:38 ..
lrwxrwxrwx 1 iojs iojs       38 Jan 26 07:24 citgm -> ../lib/node_modules/citgm/bin/citgm.js
lrwxrwxrwx 1 iojs iojs       42 Jan 26 07:24 citgm-all -> ../lib/node_modules/citgm/bin/citgm-all.js
-rwxr-xr-x 1 iojs iojs 75038624 Jan 26 07:24 node
lrwxrwxrwx 1 iojs iojs       38 Jan 26 07:24 npm -> ../lib/node_modules/npm/bin/npm-cli.js
lrwxrwxrwx 1 iojs iojs       38 Jan 26 07:24 npx -> ../lib/node_modules/npm/bin/npx-cli.js
-bash-4.2$ ls -al /home/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/.bin/
ls: cannot access /home/iojs/build/workspace/citgm-smoker/nodes/rhel7-s390x/smoker/lib/node_modules/.bin/: No such file or directory
-bash-4.2$

@AshCripps
Copy link
Member

That is very odd - they should all be the same as I just ran the ansible scripts on them. It might be worth emailing them to get them to rebuild the machine/see if they can explain this?

@richardlau
Copy link
Member Author

richardlau commented Feb 1, 2021

That is very odd - they should all be the same as I just ran the ansible scripts on them. It might be worth emailing them to get them to rebuild the machine/see if they can explain this?

I'd like to hold off doing that right now as it looks like the current set up hits some sort of npm edge case (now we have identified an obvious difference between it and the other hosts).

@ruyadorno
Copy link
Member

Actually digging further it looks like /home/iojs is a symlink:

oooh that does rings a bell! I think we have some reports of installs in symlinked folders not working as expected and some fixes were shipped in 7.5.0 so that would make sense! ref: npm/arborist#210

@richardlau
Copy link
Member Author

Actually digging further it looks like /home/iojs is a symlink:

oooh that does rings a bell! I think we have some reports of installs in symlinked folders not working as expected and some fixes were shipped in 7.5.0 so that would make sense! ref: npm/arborist#210

👍. There's still some odd behaviour with npm 7.5.0 (citgm binaries appear to be put in the wrong place): #2525 (comment)

@isaacs
Copy link

isaacs commented Feb 1, 2021

This is an npm bug. Will be fixed in today's release.

isaacs added a commit to npm/arborist that referenced this issue Feb 1, 2021
Also fixes up a few other places where we were checking isRoot and
should have been checking isProjectRoot.

Fix: nodejs/build#2525
isaacs added a commit to npm/arborist that referenced this issue Feb 1, 2021
Also fixes up a few other places where we were checking isRoot and
should have been checking isProjectRoot.

Fix: nodejs/build#2525

PR-URL: #217
Credit: @isaacs
Close: #217
Reviewed-by: @nlf
@richardlau
Copy link
Member Author

https://ci.nodejs.org/job/citgm-smoker/2612/nodes=rhel7-s390x/ (Node.js master with npm 7.5.1) appears to be behaving as expected on test-ibm-rhel7-s390x-2 (i.e. the job has successfully installed citgm and is running tests).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants