chore(common): add retry checks around npm install (and any other npm network activities) #10350

Closed
mcdurdin opened this issue Jan 10, 2024 · 10 comments · Fixed by #11451

@mcdurdin
Member

@jahorton can you figure out why the Test: Language Modeling Layer (Common) build failed? https://build.palaso.org/viewLog.html?buildId=433516&buildTypeId=Keyman_Common_LMLayer_TestPullRequests

01:52:46   npm ERR! npm ERR! errno ECONNRESET
01:52:46   npm ERR! npm ERR! network Invalid response body while trying to fetch https://registry.npmjs.org/@parcel%2flogger: aborted
01:52:46   npm ERR! npm ERR! network This is a problem related to network connectivity.
01:52:46   npm ERR! npm ERR! network In most cases you are behind a proxy or have bad network settings.

This has been happening from time to time for the past few months. It is not tied to that specific package; connectivity is simply lost while npm is retrieving some package.

@mcdurdin mcdurdin added this to the 18.0 milestone Jan 10, 2024
@mcdurdin mcdurdin added the ci Issues relating to build infrastructure label Jan 16, 2024
@mcdurdin mcdurdin modified the milestones: 18.0, A18S1 Apr 19, 2024
@jahorton
Contributor

jahorton commented Apr 23, 2024

I did a little searching related to this and found something that may be of interest... but it does currently have limitations.

With recent-enough versions of npm, there's now a way to cache recent npm installs and prioritize use of the cached packages, rather than always going out and re-fetching them.

But... there's a bug in which version-bumping a package causes the related option to fail, even though the option is meant to prefer the cache (rather than 'use cache exclusively'):

@jahorton
Contributor

jahorton commented Apr 24, 2024

Very relevant thread I found:

actions/runner-images#3737

This comment in particular looks relevant: actions/runner-images#3737 (comment)

It's worth noting that npm just closed a bug (pending release) where too many connections were being opened during install:

@mcdurdin
Member Author

mcdurdin commented May 2, 2024

npm 10.5.1. I think we should go ahead and update our build agents to this latest version right away, before we try to do the 17.0-stable release. Builds are now failing frequently with ECONNRESET.

@mcdurdin
Member Author

mcdurdin commented May 3, 2024

I am upgrading npm to 10.5.1 on all build agents now. We will not move to @latest until after 17.0-stable is released; this is only to mitigate the ECONNRESET bug.

  • ba-win10-64-s1-601
  • ba-win10-64-pp-602
  • ba-bionic-64-ta
  • ba-jammy-64-ta
  • ba-macos-keyman-1

@mcdurdin
Member Author

mcdurdin commented May 8, 2024

Note, after applying the update to 10.5.1, we still get ECONNRESET, e.g. on ba-bionic-64-ta (https://build.palaso.org/buildConfiguration/Keyman_Test_Common_Linux/460549):

11:08:45   [common/web/keyman-version] ## configure starting...
11:08:48   npm WARN skipping integrity check for git dependency ssh://git@github.com/keymanapp/dependency-node-xml2js.git
11:08:48   npm WARN skipping integrity check for git dependency ssh://git@github.com/keymanapp/dependency-restructure.git
11:08:48   npm WARN skipping integrity check for git dependency ssh://git@github.com/keymanapp/dependency-restructure.git
11:08:52   npm WARN deprecated @npmcli/move-file@2.0.1: This functionality has been moved to @npmcli/fs
11:09:31   npm ERR! code 1
11:09:31   npm ERR! git dep preparation failed
11:09:31   npm ERR! command /home/bob/.nvm/versions/node/v18.16.0/bin/node /home/bob/.nvm/versions/node/v18.16.0/lib/node_modules/npm/bin/npm-cli.js install --force --cache=/home/bob/.npm --prefer-offline=false --prefer-online=false --offline=false --no-progress --no-save --no-audit --include=dev --include=peer --include=optional --no-package-lock-only --no-dry-run
11:09:31   npm ERR! npm WARN using --force Recommended protections disabled.
11:09:31   npm ERR! npm ERR! code ECONNRESET
11:09:31   npm ERR! npm ERR! errno ECONNRESET
11:09:31   npm ERR! npm ERR! network Invalid response body while trying to fetch https://registry.npmjs.org/@parcel%2fpackager-html: aborted
11:09:31   npm ERR! npm ERR! network This is a problem related to network connectivity.
11:09:31   npm ERR! npm ERR! network In most cases you are behind a proxy or have bad network settings.
11:09:31   npm ERR! npm ERR! network
11:09:31   npm ERR! npm ERR! network If you are behind a proxy, please make sure that the
11:09:31   npm ERR! npm ERR! network 'proxy' config is set properly.  See: 'npm help config'
11:09:31   npm ERR!
11:09:31   npm ERR! npm ERR! A complete log of this run can be found in:
11:09:31   npm ERR! npm ERR!     /home/bob/.npm/_logs/2024-05-07T04_09_02_790Z-debug-0.log
11:09:31   
11:09:31   npm ERR! A complete log of this run can be found in:
11:09:31   npm ERR!     /home/bob/.npm/_logs/2024-05-07T04_08_46_540Z-debug-0.log
11:09:31   [common/web/keyman-version] ## configure failed

The build is picking up node 18.16.0:

11:09:31   npm ERR! command /home/bob/.nvm/versions/node/v18.16.0/bin/node  [snip]

But node 18.19.0 is current with nvm:

bob@ba-bionic-64-ta:~$ nvm current
v18.19.0
bob@ba-bionic-64-ta:~$ npm --version
10.5.1
bob@ba-bionic-64-ta:~$ which npm
/home/bob/.nvm/versions/node/v18.19.0/bin/npm
bob@ba-bionic-64-ta:~$ /home/bob/.nvm/versions/node/v18.16.0/bin/node /home/bob/.nvm/versions/node/v18.16.0/lib/node_modules/npm/bin/npm-cli.js --version
9.5.1
bob@ba-bionic-64-ta:~$

This is because $PATH is defined in buildAgent.properties, and that definition still points at the node v18.16.0 directory:

env.PATH=/home/bob/.nvm/versions/node/v18.16.0/bin:/usr/local/bin:/usr/bin:/bin:/usr/lib/android-sdk/cmdline-tools/tools/bin
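
A mismatch like this could be caught before any install runs. The following is a minimal sketch, not the project's actual build code: `version_at_least` and `check_npm_version` are hypothetical helper names, the 10.5.1 minimum is simply the version mentioned in this thread, and plain `major.minor.patch` version strings are assumed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (assumptions for illustration, not part of the Keyman build scripts).

# Succeeds when dotted version $1 is >= dotted version $2 (major.minor.patch).
version_at_least() {
  local have="$1" want="$2" lowest
  # Sort the two versions numerically, field by field; the smaller one comes first.
  lowest="$(printf '%s\n%s\n' "$want" "$have" | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)"
  [ "$lowest" = "$want" ]
}

# Reports which node and npm the current PATH resolves to, and fails if npm is too old.
check_npm_version() {
  local minimum="$1" found
  found="$(npm --version)"
  echo "node: $(command -v node), npm: $(command -v npm) (${found})"
  version_at_least "$found" "$minimum"
}

# Example use at the top of a build step:
#   check_npm_version 10.5.1 || { echo "npm on PATH is older than expected" >&2; exit 1; }
```

Run under the build agent's own environment, a check along these lines would report the v18.16.0 path and npm 9.5.1 rather than whatever an interactive shell resolves.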

@darcywong00 darcywong00 modified the milestones: A18S1, A18S2 May 11, 2024
@mcdurdin
Member Author

mcdurdin commented May 13, 2024

So, just experienced ECONNRESET on ba-jammy-64-ta, which is on npm 10.5.1:

07:26:58   npm WARN skipping integrity check for git dependency ssh://git@github.com/keymanapp/dependency-node-xml2js.git
07:26:59   npm WARN skipping integrity check for git dependency ssh://git@github.com/keymanapp/dependency-restructure.git
07:26:59   npm WARN skipping integrity check for git dependency ssh://git@github.com/keymanapp/dependency-restructure.git
07:27:01   npm WARN deprecated @npmcli/move-file@2.0.1: This functionality has been moved to @npmcli/fs
07:27:58   npm ERR! code 1
07:27:58   npm ERR! git dep preparation failed
07:27:58   npm ERR! command /usr/bin/node /usr/lib/node_modules/npm/bin/npm-cli.js install --force --cache=/home/bob/.npm --prefer-offline=false --prefer-online=false --offline=false --no-progress --no-save --no-audit --include=dev --include=peer --include=optional --no-package-lock-only --no-dry-run
07:27:58   npm ERR! npm WARN using --force Recommended protections disabled.
07:27:58   npm ERR! npm ERR! code ECONNRESET
07:27:58   npm ERR! npm ERR! errno ECONNRESET
07:27:58   npm ERR! npm ERR! network Invalid response body while trying to fetch https://registry.npmjs.org/@parcel%2freporter-tracer: aborted
07:27:58   npm ERR! npm ERR! network This is a problem related to network connectivity.
07:27:58   npm ERR! npm ERR! network In most cases you are behind a proxy or have bad network settings.
07:27:58   npm ERR! npm ERR! network
07:27:58   npm ERR! npm ERR! network If you are behind a proxy, please make sure that the
07:27:58   npm ERR! npm ERR! network 'proxy' config is set properly.  See: 'npm help config'
07:27:58   npm ERR!
07:27:58   npm ERR! npm ERR! A complete log of this run can be found in: /home/bob/.npm/_logs/2024-05-13T00_27_03_752Z-debug-0.log

Verifying the version, using exactly the same npm invocation as in the failed call above:

~$ /usr/bin/node /usr/lib/node_modules/npm/bin/npm-cli.js --version
10.5.1

@jahorton
Contributor

Dang, even still? Thought for sure there'd be some relation there.

@jahorton
Contributor

There are a disturbing number of Stack Overflow answers (such as https://stackoverflow.com/questions/71449279/how-to-resolve-npm-err-code-econnreset-while-installing-angular-cli) saying "just rewrite npm's registry to use the http:// version of the registry site instead of the https:// version." That's obviously not something we want in our CI setup, though.

I don't see anything (yet) about an ECONNRESET-specific exit code for npm, but even then, if this is the only reason we fall over during CI with npm ci... we can probably just set a temporary trap for npm errors. The issue is... we don't want to drop the already-existing error trap. https://stackoverflow.com/a/7287873, with interpretation, seems to provide a way forward to "trap juggle" by "storing" the original trap. We could capture, then unset the old trap... and then put it back in place once done.
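
The "capture, unset, put back" idea could be sketched in bash roughly as follows. This is an illustration only: `run_without_err_trap` is a hypothetical name, and the sketch assumes `set -E` (errtrace) is in effect so that the ERR trap is visible inside functions and command substitutions.

```shell
#!/usr/bin/env bash
set -E   # assumption: errtrace is on, so functions and $(...) see the ERR trap

# Hypothetical helper: run a command with the ERR trap temporarily removed,
# then reinstall whatever trap was there before.
run_without_err_trap() {
  local saved_trap status=0
  saved_trap="$(trap -p ERR)"   # prints a reusable "trap -- '...' ERR" line, or nothing
  trap - ERR                    # drop the existing trap for now
  "$@" || status=$?             # run the command and remember its exit code
  eval "$saved_trap"            # put the original trap back (no-op if there was none)
  return "$status"
}

# Example use:
#   run_without_err_trap npm ci || echo "npm ci failed; decide here whether to retry"
```

The helper passes the command's exit code back to the caller, so deciding whether to retry stays outside the trap handling.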

@mcdurdin
Member Author

> I don't see anything (yet) about an ECONNRESET-specific exit code for npm, but even then, if this is the only reason we fall over during CI with npm ci... we can probably just set a temporary trap for npm errors. The issue is... we don't want to drop the already-existing error trap. https://stackoverflow.com/a/7287873, with interpretation, seems to provide a way forward to "trap juggle" by "storing" the original trap. We could capture, then unset the old trap... and then put it back in place once done.

Just use || on the npm ci line:

npm ci || (...do the retry bits)
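
As a sketch of what the retry bits could look like: `retry` below is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary illustration values. Because the command runs as the condition of `until`, a failed attempt does not trip `set -e` or an ERR trap, so no trap juggling is needed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds between attempts.
retry() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: '$*' still failing after ${attempt} attempt(s)" >&2
      return 1
    fi
    echo "retry: attempt ${attempt} of ${max_attempts} failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example use in place of a bare "npm ci":
#   retry 3 10 npm ci
```

Note that this retries on any failure, not only ECONNRESET, since no ECONNRESET-specific exit code has been identified in this thread. A genuine install error would be repeated a few times before the build fails.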

@mcdurdin
Member Author

npm on ba-win10-64-pp-602 was upgraded to 10.5.1 on 3 May 2024. On 9 May 2024, we got another failed build https://build.palaso.org/buildConfiguration/Keyman_Developer_Test/461792?buildTab=log&linesState=374&logView=flowAware&focusLine=5532

14:27:50   npm ERR! command C:\Program Files\nodejs\node.exe C:\Users\bob\AppData\Roaming\nvm\v18.17.0\node_modules\npm\bin\npm-cli.js install --force --cache=C:\Users\bob\AppData\Local\npm-cache --prefer-offline=false --prefer-online=false --offline=false --no-progress --no-save --no-audit --include=dev --include=peer --include=optional --no-package-lock-only --no-dry-run
14:27:50   npm ERR! npm WARN using --force Recommended protections disabled.
14:27:50   npm ERR! npm ERR! code ECONNRESET
14:27:50   npm ERR! npm ERR! errno ECONNRESET
14:27:50   npm ERR! npm ERR! network Invalid response body while trying to fetch https://registry.npmjs.org/@parcel%2fconfig-default: aborted
14:27:50   npm ERR! npm ERR! network This is a problem related to network connectivity.
14:27:50   npm ERR! npm ERR! network In most cases you are behind a proxy or have bad network settings.
14:27:50   npm ERR! npm ERR! network
14:27:50   npm ERR! npm ERR! network If you are behind a proxy, please make sure that the
14:27:50   npm ERR! npm ERR! network 'proxy' config is set properly.  See: 'npm help config'
14:27:50   npm ERR!
14:27:50   npm ERR! npm ERR! A complete log of this run can be found in: C:\Users\bob\AppData\Local\npm-cache\_logs\2024-05-10T07_25_11_553Z-debug-0.log
