Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ci] test-all-versions failing for @cucumber/cucumber@9.1.2 and @aws-sdk/client-s3@3.377.0 #1828

Closed
pichlermarc opened this issue Nov 27, 2023 · 9 comments · Fixed by #1838
Closed
Assignees
Labels
bug Something isn't working internal pkg:instrumentation-aws-sdk pkg:instrumentation-cucumber priority:p4 Bugs and spec inconsistencies which do not fall into a higher prioritization

Comments

@pichlermarc
Copy link
Member

What happened?

The TAV script currently fails in CI only (tested locally but could not reproduce) for @opentelemetry/instrumentation-cucumber with @cucumber/cucumber@9.1.2.

It looks like it cannot resolve the package @cucumber/messages for some reason - when I run npm run test-all-versions locally, everything works fine.

Logs:

cc @Ugzuzg (component-owner)

@pichlermarc pichlermarc added bug Something isn't working internal priority:p4 Bugs and spec inconsistencies which do not fall into a higher prioritization pkg:instrumentation-cucumber needs:reproducer This bug/feature is in need of a minimal reproducer labels Nov 27, 2023
@Ugzuzg
Copy link
Contributor

Ugzuzg commented Nov 27, 2023

Same. Works locally.

@pichlermarc
Copy link
Member Author

Looks like this affects more than just the cucumber instrumentation, having a similar problem over at an instrumentation-aws-sdk PR. I'll re-label the issue accordingly.

Sorry for the noise

@pichlermarc pichlermarc changed the title [instrumentation-cucumber] tests failing on main for @cucumber/cucumber@9.1.2 [ci] test-all-versions failing @cucumber/cucumber@9.1.2 and @aws-sdk/client-s3@3.377.0 Nov 28, 2023
@pichlermarc pichlermarc changed the title [ci] test-all-versions failing @cucumber/cucumber@9.1.2 and @aws-sdk/client-s3@3.377.0 [ci] test-all-versions failing for @cucumber/cucumber@9.1.2 and @aws-sdk/client-s3@3.377.0 Nov 28, 2023
@trentm
Copy link
Contributor

trentm commented Nov 29, 2023

I can reproduce the @cucumber/cucumber@9.1.2 issue locally. I'll add some details in a bit. I had a frustrating afternoon yesterday trying to understand the failing vs. not failing case.

@trentm
Copy link
Contributor

trentm commented Nov 29, 2023

Repro for the cucumber TAV (test-all-versions) failure.

tl;dr:

npm ci
npx lerna run --scope @opentelemetry/instrumentation-cucumber test-all-versions

With details:

$ cd .../opentelemetry-js-contrib  # top dir of your clone

$ npm ci   # start with a clean layout of node_modules/... dirs

# Observe the current install location of all the '@cucumber/...' packages. These aren't all 
# the deps of '@cucumber/cucumber', but enough to see the issue.
$ ls **/node_modules/@cucumber/*/package.json
node_modules/@cucumber/ci-environment/package.json
node_modules/@cucumber/cucumber-expressions/package.json
node_modules/@cucumber/cucumber/package.json
node_modules/@cucumber/gherkin-streams/package.json
node_modules/@cucumber/gherkin-utils/node_modules/@cucumber/gherkin/package.json
node_modules/@cucumber/gherkin-utils/node_modules/@cucumber/messages/package.json
node_modules/@cucumber/gherkin-utils/package.json
node_modules/@cucumber/gherkin/package.json
node_modules/@cucumber/html-formatter/package.json
node_modules/@cucumber/message-streams/package.json
node_modules/@cucumber/messages/package.json
node_modules/@cucumber/tag-expressions/package.json

# Run the TAV tests for cucumber rougly the way the 'test-all-versions.yml' action does.
$ npx lerna run --scope @opentelemetry/instrumentation-cucumber test-all-versions
...
-- required packages ["@cucumber/cucumber@9.1.2"]
-- installing ["@cucumber/cucumber@9.1.2"]
-- running test "npm test" with @cucumber/cucumber (env: {})
> @opentelemetry/instrumentation-cucumber@0.1.2 test
> nyc ts-mocha -p tsconfig.json 'test/**/*.test.ts'
Error: Cannot find module '@cucumber/messages'
Require stack:
- /Users/trentm/tm/opentelemetry-js-contrib4/node_modules/@cucumber/gherkin-streams/dist/src/makeGherkinOptions.js
...

# Observer the install location changes now.
$ ls **/node_modules/@cucumber/*/package.json
node_modules/@cucumber/gherkin-streams/package.json
node_modules/@cucumber/gherkin-utils/node_modules/@cucumber/gherkin/package.json
node_modules/@cucumber/gherkin-utils/node_modules/@cucumber/messages/package.json
node_modules/@cucumber/gherkin-utils/package.json
node_modules/@cucumber/message-streams/package.json
node_modules/@cucumber/tag-expressions/package.json
plugins/node/instrumentation-cucumber/node_modules/@cucumber/ci-environment/package.json
plugins/node/instrumentation-cucumber/node_modules/@cucumber/cucumber-expressions/package.json
plugins/node/instrumentation-cucumber/node_modules/@cucumber/cucumber/package.json
plugins/node/instrumentation-cucumber/node_modules/@cucumber/gherkin/package.json
plugins/node/instrumentation-cucumber/node_modules/@cucumber/html-formatter/package.json
plugins/node/instrumentation-cucumber/node_modules/@cucumber/messages/package.json

# Those install locations mean import those deps are broken.
$ (cd plugins/node/instrumentation-cucumber; node -e "require('@cucumber/cucumber')")
node:internal/modules/cjs/loader:1031
  throw err;
  ^
Error: Cannot find module '@cucumber/messages'
...

Note, in case it matters: I'm doing my tests with node v16.20.2 (npm v8.19.4) -- for the unrelated reason that npm ci is MUCH faster for me with Node.js 16 than with Node.js 18 and I haven't had a chance to look into why.

when I run npm run test-all-versions locally, everything works fine.

Yup, it works this way.

$ npm ci
$ cd plugins/node/instrumentation-cucumber
$ npm run test-all-versions
...

$ cd ../../../
$ ls **/node_modules/@cucumber/*/package.json
node_modules/@cucumber/gherkin-streams/package.json
node_modules/@cucumber/gherkin-utils/node_modules/@cucumber/gherkin/package.json
node_modules/@cucumber/gherkin-utils/node_modules/@cucumber/messages/package.json
node_modules/@cucumber/gherkin-utils/package.json
node_modules/@cucumber/gherkin/package.json
node_modules/@cucumber/message-streams/package.json
node_modules/@cucumber/messages/package.json
node_modules/@cucumber/tag-expressions/package.json
plugins/node/instrumentation-cucumber/node_modules/@cucumber/ci-environment/package.json
plugins/node/instrumentation-cucumber/node_modules/@cucumber/cucumber-expressions/package.json
plugins/node/instrumentation-cucumber/node_modules/@cucumber/cucumber/package.json
plugins/node/instrumentation-cucumber/node_modules/@cucumber/gherkin/package.json
plugins/node/instrumentation-cucumber/node_modules/@cucumber/html-formatter/package.json
plugins/node/instrumentation-cucumber/node_modules/@cucumber/messages/package.json

In this case, the layout is slightly different. The main difference is that @cucumber/messages is in the top-level node_modules dir.

a slightly quicker repro

The failure reproduces with just that single cucumber version tested, so this change means the repro is faster, in case someone else is trying to debug this as well.

diff --git a/plugins/node/instrumentation-cucumber/.tav.yml b/plugins/node/instrumentation-cucumber/.tav.yml
index c1930c8c..bf4f19f3 100644
--- a/plugins/node/instrumentation-cucumber/.tav.yml
+++ b/plugins/node/instrumentation-cucumber/.tav.yml
@@ -1,3 +1,4 @@
 '@cucumber/cucumber':
-  versions: '^8.0.0 || ^9.0.0'
+  # versions: '^8.0.0 || ^9.0.0'
+  versions: '9.1.2'  # XXX failure reproduces with just this version tested
   commands: npm test

@trentm
Copy link
Contributor

trentm commented Nov 30, 2023

I was getting ready to blame this on npm install being broken -- for this npm workspaces case and when called in the way that lerna/nx are calling it. However, the import failure is because @cucumber/messages can't be found by the @cucumber/gherkin-streams package ... and @cucumber/gherkin-streams doesn't actually have a dependency on @cucumber/messages. It has a peerDep and a devDep on it.

Only these 3 packages have a "dependencies" entry for @cucumber/messages:

% cat  **/node_modules/@cucumber/*/package.json | json -ga -c 'this?.dependencies && this.dependencies["@cucumber/messages"]' name
@cucumber/gherkin
@cucumber/gherkin-utils
@cucumber/cucumber
@cucumber/gherkin

So, I think ultimately I'd argue that @cucumber/gherkin-streams (and I think at least one other cucumber package) have a bug that they use @cucumber/messages but don't have an "dependencies" entry for it. At least that's for my old-timey understanding of peerDependencies entries being completely advisory. I need to learn a bit more about peerDeps.

Note that our CI sets legacy-peer-deps=true (https://docs.npmjs.com/cli/v8/using-npm/config#legacy-peer-deps):

+      - name: Legacy Peer Dependencies for npm 7
+        if: matrix.container == 'node:16'
+        run: npm config set legacy-peer-deps=true

which was added back in #614 when using hoisting with lerna to speed up npm installs. This could be related, but I don't think so: because IIRC we get this failure in CI with node v14 (which has npm@6 which ignores peerDependencies) as well. Also, locally I've been testing with legacy-peer-deps unset (it defaults to false).

(Aside: I wonder if that means that legacy-peer-deps might not be unnecessary for our node v16 installs.)

@trentm
Copy link
Contributor

trentm commented Nov 30, 2023

GAH! It is totally because of legacy-peer-deps=true.

I ran with this diff:

diff --git a/plugins/node/instrumentation-cucumber/package.json b/plugins/node/instrumentation-cucumber/package.json
index c44c00f9..22312787 100644
--- a/plugins/node/instrumentation-cucumber/package.json
+++ b/plugins/node/instrumentation-cucumber/package.json
@@ -7,7 +7,7 @@
   "repository": "open-telemetry/opentelemetry-js-contrib",
   "scripts": {
     "test": "nyc ts-mocha -p tsconfig.json 'test/**/*.test.ts'",
-    "test-all-versions": "tav",
+    "test-all-versions": "npm config ls && tav #XXX",
     "tdd": "npm run test -- --watch-extensions ts --watch",
     "clean": "rimraf build/*",
     "lint": "eslint . --ext .ts",

A full failing repro run: https://gist.github.com/trentm/efb0a8586351878a74f24256ced15af4
A full passing run (running npm run test-all-versions in the subpackage dir): https://gist.github.com/trentm/f02b71fe09b91bcb6c8d1e693ad22fb1

Note this in the failing run:

> npm config ls && tav #XXX
...
; "env" config from environment
...
legacy-peer-deps = true
...

The passing run does not have legacy-peer-deps set, and signs are pointing to that being the relevant diff.
But what is setting legacy-peer-deps?
I had earlier grepped for legacy-peer-deps and legacyPeerDeps and didn't find anything setting it, but...

% rg -g '*' legacy_peer_deps
node_modules/nx/src/utils/package-manager.js
76:            (_a = (_b = process.env).npm_config_legacy_peer_deps) !== null && _a !== void 0 ? _a : (_b.npm_config_legacy_peer_deps = 'true');

Gah. I haven't traced this yet, but I suspect nx's code path for running commands in sub-packages is setting that config var and that breaks our install.

@trentm
Copy link
Contributor

trentm commented Nov 30, 2023

This passes:

npm_config_legacy_peer_deps=false npx lerna run --scope @opentelemetry/instrumentation-cucumber test-all-versions

TODO:

  • why does nx do this?
  • can we set npm_config_legacy_peer_deps=false for npx lerna run --scope ... test-all-versions for all packages, or do we need to pick and choose?
  • pretty sure we can drop the legacy-peer-deps=true setting in our CI workflows, because we are using npm ci now that just uses the layout described in package-lock.json.

@trentm
Copy link
Contributor

trentm commented Nov 30, 2023

Some nx history:

commit 72ce35264f1dd5f5029e7ac463a83efb252622a7
Date:   2021-05-04T16:02:12-04:00 (2 years, 7 months ago)

    fix(core): change legacy_peer_deps to npm env variable (#5541)

...

commit 2604711f5fc08c7a80d7900094e477510f120ee7
Date:   2021-01-27T21:32:55-05:00 (2 years, 10 months ago)

    fix(core): add --legacy-peer-deps to npm installs to handle npm 7 related issues

The PR and issue do not provide much detail.

I am inclined to avoid nx (and hence lerna) for running scripts in subpackages, just to avoid the bulk of code involved in doing this -- but that is longer term. I'll work on a fix/workaround, probably using npm run --ws .... npm run --ws ... does not support running in parallel, but that's fine for the TAV tests which CI serializes anyway:

      - name: Run test-all-versions
        run: npx lerna run test-all-versions ${{ inputs.lerna-args }} ${{ matrix.lerna-extra-args }} --stream --concurrency 1

@trentm trentm self-assigned this Nov 30, 2023
@trentm
Copy link
Contributor

trentm commented Nov 30, 2023

@aws-sdk/client-s3@3.377.0 repro

To make for a faster repro, first change "plugins/node/opentelemetry-instrumentation-aws-sdk/.tav.yml" to be just this:

"@aws-sdk/client-s3":
  # versions: ">=3.223.0 || 3.218.0 || 3.216.0 || 3.154.0 || 3.107.0 || 3.54.0 || 3.6.1"
  versions: "3.377.0"
  commands:
    - npm run test

Then run:

npm ci
npx lerna run test-all-versions --scope @opentelemetry/instrumentation-aws-sdk

The result:

% npx lerna run test-all-versions --scope @opentelemetry/instrumentation-aws-sdk
lerna notice cli v6.6.2
lerna info versioning independent
lerna notice filter including "@opentelemetry/instrumentation-aws-sdk"
lerna info filter [ '@opentelemetry/instrumentation-aws-sdk' ]

> @opentelemetry/instrumentation-aws-sdk:test-all-versions

> @opentelemetry/instrumentation-aws-sdk@0.37.0 test-all-versions
> tav
-- required packages ["@aws-sdk/client-s3@3.377.0"]
-- installing ["@aws-sdk/client-s3@3.377.0"]
-- running test "npm run test" with @aws-sdk/client-s3 (env: {})
> @opentelemetry/instrumentation-aws-sdk@0.37.0 test
> nyc ts-mocha -p tsconfig.json --require '@opentelemetry/contrib-test-utils' 'test/**/*.test.ts'
Error: Cannot find module '@smithy/hash-stream-node'
Require stack:
- /Users/trentm/tm/opentelemetry-js-contrib4/node_modules/@aws-sdk/client-s3/dist-cjs/runtimeConfig.js
- /Users/trentm/tm/opentelemetry-js-contrib4/node_modules/@aws-sdk/client-s3/dist-cjs/S3Client.js
- /Users/trentm/tm/opentelemetry-js-contrib4/node_modules/@aws-sdk/client-s3/dist-cjs/index.js
- /Users/trentm/tm/opentelemetry-js-contrib4/plugins/node/opentelemetry-instrumentation-aws-sdk/test/aws-sdk-v3.test.ts
- /Users/trentm/tm/opentelemetry-js-contrib4/node_modules/mocha/lib/esm-utils.js
- /Users/trentm/tm/opentelemetry-js-contrib4/node_modules/mocha/lib/mocha.js
- /Users/trentm/tm/opentelemetry-js-contrib4/node_modules/mocha/lib/cli/one-and-dones.js
- /Users/trentm/tm/opentelemetry-js-contrib4/node_modules/mocha/lib/cli/options.js
- /Users/trentm/tm/opentelemetry-js-contrib4/node_modules/mocha/bin/mocha
    at Function.Module._resolveFilename (node:internal/modules/cjs/loader:1028:15)
    at Function.Module._load (node:internal/modules/cjs/loader:873:27)
    at Module.require (node:internal/modules/cjs/loader:1100:19)
...

the issue

This is a different problem. tl;dr: @aws-sdk/client-s3@3.377.0 was just a bad release.

If we look for usages of */hash-stream-node in an install of @aws-sdk/client-s3@3.377.0:

[~/tmp/asdf.badclients3/node_modules/@aws-sdk/client-s3]
% rg hash-stream-node
package.json
30:    "@aws-sdk/hash-stream-node": "*",

dist-es/runtimeConfig.js
9:import { readableStreamHasher as streamHasher } from "@smithy/hash-stream-node";

dist-cjs/runtimeConfig.js
13:const hash_stream_node_1 = require("@smithy/hash-stream-node");

We can see that the code changed to using the @smithy/hash-stream-node module, but the package.json "dependencies" still point to the old @aws-sdk/hash-stream-node package. Neither the release notes for 3.377.0 nor 3.378.0 say anything about it. However, 3.378.0 was released the next day:

% npm info @aws-sdk/client-s3 time
...
  '3.377.0': '2023-07-25T21:15:57.826Z',
  '3.378.0': '2023-07-26T19:35:54.575Z',
...

And installing 3.377.0 yields a manually added npm deprecation warning:

% npm install @aws-sdk/client-s3@3.377.0
npm WARN deprecated @aws-sdk/hash-stream-node@3.374.0: This package has moved to @smithy/hash-stream-node
...

the fix

I think we just need to make sure to avoid testing this version in the TAV tests. I'll update the .tav.yml. It could also use an update because currently it is testing ~144 versions of @aws-sdk/client-s3.

trentm added a commit to trentm/opentelemetry-js-contrib that referenced this issue Nov 30, 2023
- Switch to 'npm run --ws test-all-versions ...' for running TAV tests,
  instead of 'lerna run test-all-versions ...' because nx sets
  'npm_config_legacy_peer_deps=true' which breaks
  '@cucumber/cucumber@9.1.2' install and could break other installs by
  ignoring 'peerDependencies'.
- Skip the bad '@aws-sdk/client-s3@3.377.0' release in TAV tests.

Also:
- Reduce the number of versions of 'aws-sdk' and '@aws-sdk/*' packages
  test in TAV tests from 249, 143, and 132 versions currently, to 7
  each.
- Add a top-level `npm run test-all-versions` script to run that script
  in all packages that have one. This is the equivalent of the
  "test-all-versions.yml" CI workflow.

Fixes: open-telemetry#1828
@trentm trentm removed the needs:reproducer This bug/feature is in need of a minimal reproducer label Nov 30, 2023
trentm added a commit that referenced this issue Dec 11, 2023
)

* test: fix TAV tests for @cucumber/cucumber and @aws-sdk/client-s3

- Switch to 'npm run --ws test-all-versions ...' for running TAV tests,
  instead of 'lerna run test-all-versions ...' because nx sets
  'npm_config_legacy_peer_deps=true' which breaks
  '@cucumber/cucumber@9.1.2' install and could break other installs by
  ignoring 'peerDependencies'.
- Skip the bad '@aws-sdk/client-s3@3.377.0' release in TAV tests.

Also:
- Reduce the number of versions of 'aws-sdk' and '@aws-sdk/*' packages
  test in TAV tests from 249, 143, and 132 versions currently, to 7
  each.
- Add a top-level `npm run test-all-versions` script to run that script
  in all packages that have one. This is the equivalent of the
  "test-all-versions.yml" CI workflow.

Fixes: #1828
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working internal pkg:instrumentation-aws-sdk pkg:instrumentation-cucumber priority:p4 Bugs and spec inconsistencies which do not fall into a higher prioritization
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants