Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

investigate flaky test-child-process-exec-abortcontroller-promisified on Windows CI #37568

Closed
Trott opened this issue Mar 2, 2021 · 9 comments
Labels
child_process Issues and PRs related to the child_process subsystem. flaky-test Issues and PRs related to the tests with unstable failures on the CI. windows Issues and PRs related to the Windows platform.

Comments

@Trott
Copy link
Member

Trott commented Mar 2, 2021

Surprised there isn't already an issue for this one. It's been happening a lot lately. I think it is only on the 32-bit Windows in CI.

Here's an example:

  • Test: test-child-process-exec-abortcontroller-promisified
  • Platform: Win2012r2 VS 2013 x64 (test-rackspace-win2012r2_vs2013-x64-1)
  • Console Output:
00:29:58 not ok 72 parallel/test-child-process-exec-abortcontroller-promisified
00:29:58   ---
00:29:58   duration_ms: 0.174
00:29:58   severity: fail
00:29:58   exitcode: 1
00:29:58   stack: |-
00:29:58     node:internal/process/promises:245
00:29:58               triggerUncaughtException(err, true /* fromPromise */);
00:29:58               ^
00:29:58     
00:29:58     [AssertionError [ERR_ASSERTION]: post aborted sync signal failed] {
00:29:58       generatedMessage: false,
00:29:58       code: 'ERR_ASSERTION',
00:29:58       actual: Error: Command failed: TIMEOUT 120
00:29:58       ERROR: Input redirection is not supported, exiting the process immediately.
00:29:58     
00:29:58       
00:29:58           at ChildProcess.exithandler (node:child_process:326:12)
00:29:58           at ChildProcess.emit (node:events:378:20)
00:29:58           at maybeClose (node:internal/child_process:1067:16)
00:29:58           at Socket.<anonymous> (node:internal/child_process:453:11)
00:29:58           at Socket.emit (node:events:378:20)
00:29:58           at Pipe.<anonymous> (node:net:671:12) {
00:29:58         killed: false,
00:29:58         code: 1,
00:29:58         signal: null,
00:29:58         cmd: 'TIMEOUT 120',
00:29:58         stdout: '',
00:29:58         stderr: 'ERROR: Input redirection is not supported, exiting the process immediately.\r\n'
00:29:58       },
00:29:58       expected: /AbortError/,
00:29:58       operator: 'rejects'
00:29:58     }
00:29:58   ...
@Trott
Copy link
Member Author

Trott commented Mar 2, 2021

This seems likely to have been introduced by #37325. @Linkgoron @benjamingr

@Trott
Copy link
Member Author

Trott commented Mar 2, 2021

@nodejs/testing @nodejs/platform-windows @nodejs/child_process

@Trott Trott added child_process Issues and PRs related to the child_process subsystem. flaky-test Issues and PRs related to the tests with unstable failures on the CI. windows Issues and PRs related to the Windows platform. labels Mar 2, 2021
@Linkgoron
Copy link
Member

Linkgoron commented Mar 2, 2021

Yes, this is obviously my added test. Looks an issue with background timeout:
https://www.ibm.com/support/pages/timeout-command-run-batch-job-exits-immediately-and-returns-error-input-redirection-not-supported-exiting-process-immediately

I'd be happy to replace the timeout command with anything long-running. The test needs something long-running enough to give the node process time to kill it. The above suggests using ping -n 10 127.0.0.1 >NUL (for a 10 second pause). Another option on Windows would be executing a node process with the "never-ending" js fixture. Originally, that was how I wanted to write the test but it doesn't work on Linux (as killing the cp that's created with exec on linux doesn't kill the underlying Node process because of the shell, but on windows it works as expected).

Have you seen any issues with the sleep command, which is what I used on Linux?

@aduh95
Copy link
Contributor

aduh95 commented Mar 2, 2021

Could we use node -e 'setTimeout(()=>{}, 9999)'? It should be cross platform work on Windows.

@Trott
Copy link
Member Author

Trott commented Mar 2, 2021

Could we use node -e 'setTimeout(()=>{}, 9999)'? It should be cross plateform.

Or even node -e 'setInterval(()=>{}, 9999)'? (The overall test timeout is handled by tools/test.py so we can rely on that for platform-specific timeouts automatically.)

@Linkgoron
Copy link
Member

Linkgoron commented Mar 2, 2021

Could we use node -e 'setTimeout(()=>{}, 9999)'? It should be cross plateform.

Yes, but it needs something with a low enough timeout, so that it would die close enough to the end of the test but also high enough so that it would 100% get killed by the node process (test) before the timeout is up. The issue is that at the end of the tests, the tests use ps awwx to check if there's a node process that's still running.

@Trott setInterval didn't work for me, that was my original solution - before relying on TIMEOUT/sleep. see #37518

@Linkgoron
Copy link
Member

Could we use node -e 'setTimeout(()=>{}, 9999)'? It should be cross platform work on Windows.

On Windows, even setInterval would work

@aduh95
Copy link
Contributor

aduh95 commented Mar 2, 2021

A potential fix would be to keep using sleep on linux, and use setInterval everywhere else, right?

@Linkgoron
Copy link
Member

I'm not sure why we would change everything if the issue is only on windows, but your fix should work.

aduh95 added a commit to aduh95/node that referenced this issue Mar 4, 2021
@jasnell jasnell closed this as completed in 4fde7dc Mar 5, 2021
danielleadams pushed a commit that referenced this issue Mar 16, 2021
Fixes: #37568

PR-URL: #37572
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
targos pushed a commit that referenced this issue Sep 7, 2021
Fixes: #37568

PR-URL: #37572
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
child_process Issues and PRs related to the child_process subsystem. flaky-test Issues and PRs related to the tests with unstable failures on the CI. windows Issues and PRs related to the Windows platform.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants