Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows: execa leaving processes around when running npm scripts in a sh shell #433

Open
ManiacDC opened this issue Aug 14, 2020 · 11 comments
Labels

Comments

@ManiacDC
Copy link

ManiacDC commented Aug 14, 2020

Our group uses lerna to execute npm scripts in our packages, and lerna in turn uses execa (albeit an old version, but my reproduction uses the current version) to execute the npm scripts. We've set npm to use "sh" as our default shell so that programmers working on multiple platforms can use the same scripts. The below issue has become a major roadblock for our work.

We've run into an issue where node.exe processes are left around when terminating the main process - execa is not cleaning up the full process tree. This happens on Windows, when "sh" is the shell npm is set to use, and the script loops infinitely until terminated (such as 'tsc -w'). To make sure this is clear, it does NOT happen when running 'npm run' directly, only when run from execa.

I've created a simple tool to reproduce this (modified from Ciantic's work for issue #429 ):
https://github.com/ManiacDC/execa-npm-sh-issue

Please see the readme for instructions and expected results.

This may or may not be related to issue #429 , though it's definitely similar in ways.

@ManiacDC ManiacDC changed the title Windows: execa leaving processes around when npm scripts in a bash shell Windows: execa leaving processes around when running npm scripts in a bash shell Aug 14, 2020
@ManiacDC ManiacDC changed the title Windows: execa leaving processes around when running npm scripts in a bash shell Windows: execa leaving processes around when running npm scripts in a sh shell Aug 14, 2020
@ehmicky
Copy link
Collaborator

ehmicky commented Aug 14, 2020

Hi @ManiacDC,

Thanks for reporting this.

It seems that the problem is due to the windowsHide option.

npm is not a binary but a shell file. For example, if you were to call it like this:

const childProcess = require('child_process')
childProcess.spawn('npm', ['run', 'tsc'], { stdio: 'inherit' })

You would receive the following error message (on Windows only):

Error: spawn npm ENOENT

Instead, npm requires using cmd.exe or Bash to be loaded. The shell option does that. For example, the following code works:

childProcess.spawn('npm', ['run', 'tsc'], { stdio: 'inherit', shell: true })

(Note: If you specify a shell file (as opposed to a binary), on Windows, and do not use the shell option, Execa emulates that option for you (through the dependency node-cross-spawn) by basically running cmd /d /s /c /q "npm run tsc" instead of npm run tsc. But that's basically what Node.js does with the shell option under the hood.)

By default, Execa terminates the child process when its parent process terminates using something similar to:

const childProcess = require('child_process')
const spawned = childProcess.spawn('npm', ['run', 'tsc'], { stdio: 'inherit', shell: true })
process.on('exit', () => { spawned.kill() })

Note that there are four processes here: node -> cmd.exe -> npm -> tsc. Usually, on Windows, if the process is not detached, the descendant processes are terminated as well.

The problem in your case seems to be related to the windowsHide option (more details on that option here) which has a different default value than in core Node.js child_process (see #371). This seems to interfere with terminating the descendant processes. In particular, process.on('exit') and spawned.kill() get properly executed, but the descendant processes are not terminated when windowsHide is true (the default value in Execa).

Could you please try to pass the windowsHide: false option and see if this fixes your problem?

The problem can be reproduced with the following code (which overly simplifies what Execa does under the hood):

const childProcess = require('child_process')
const spawned = childProcess.spawn('npm', ['run', 'tsc'], { stdio: 'inherit', shell: true, windowsHide: true })
process.on('exit', () => { spawned.kill() })

The tsc script runs tsc -w (but any long-running npm script would probably do). When this file is run with node, then terminated with CTRL-C in a Windows console, the descendant processes are not terminated. This only happens when all of the following conditions are true:

  • a Windows console is used (either cmd.exe or Bash). Powershell works fine. Unix works fine.
  • windowsHide is true (the default value)
  • a shell command (non-binary) is being executed with the shell: true option

@sindresorhus What are your thoughts on this?

@ManiacDC
Copy link
Author

ManiacDC commented Aug 15, 2020

I should point out that lerna is uses shell: true, though I didn't think it made a difference since the issue occurred with or without it.

I tested with all 3 of those settings, and that did indeed fix the problem with my reproduction script.

I tried tweaking lerna to fix this briefly, but wasn't able. On Monday I'll spend some more time trying to incorporate this into a build of lerna to see if that addresses the issue there (it may be the version of execa its using - 1.0.0 - doesn't have the mentioned flags). Thanks so much for your help on this.

@raijinsetsu
Copy link

It seems weird to switch NodeJS's default value for windowHide. Is there a discussion on why this was done?

In regards to fixing this in dependent packages... Lerna, for example, calls execa in 57 different places, so changing windowHide back to false would be problematic. Would execa devs be open to something like:

execa.defaultOptions = {windowHide: false};

Then a small change here:

execa/index.js

Line 45 in 26d6b0d

windowsHide: true,

From

windowsHide: true,
...options

to

windowsHide: true,
...execa.defaultOptions,
...options

@ehmicky
Copy link
Collaborator

ehmicky commented Aug 15, 2020

I hear you, especially the fact that this is difficult to change when execa is called from dependencies.

Some additional issues and discussions about the windowsHide option:

While I see how changing the default of windowsHide back to Node.js default value (false) would solve the problems you are mentioning, I am wondering if this would create side effects for other users (which would then create issues to revert it back to true...).

I think (as a Linux user) I am missing some in-depth knowledge of the Windows API to make a good judgment here. I am also not sure if one of the Node.js bugs mentioned above is the actual source of the problem. My initial feeling would be to change the default value of windowsHide to false because this is the default value of Node.js child_process and because Windows users seem to encounter issues with the current value.

However, I would be interested to know @sindresorhus feedback on this, as he has more insights into it.

Note: I would personally discourage allowing customizing Execa default options with the intent to change how one's dependencies are using Execa. This would imply trying to change how one's dependency is using its own code/dependencies, which does not seem like a pattern we should encourage as it breaks encapsulation. That being said, there might be alternative solutions that would solve the same underlying problem, such as changing the default value of that option.

@raijinsetsu
Copy link

It's been a while, but the way I remember the Windows API to function is that use use CREATE_NO_WINDOWS when launching a console application from a non-console application to prevent the creation of a console window. When launching one console application from another, that does not create additional windows. This is confirmed by the option CREATE_NEW_CONSOLE which states: The new process has a new console, instead of inheriting its parent's console (the default).

The only issue is when PROCESS_DETACHED is specified because you cannot use that in conjunction with either CREATE_NO_WINDOW or CREATE_NEW_CONSOLE.

So, "windowsHide" would make sense for, as an example, Electron apps because they are not GUI applications. When running node from another console application, the option is meaningless.

By allowing a tool like Lerna to set the default option itself without having to specify it in every invocation, it allows those tools to expose that functionality to us. For example: "lerna --gui" might set the default to {windowsHode: true} so it can be invoked from GUI applications without creating a separate console window.

@sindresorhus
Copy link
Owner

Users will complain either way. If we change the default, it will be a disruptive and breaking change for many users. There is no clear win-win situation as far as I can see.

I don't use Windows and I'm honestly a bit burned out of having to spend so much time in general (for all my projects) working around Windows deficiencies.

@raijinsetsu
Copy link

raijinsetsu commented Aug 16, 2020 via email

@ManiacDC
Copy link
Author

I agree with @raijinsetsu 's suggestion. It allows applications to set the default for themselves, and the applications are really the only ones that can know the proper way to launch these new processes.

@ManiacDC
Copy link
Author

ManiacDC commented Aug 18, 2020

@ehmicky after some more testing, I realized that 'windowsHide: false' setting only fixes the "node test.js" case (which is NOT the case lerna falls into). The "sh -c 'node test.js'" and "./launch.sh" cases still leave the orphaned process around.
In those 2 cases, the only way I've found to get the orphaned process to go away to is set the stdin to "inherit". I'm not sure what other side-effects such a change would have.

Worth noting, "node test.js" launches the parent node.exe process with winpty.exe, but the other 2 launch the parent node.exe process with sh.exe. I'm wondering if this is a git-for-windows/msys2 issue at this point :/

@ehmicky
Copy link
Collaborator

ehmicky commented Aug 18, 2020

While the execa.defaultOptions solution does solve your problem, I think this would create problems for other users. For example, an application might use execa.defaultOptions to change the windowsHide option on one of its dependencies (like Lerna). But that would also apply for all its other dependencies, which might create unsolvable conflicts. In general, allowing a user to manipulate deep dependencies breaks encapsulation, and would create a new set of problems.

@sindresorhus I have been searching core Node.js issues about windowsHide to see if users were complaining about the windowsHide option defaulting to false in core Node.js. What I found:

  • The initial request is here and there
  • Core developers started suggesting changing the default value to true here and implemented here. Some users complained about it here (problem with embedders), here (problem with Node.js spawning GUI like Electron).
  • Discussions to revert to the default value to false were started here and eventually landed there. I see @sindresorhus you were strongly opposing this here

It looks like no users has opened any issues for changing the default value to true (this was initiated by a core developer instead). The few issues are more related to the problem with having it as true (for example, issues with SIGINT).

If I take @raijinsetsu comment:

It's been a while, but the way I remember the Windows API to function is that use use CREATE_NO_WINDOWS when launching a console application from a non-console application to prevent the creation of a console window. When launching one console application from another, that does not create additional windows. This is confirmed by the option CREATE_NEW_CONSOLE which states: The new process has a new console, instead of inheriting its parent's console (the default).
The only issue is when PROCESS_DETACHED is specified because you cannot use that in conjunction with either CREATE_NO_WINDOW or CREATE_NEW_CONSOLE.
So, "windowsHide" would make sense for, as an example, Electron apps because they are not GUI applications. When running node from another console application, the option is meaningless.

This would mean spawning a non-console process from another non-console process would not create a new console window, even if windowsHide is false, is that correct? Instead, this would only be a problem when spawning GUI processes.

The initial discussions in Node.js core were not mentioning the child processes termination problem, since that termination logic is an Execa-specific feature. To me, it seems that, when added to the points above, changing the default value of windowsHide to false is the lesser evil (although not ideal).

However, I think @sindresorhus you've got some good points above, so if the points above are still not convincing you, I'd suggest keeping windowsHide default value as is, and submit a PR to lerna for them to pass windowsHide: false instead.

@Ciantic
Copy link

Ciantic commented Sep 5, 2020

I suspect all the ways that does not use the Job Object in Windows are hacks that will fail under some circumstances.

Raymond Chen is old time Microsofter who explains how to clean up child processes correctly with Windows in this article. I'm pretty certain that any other way of doing it is not guaranteed by the Windows OS to kill the child processes. That is the canonical way to start child processes so that they are also killed if parent process dies.

pkaminski added a commit to pkaminski/snowpack that referenced this issue Sep 10, 2020
…t properly.

plugin-run-script and plugin-build script use execa.command with the shell: true
option set to execute the user's command.  On Windows, this causes the child process
to remain running after Snowpack exits: sindresorhus/execa#433.
The fix is to override the default windowsHide option -- as Snowpack always runs from
a console we don't actually need execa to hide any extra console windows for us.
FredKSchott pushed a commit to FredKSchott/snowpack that referenced this issue Sep 11, 2020
…t properly. (#1022)

plugin-run-script and plugin-build script use execa.command with the shell: true
option set to execute the user's command.  On Windows, this causes the child process
to remain running after Snowpack exits: sindresorhus/execa#433.
The fix is to override the default windowsHide option -- as Snowpack always runs from
a console we don't actually need execa to hide any extra console windows for us.
ehmicky pushed a commit to netlify/cli that referenced this issue Aug 31, 2021
`execa` defaults to `windowsHide: true`, which prevents child processes
from terminating child processes on exit [1].

Let's set `windowsHide: false` to terminate child processes properly on
exit.

[1]: sindresorhus/execa#433
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants