Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIGSEGV handler causes infinite loop if signal raised from within #25762

Closed
gireeshpunathil opened this issue Jan 28, 2019 · 9 comments
Closed
Labels
lib / src Issues and PRs related to general changes in the lib or src directory. process Issues and PRs related to the process subsystem.

Comments

@gireeshpunathil
Copy link
Member

process.on('SIGSEGV', (s) => {
  console.log(`program crashed with ${signum}`)
})
// some code that causes node to hard crash, not abort (such as *(int *) 0 = 0;)

this causes node to loop tight with high CPU. On the first look I was thinking it keeps calling the handler but that was not the case.

in libuv, the signals (internal or external) are intercepted in uv__signal_handler .

  • After all the processing, it returns from the signal stack, which causes the faulty instruction to be re-entered, leading to the tight loop.
  • it writes to the signal's pipefd, but the thread is never able to reach there to process it and invoke the callback.

These questions come to my mind:

  • Is it meaningful to resume to the execution context on internal fatal signals?
  • If yes, where should it resume?
  • If no, what it should do?

IMO the embedder should decide / control these?

@nodejs/libuv @nodejs/process

@addaleax
Copy link
Member

Some prior context: #8410 and #8601

@gireeshpunathil
Copy link
Member Author

thanks @addaleax for the additional context. Unfortunately I am unable to acknowledge that the description in the doc to be sufficient here - even if we take out the JS handler from the picture, any consumer that embeds libuv, installs a handler for SIGSEGV and succumbs to it will tight loop in the same manner.

IMO internal segfaults (and its family) should be handled synchronously. Asynchronous handling looks to be neither feasible, nor meaningful.

there is a longer version of the signal handler callback that provides additional context of the signal source and the execution context. IMO that may be used to figure out the signal's background and if it is internal, the callback should be called in line (first before the running through the chain), and the decision on the resumption of execution should be controllable by the embedder.

At present no code is at risk, but the problem determination of hard crashes become difficult with the current design.

@addaleax
Copy link
Member

@gireeshpunathil Just to clarify, I don’t think it’s possible to do anything meaningful with JS code here – we can’t call that from a signal handler.

@gireeshpunathil
Copy link
Member Author

@addaleax - I agree. I was referring to the second level callback that is registered by the embedder - such as a C++ OnSigSegv from Node. Potentially it could collect a report for example?

@sam-github
Copy link
Contributor

@gireeshpunathil you've seen the docs, right?

http://docs.libuv.org/en/v1.x/signal.html

Watchers for other signals can be successfully created, but these signals are never received. These signals are: SIGILL, SIGABRT, SIGFPE, SIGSEGV, SIGTERM and SIGKILL.

@gireeshpunathil
Copy link
Member Author

@sam-github - aren't those windows specific caveats?

I would like to consider this is as node bug: Exposing a JS API (process.on('signal'...); but not registering a handler of its own, and relying on libuv's; which rightly attempts to delegate the handler chains (but none exists) and eventually returns back to the execution context that restarts the cycle. But want to seek consensus.

A way to address this:

  • Node installs a generic handler before it initializes libuv
  • so that libuv remembers this and calls when it receives signal
  • within node's handler we could detect internal segfaults, and
    • invoke JS callback if one is present
    • do reporting etc. if that is enabled
  • decide what to do next: for fatal faults, abort.
  • alternatively invoke node::OnFatalError

Examples of the class of issues this could potentially cover as #25814 and #25512

@sam-github
Copy link
Contributor

Sorry, yes, wrong text. Similar text in node.js docs, though:

'SIGBUS', 'SIGFPE', 'SIGSEGV' and 'SIGILL', when not raised artificially using kill(2), inherently leave the process in a state from which it is not safe to attempt to call JS listeners. Doing so might lead to the process hanging in an endless loop, since listeners attached using process.on() are called asynchronously and therefore unable to correct the underlying problem.

^--- node docs, not windows specific. If its documented behaviour, its not formally a bug.

Is it meaningful to resume to the execution context on internal fatal signals?

I think not. Sync signals like the above cannot be ignored or fixed, hoping to continue on executing js under these conditions seems unreasonable.

@gireeshpunathil
Copy link
Member Author

@sam-github - agree that the current behavior is documented. However, for me it looks like an explanation to or a warning about an un-natural behavior; does not touch upon:

  • what is the recommended way to deal with segfaults
  • what flags / options exists to capture segfaults
  • what differentiates fatalerror (that v8 and node captures) with segfaults

Just to re-iterate:

  • JS callback on machine check exception is not what I desire, the code in the description is only for demonstration
  • embedders should receive synchronous signals synchronously
  • embedders should have a say on what to do with the signals

thoughts?

gireeshpunathil added a commit to gireeshpunathil/node that referenced this issue Feb 4, 2019
`--diagnostic-report-on-fatalerror` does not capture internal
segfaults at present, instead limited to only faults that are
detected by the Node / v8. libuv's signal handler deliveres
result asynchronously which does not help for SIGSEGV as:
i) the synchronous sequence restarts the failing code,
ii) due to i) the callback is never delivered to the consumer.

Install a SIGSEGV handler in Node and follow the sequence of
fatalerror handling, as of now at least under report.

Fixes: nodejs#25762
@jasnell jasnell added lib / src Issues and PRs related to general changes in the lib or src directory. process Issues and PRs related to the process subsystem. labels Jun 26, 2020
@bnoordhuis
Copy link
Member

This issue has been open for years but I don't think it's actionable (it just is what it is) so I'm going to go ahead and close it.

@bnoordhuis bnoordhuis closed this as not planned Won't fix, can't repro, duplicate, stale Feb 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lib / src Issues and PRs related to general changes in the lib or src directory. process Issues and PRs related to the process subsystem.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants