New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exceptions.throwIfFatal / throwIfJvmFatal should log fatal exceptions #3111
Comments
I think the logging of fatal errors would be the most straightforward solution. would it be enough of a step forward @elefeint ? wrapping the fatal exception isn't something that can currently easily be implemented without painstakingly exploring and refactoring the whole call tree of |
@simonbasle I don't think we should and can do anything here but logging. Another alternative can be adding something like @elefeint, it is worth mentioning that
which means that Also, it is worth mentioning that a fatal error may mean internal state corruption (because they can be thrown unexpectedly) so further communication with the chain could be impossible. Thus, wrapping sink calls makes no sense at all, especially calling |
I agree that logging has the highest benefit-to-effort. It would absolutely solve the biggest problem of people not knowing where to start looking for a problem. For the second part -- allowing sinks to recover, the amount of work and payoff are both less clear. This behavior is definitely documented in While it's true that errors like OOM are likely to leave the application in a bad state regardless of what reactor sinks do, right now the |
It is not even related to the application state, it is related to the internal state-machine implementation of the project reactor. We do not assume that everything should blow at any moment, we assume the majority of method calls that we do returns normally. If an error like OOM happens spuriously, we just bubble it up, and eventually, it is handled at some point (so can be logged). However, there is no guarantee that the place from which the error was propagated is also handled exception properly, etc. All that leads to the fact that some volatile or non-volatile states that we maintain may be corrupted because of that. After all, it may end up in the impossibility to progress further at all (e.g. sending other signals). |
I'll take that one and add logging to |
FluxSink
to terminate even for Throwable
/Error
This commit improves the discoverability of exceptions that are "fatal" to Reactor by: - adding `Exceptions.isFatal` and `Exceptions.isJvmFatal` methods - adding logging to `Exceptions.throwIfFatal` and `Exceptions.throwIfJvmFatal` just before a fatal exception is thrown This should at least help pinpointing such occurrences from the logs, in cases where the thrown exception bubble all the way up the stack of a thread with no uncaughtExceptionHandler for example. Fixes #3111.
It would be helpful to add more control over fatal exceptions. In my case, a large object was attempted to be loaded into memory which caused of memory exception, which was not fatal for application in general but only for a particular request that was too large. We didn't notice any errors, nothing was in the log and Flux just kept hanging forever, because throwIfFatal was thrown before a cancelation of the subscription. It was really hard to find the cause of why Flux was hanging forever. Since we consider that OutOfMemoryException is not fatal for us, now we need to find all the places in the code where we could potentially load a large object in memory, try-catch, and in case of OutOfMemoryException to convert to a nonfatal version of exception. Instead of just adding a log, I would propose to add a |
@schananas if you have pieces of code that are susceptible to OOM but you're confident you can recover from these, then you are the best suited to implement that. I'd argue only you can take the responsibility to wrap such code in a with the introduction of |
This commit improves the discoverability of exceptions that are "fatal" to Reactor by: - adding `Exceptions.isFatal` and `Exceptions.isJvmFatal` methods - adding logging to `Exceptions.throwIfFatal` and `Exceptions.throwIfJvmFatal` just before a fatal exception is thrown This should at least help pinpointing such occurrences from the logs, in cases where the thrown exception bubble all the way up the stack of a thread with no uncaughtExceptionHandler for example. Fixes #3111.
Related issues: #3036, #2521, #2917, #1118
The combination of unrecoverable
Error
subclasses and reactive/asynchronous applications can easily lead to applications being left in semi-working state with no visible errors being logged.Motivation
In an imperative, blocking application, an unrecoverable
Error
subclass would lead to the main application thread terminating with a stack trace. However, in reactive/asynchronous applications, theError
is often propagated up to a thread in some thread pool, which upon termination is peacefully returned to the pool. Unless some layer of the application catchesThrowable
s and logs the error, it will be invisible to the end users and application maintainers. On the other hand, Reactor sinks/subscriptions will be left in a "hanging" state -- neither working, nor terminated -- leading to hard to catch and hard to debug situations.The specific scenario that caused me to look into this was a minor library version incompatibility that resulted in
java.lang.NoSuchMethodError
being thrown in some stream mapping logic. While unrecoverable from both Java perspective (because it's anError
) and from ReactorthrowIfJvmFatal
perspective (because it's aLinkageErrror
), this type of issue tends to be isolated to a particular code path and perhaps does not need to lead to application shutdown. At minimum, it would be good to have consistent logging of unrecoverable errors.Desired solution
In order of the amount of work involved.
Exceptions.throwIfJvmFatal
.Error
into stream errors, it would be good if the downstream code could recover if desired. This would require thatFluxSink
is left in a state in which it's capable of terminating -- in thereactorFluxSinkUnrecoverableError()
test below, it would meansink.error(e);
results in the proper stream termination (although the exception passed in to the sink probably needs to be wrapped in something that won't re-triggerthrowIfJvmFatal
behavior).Considered alternatives
Leave everything as is; not handling
Error
subclasses works as intended. While true, it does lead to developers spending a lot of time troubleshooting -- see related issues. And that's only the people who did figure out what was going wrong! It's a valid option though.Additional context
The following two tests demonstrate the difference in behavior between normal exceptions being handled and converted to stream errors, and unrecoverable errors leading to a hanging stream.
The text was updated successfully, but these errors were encountered: