
In run_process, what should happen if the deliver_cancel function raises an exception? #1532

Open
njsmith opened this issue May 18, 2020 · 7 comments · May be fixed by #1555

Comments


njsmith commented May 18, 2020

What should happen if deliver_cancel raises an exception? In the current implementation, the shielded cancel scope will prevent it from propagating until the process exits, which might take a while if the crash occurred before signaling the process in any way. Maybe on exception from a user-specified deliver_cancel we should call the default deliver_cancel to kill the process? Or just kill() since we don't care much about polish at that point, only about letting the user know what they need to fix.

Originally posted by @oremanj in #1525

@njsmith njsmith changed the title What should happen if deliver_cancel raises an exception? In run_process, what should happen if the deliver_cancel function raises an exception? May 19, 2020

njsmith commented May 19, 2020

Yeah, this is tricky. On the one hand, you kind of have to report the failure, b/c "program hangs silently" is terrible UX. On the other hand, you kind of can't report the failure, b/c run_process hasn't finished yet.

That's why I ended up putting warnings.warn calls in the default deliver_cancel – it gives the user an immediate heads-up, even though we can't propagate the error properly.

Maybe we should pull the try: ... except: warnings.warn(...) out of the default deliver_cancel, and put it into the code that calls deliver_cancel instead?

I guess for exceptions out of arbitrary user code, it would be nice to include the full traceback. That's an awkward fit for the warnings machinery though.

In a few other places where we have this kind of problem (instrument failures, EMFILE in serve_listeners), we use logging to dump the warning onto the console, with the idea being that by default the user sees it, and at least it's hookable if you want to do something else in production. Maybe we should do the same here? try: await deliver_cancel(proc) except: L = logging.getLogger("trio.run_process"); L.exception(...); raise?
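As a rough sketch of that logging idea (using a hypothetical `guarded_deliver_cancel` wrapper name, and plain asyncio in place of trio so the snippet is self-contained and runnable):

```python
import asyncio
import logging

LOGGER = logging.getLogger("trio.run_process")

async def guarded_deliver_cancel(deliver_cancel, proc):
    # Call the (possibly user-supplied) deliver_cancel; on any failure,
    # log the full traceback immediately so the user sees why we would
    # still be waiting on the process, then re-raise so the error can
    # propagate once the process actually exits.
    try:
        await deliver_cancel(proc)
    except BaseException:
        LOGGER.exception(
            "deliver_cancel raised; still waiting for process %r to exit", proc
        )
        raise

async def broken_deliver_cancel(proc):
    # Stand-in for user code that crashes before signaling the process.
    raise OSError("whoops!")

async def demo():
    try:
        await guarded_deliver_cancel(broken_deliver_cancel, None)
    except OSError as exc:
        return str(exc)

logging.basicConfig(level=logging.ERROR)
result = asyncio.run(demo())
print(result)
```

The key point is that the traceback hits the log right away, while the exception itself is still re-raised for later propagation.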


oremanj commented May 19, 2020

Logging the exception, and still waiting for the process to exit, seems like a good compromise to me. The log message should explicitly say that it's still waiting for the process to exit, so that the reason for the apparent deadlock is as obvious as we can make it.


guilledk commented May 21, 2020

Hey guys, so I'm gonna take a crack at this issue.
If I got this right, the idea would be a change like moving the try: .. except to trio._subprocess.run_process.open_process:

try:
    await proc.wait()
except trio.Cancelled:
    with trio.CancelScope(shield=True):
        killer_cscope = trio.CancelScope(shield=True)

        async def killer():
            with killer_cscope:
                try:  # here is the new try except
                    await deliver_cancel(proc)
                except OSError as exc:
                    warnings.warn(
                        RuntimeWarning(
                            f"tried to kill process {proc!r}, but failed with: {exc!r}"
                        )
                    )

        nursery.start_soon(killer)
        await proc.wait()
        killer_cscope.cancel()
        raise

Also, what is a good way to trigger an OSError to test this? Just closing the process manually while trio thinks it is still open would probably work, but maybe there is a cleaner way.


njsmith commented May 21, 2020

If I got this right, the idea would be a change like moving the try: .. except to trio._subprocess.run_process.open_process:

You mean trio._subprocess.run_process.killer, right?

Also what is a good way to trigger an OSError to test this?

You can also just do:

async def custom_deliver_cancel(proc):
    raise OSError("whoops!")

Also:

  • You'll want to catch BaseException and then re-raise
  • We'll want to switch to using logging instead of warnings. For an example of what this might look like, check out trio/_highlevel_serve_listeners.py and how LOGGER is used there.

@guilledk

Sorry, can you expand on the BaseException point?


njsmith commented May 23, 2020

@guilledk In your comment, you had try/except OSError. For the original version where we were just catching errors from the default deliver_cancel, that was fine, because we knew OSError was the only thing that could be raised. But now we'll be catching errors from user code, so we want to instead do try/except BaseException. And then afterwards, we want to re-raise the exception.
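To illustrate the difference (a minimal sketch with hypothetical names, again using plain asyncio rather than trio): except OSError would let a ValueError from user code escape the handler unlogged, while except BaseException plus a bare raise both reports it and still propagates the original exception:

```python
import asyncio

async def user_deliver_cancel(proc):
    # Arbitrary user code: it can raise anything, not just OSError.
    raise ValueError("bug in user code")

async def killer(proc):
    # Catch BaseException so nothing escapes unreported, then re-raise
    # so the caller still sees the original exception.
    try:
        await user_deliver_cancel(proc)
    except BaseException as exc:
        print(f"would log: {exc!r}")
        raise

async def demo():
    try:
        await killer(None)
    except ValueError as exc:
        return repr(exc)

result = asyncio.run(demo())
print(result)
```

With except OSError here, the ValueError would bypass the handler entirely, so nothing would be logged before it unwound the killer task.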

@guilledk

Oh ok got it
