Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System::stop_on_panic is removed. #80

Open
mikhailOK opened this issue Dec 20, 2019 · 10 comments
Open

System::stop_on_panic is removed. #80

mikhailOK opened this issue Dec 20, 2019 · 10 comments

Comments

@mikhailOK
Copy link

I'm using actix-rt 1.0.0 through actix 0.9.0 and observing that system does not stop on panic.
It seems like task panics get caught by tokio runtime 0.2.6.

Relevant tokio issue: tokio-rs/tokio#2002

It also seems like actix-rt 0.2 always stopped on panic due to tokio 0.1 not catching.

My current workaround is to set a panic hook that sends a stop signal to current system, so another question: is it possible to make System::with_current do nothing if current system is not set? System::current() and System::with_current require the caller to guarantee it's set.

@XX
Copy link

XX commented Jan 10, 2020

@mikhailOK How do you emulate the stop on panic with panic hook?

@mikhailOK
Copy link
Author

@XX currently this way:

        let default_hook = std::panic::take_hook();
        std::panic::set_hook(Box::new(move |info| {
            actix::System::with_current(|sys| sys.stop_with_code(1));
            default_hook(info);
        }));

@adwhit
Copy link

adwhit commented Aug 3, 2020

I dug into this a little bit. It seems that actix-server code still assumes that a worker can crash (i.e. uncaught panic) and it has a recovery mechanism in place; whereas with the newer tokio that is not really possible.

See for example here:

ServerCommand::WorkerFaulted(idx) => {
let mut found = false;
for i in 0..self.workers.len() {
if self.workers[i].0 == idx {
self.workers.swap_remove(i);
found = true;
break;
}
}
if found {
error!("Worker has died {:?}, restarting", idx);
let mut new_idx = self.workers.len();
'found: loop {
for i in 0..self.workers.len() {
if self.workers[i].0 == new_idx {
new_idx += 1;
continue 'found;
}
}
break;
}
let worker = self.start_worker(new_idx, self.accept.get_notify());
self.workers.push((new_idx, worker.clone()));
self.accept.send(Command::Worker(worker));
}
}

and here:
Err(tmp) => {
self.srv.worker_faulted(self.workers[self.next].idx);
msg = tmp;
self.workers.swap_remove(self.next);
if self.workers.is_empty() {
error!("No workers");
self.backpressure(true);
return;
} else if self.workers.len() <= self.next {
self.next = 0;
}
continue;
}

tokio do not seem to want to change the current behaviour (mandatory catching of panics within tokio) so we should consider a refactor to reflect that.

@fakeshadow
Copy link
Contributor

use panic = "abort" if you want to exit on panic. tokio::spawn catch panic the same way as std::thread::spawn

actix-server can be used stand alone and you are free to make worker panic and restart. So worker restart feature is here to stay and remove it is not accepted.

@mikhailOK
Copy link
Author

Seems like #255 removed the stop_on_panic method from System.

panic = "abort" is not always usable because it does not run destructors.
std::thread::spawn returns a JoinHandle which allows the code to handle the panic.
Please consider a feature to handle panic behavior.

@fakeshadow
Copy link
Contributor

Sorry that's outside the scope of actix-server.

You can add your own panic handle by using std::panic module or use tokio::spawn which give you a joinhandle.

@mikhailOK
Copy link
Author

Would panic handling functionality make sense for Actor then, considering Actor::start swallows the JoinHandle?
(This is outside of actix-rt, but originally stop_on_panic was here)

@fakeshadow
Copy link
Contributor

Would panic handling functionality make sense for Actor then, considering Actor::start swallows the JoinHandle?
(This is outside of actix-rt, but originally stop_on_panic was here)

actix-rt and actix-server are different crates. And I was mostly explain how you can make actix-server worker panic and cause restart.

I did not make the change to stop_on_panic but since it's removed it would have not effect for sure.
I can re-open the issue if you want it back or feel free to make new issue for talking about that.

@fakeshadow fakeshadow reopened this May 1, 2021
@fakeshadow fakeshadow changed the title System::stop_on_panic has no effect System::stop_on_panic is removed. May 1, 2021
@fakeshadow
Copy link
Contributor

fakeshadow commented May 1, 2021

@mikhailOK I edited the title of issue and re-opened. Hope you don't mind as we need it back to implement actual panic stop.

@fakeshadow fakeshadow removed the bug Something isn't working label May 1, 2021
@PhotonQuantum
Copy link

I walked into the exact problem, and when I ported the deleted snippet in #255 to actix 0.12.0 (here), it worked. Not sure why stop_on_panic is removed.

I also tested this idea on actix-web 4.0.0-beta.8. The program didn't terminate, but new service workers are constructed to accept new connections. I believe this is the expected behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants