Spawn_blocking closures non-deterministically fail when runtime is dropping tasks #4834

ekzhang · 2022-07-13T14:22:40Z

Version v1.19.2 (minimal reproduction)

Platform Darwin Kernel Version 21.5.0 ARM64. I'm running on an M1 Pro processor with 10 logical cores.

Description The documentation for task::spawn_blocking says the following:

Closures spawned using spawn_blocking cannot be cancelled. When you shut down the executor, it will wait indefinitely for all blocking operations to finish.

However, in some cases when adding a spawn_blocking call to the Drop implementation of a structure, the blocking call does not execute.

I tried this code:

use std::{thread, time::Duration};
use tokio::task;
use tokio::time;

struct A;

impl Drop for A {
    fn drop(&mut self) {
        println!("Dropping A");
        // thread::sleep(Duration::from_secs(1));
        task::spawn_blocking(|| {
            println!("Inside A blocking");
            thread::sleep(Duration::from_secs(1));
            println!("finished A blocking");
        });
    }
}

#[tokio::main]
async fn main() {
    let a = A;
    tokio::spawn(async {
        time::sleep(Duration::from_secs(1)).await;
        drop(a);
    });
    println!("finished!");
}

When I run this code, it sometimes runs the blocking closure and sometimes does not. This is inconsistent between successive runs, even without recompiling the code. For example, I just ran it 5 times and pasted the terminal output below. In runs 1, 2, and 5, it doesn't run the blocking closure. In runs 3 and 4, it does. In no case does the runtime panic or otherwise show any sign of failure.

$  tokidrop git:(main) ✗ cargo run --release
   Compiling tokidrop v0.1.0 (/Users/ezhang/Documents/temp/tokidrop)
    Finished release [optimized] target(s) in 0.40s
     Running `target/release/tokidrop`
finished!
Dropping A
$  tokidrop git:(main) ✗ cargo run --release
    Finished release [optimized] target(s) in 0.02s
     Running `target/release/tokidrop`
finished!
Dropping A
$  tokidrop git:(main) ✗ cargo run --release
    Finished release [optimized] target(s) in 0.02s
     Running `target/release/tokidrop`
finished!
Dropping A
Inside A blocking
finished A blocking
$  tokidrop git:(main) ✗ cargo run --release
    Finished release [optimized] target(s) in 0.02s
     Running `target/release/tokidrop`
finished!
Dropping A
Inside A blocking
finished A blocking
$  tokidrop git:(main) ✗ cargo run --release
    Finished release [optimized] target(s) in 0.02s
     Running `target/release/tokidrop`
finished!
Dropping A

I expected to see this happen: The executor would schedule and wait indefinitely for the spawn_blocking tasks to finish, as described in the quoted documentation section. Or perhaps panic if this usage of spawn_blocking isn't valid, rather than silently fail? It's also a little bit surprising that the behavior is non-deterministic.

For completeness, here are some variations and the behavior I noticed but don't know how to explain, in case they are helpful:

If I uncomment line 10 in the snippet above (// thread::sleep(Duration::from_secs(1));), then the spawn_blocking closure never runs, rather than running only sometimes.
If I remove the tokio::spawn call that places A in a runtime task, then A is dropped at the end of the main function instead of by the runtime internally, and the blocking closure always executes to completion, printing Inside A blocking\nfinished A blocking as I would expect.
If I run the original code on the Rust playground, I've never been able to get the "Inside A blocking" line to print. I'm guessing the non-determinism has to do with my personal computer having more logical CPUs or being faster?

Thank you!

Darksonn · 2022-07-13T14:41:35Z

This is fixed by the (currently unreleased) PR #4811.

Darksonn · 2022-07-13T14:43:58Z

(The answer to what's going on is that they can be cancelled if they have not already started running.)

ekzhang · 2022-07-13T14:49:00Z

Thanks for explaining and sharing the really helpful docs! In that case it seems like putting spawn_blocking() in destructors will not guarantee that the code is run when tasks are dropped on runtime shutdown, so I will need to find another way to do cleanup for my use case.

Noah-Kennedy · 2022-07-13T14:52:14Z

@ekzhang what exactly do you need to do in the spawn_blocking for your use case?

cgwalters · 2022-07-13T15:02:44Z

You could just use std::thread::spawn(), no?

ekzhang · 2022-07-13T15:04:12Z

@Noah-Kennedy The basic example is that I was trying to delete a TempDir from the tempfile crate, and I put that code in a spawn_blocking closure since the Drop destructor for TempDir does a blocking file system operation (removing a directory tree recursively).

This happens inside a spawned task, several function calls deep, where one of the functions uses a TempDir. Since the task can be canceled, I would like to make sure that the TempDir actually gets deleted.

Another example in my specific case, besides FS resources, is a file system mount (OverlayFS and FUSE), which needs to be unmounted with the umount2 system call or fusermount, as well as cgroup resources that need to be managed. All of these are resources that need to be cleaned up consistently or they will leak memory + OS resources.

@cgwalters I don't think thread::spawn() in a destructor will work for this case because it has the same problem: when main exits, it doesn't wait for all threads to join; they get stopped abruptly.

Thanks a lot for the help though!! I understand that this isn't in scope for the executor and can try to write this part of my system without using Tokio.
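
For the thread::spawn() limitation described above (main does not wait for detached threads), one possible workaround is to keep every JoinHandle and join it before main returns. Below is a minimal std-only sketch under that assumption. CleanupRegistry, Resource, and run_demo are hypothetical names made up for illustration; they are not part of Tokio or the standard library, and the sleep is only a stand-in for real cleanup work.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical registry that remembers every cleanup thread it starts.
#[derive(Clone, Default)]
struct CleanupRegistry {
    handles: Arc<Mutex<Vec<JoinHandle<()>>>>,
}

impl CleanupRegistry {
    /// Run `f` on a new OS thread and store its handle for a later join.
    fn spawn<F: FnOnce() + Send + 'static>(&self, f: F) {
        let handle = thread::spawn(f);
        self.handles.lock().unwrap().push(handle);
    }

    /// Block until every registered cleanup thread has finished.
    /// Returns the number of threads that were joined.
    fn join_all(&self) -> usize {
        // Take the handles out first so the lock is not held while joining.
        let handles: Vec<JoinHandle<()>> = std::mem::take(&mut *self.handles.lock().unwrap());
        let joined = handles.len();
        for handle in handles {
            handle.join().expect("cleanup thread panicked");
        }
        joined
    }
}

/// Hypothetical resource whose destructor needs slow, blocking cleanup.
struct Resource {
    registry: CleanupRegistry,
    cleaned: Arc<AtomicUsize>,
}

impl Drop for Resource {
    fn drop(&mut self) {
        let cleaned = Arc::clone(&self.cleaned);
        self.registry.spawn(move || {
            // Stand-in for blocking work such as removing a directory tree.
            thread::sleep(Duration::from_millis(50));
            cleaned.fetch_add(1, Ordering::SeqCst);
        });
    }
}

/// Drops one resource, waits for its cleanup, and reports how many cleanups ran.
fn run_demo() -> usize {
    let registry = CleanupRegistry::default();
    let cleaned = Arc::new(AtomicUsize::new(0));
    drop(Resource {
        registry: registry.clone(),
        cleaned: Arc::clone(&cleaned),
    });
    // Without this call, main could return before the cleanup thread finishes.
    registry.join_all();
    cleaned.load(Ordering::SeqCst)
}

fn main() {
    println!("cleanups completed: {}", run_demo());
}
```

This only helps when the program has a controlled shutdown path that reaches join_all, and only for resources dropped before that call. It does not cover an abrupt process exit.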

cgwalters · 2022-07-13T15:11:54Z

The basic example is that I was trying to delete a TempDir from the tempfile crate,

I have an opinion on this one: https://internals.rust-lang.org/t/should-rust-programs-unwind-on-sigint/13800/11

Noah-Kennedy · 2022-07-13T15:22:48Z

Ugh, this is a case which has bitten me before. In a lot of these cases, you can get away with performing the calls inline, since they are generally fast enough to be effectively non-blocking. That might actually be your best option, possibly combined with block_in_place (although this approach certainly has its problems).
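
The "perform the calls inline" suggestion can be sketched with the standard library alone. The example below is an illustration under assumptions, not the tempfile crate's implementation: ScratchDir and run_inline_cleanup are hypothetical names, and the sketch does not use block_in_place. The destructor removes the directory synchronously, so the cleanup does not depend on a task being scheduled later.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical guard that owns a scratch directory and removes it on drop.
struct ScratchDir {
    path: PathBuf,
}

impl ScratchDir {
    /// Create a directory under the system temp dir, named after this process id.
    fn create(name: &str) -> io::Result<Self> {
        let path = std::env::temp_dir().join(format!("{}-{}", name, std::process::id()));
        fs::create_dir_all(&path)?;
        Ok(ScratchDir { path })
    }

    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Runs synchronously on whichever thread drops the guard, so it
        // briefly blocks that thread instead of scheduling work for later.
        if let Err(err) = fs::remove_dir_all(&self.path) {
            eprintln!("failed to remove {}: {}", self.path.display(), err);
        }
    }
}

/// Creates a scratch directory, drops the guard, and reports whether it is gone.
fn run_inline_cleanup() -> io::Result<bool> {
    let dir = ScratchDir::create("inline-cleanup-demo")?;
    fs::write(dir.path().join("data.txt"), b"scratch")?;
    let path = dir.path().to_path_buf();
    drop(dir);
    Ok(!path.exists())
}

fn main() -> io::Result<()> {
    println!("directory removed: {}", run_inline_cleanup()?);
    Ok(())
}
```

The trade-off is that a slow file system call stalls the thread running the destructor, which is the cost the comment above alludes to.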

ekzhang added A-tokio Area: The main tokio crate C-bug Category: This is a bug. labels Jul 13, 2022

Darksonn added T-docs Topic: documentation M-task Module: tokio/task labels Jul 13, 2022

Darksonn closed this as completed Jul 13, 2022
