Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot share RBTree between multiple threads #47

Open
jynnantonix opened this issue Mar 30, 2020 · 13 comments
Open

Cannot share RBTree between multiple threads #47

jynnantonix opened this issue Mar 30, 2020 · 13 comments

Comments

@jynnantonix
Copy link
Contributor

jynnantonix commented Mar 30, 2020

Hello,

I'm trying to have multiple threads access a shared RBTree instance. Here's a small reproduction of the problem:

use std::sync::{Arc, Mutex};
use std::thread;

use intrusive_collections::{KeyAdapter, intrusive_adapter};
use intrusive_collections::rbtree::*;
use rand::random;

struct MyData {
    link: Link,
    key: u64,
    value: u64,
}

impl MyData {
    fn new(key: u64, value: u64) -> Arc<MyData> {
        Arc::new(MyData {
            link: Link::new(),
            key,
            value,
        })
    }
}

intrusive_adapter!(MyAdapter = Arc<MyData>: MyData { link: Link });

impl<'a> KeyAdapter<'a> for MyAdapter {
    type Key = u64;
    fn get_key(&self, data: &'a MyData) -> Self::Key {
        data.key
    }
}


fn main() {
    let tree = Arc::new(Mutex::new(RBTree::new(MyAdapter::new())));
    for _ in 0..30 {
        let key = random();
        let value = random();
        tree.lock().unwrap().entry(&key).or_insert_with(|| MyData::new(key, value));
    }

    const THREADS: usize = 7;
    let mut threads = Vec::with_capacity(THREADS);
    for _ in 0..THREADS {
        let tree = tree.clone();
        threads.push(thread::spawn(move || {
            for _ in 0..10 {
                let k = random();
                if let Some(data) = tree.lock().unwrap().find(&k).clone_pointer() {
                    println!("found key = {}, value = {}", data.key, data.value);
                }
            }
        }));
    }

    for t in threads.into_iter() {
        t.join().unwrap();
    }
}

This fails to compile with:

error[E0277]: `std::cell::Cell<intrusive_collections::rbtree::NodePtr>` cannot be shared between threads safely
   --> src/intrusive.rs:46:22
    |
46  |         threads.push(thread::spawn(move || {
    |                      ^^^^^^^^^^^^^ `std::cell::Cell<intrusive_collections::rbtree::NodePtr>` cannot be shared between threads safely
    |
    = help: within `MyData`, the trait `std::marker::Sync` is not implemented for `std::cell::Cell<intrusive_collections::rbtree::NodePtr>`
    = note: required because it appears within the type `intrusive_collections::rbtree::Link`
    = note: required because it appears within the type `MyData`
    = note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Arc<MyData>`
    = note: required because of the requirements on the impl of `std::marker::Send` for `intrusive_collections::rbtree::RBTree<MyAdapter>`
    = note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Mutex<intrusive_collections::rbtree::RBTree<MyAdapter>>`
    = note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Arc<std::sync::Mutex<intrusive_collections::rbtree::RBTree<MyAdapter>>>`
    = note: required because it appears within the type `[closure@src/intrusive.rs:46:36: 53:10 tree:std::sync::Arc<std::sync::Mutex<intrusive_collections::rbtree::RBTree<MyAdapter>>>]`

error[E0277]: `std::cell::Cell<usize>` cannot be shared between threads safely
   --> src/intrusive.rs:46:22
    |
46  |         threads.push(thread::spawn(move || {
    |                      ^^^^^^^^^^^^^ `std::cell::Cell<usize>` cannot be shared between threads safely
    |
    = help: within `MyData`, the trait `std::marker::Sync` is not implemented for `std::cell::Cell<usize>`
    = note: required because it appears within the type `intrusive_collections::rbtree::Link`
    = note: required because it appears within the type `MyData`
    = note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Arc<MyData>`
    = note: required because of the requirements on the impl of `std::marker::Send` for `intrusive_collections::rbtree::RBTree<MyAdapter>`
    = note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Mutex<intrusive_collections::rbtree::RBTree<MyAdapter>>`
    = note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Arc<std::sync::Mutex<intrusive_collections::rbtree::RBTree<MyAdapter>>>`
    = note: required because it appears within the type `[closure@src/intrusive.rs:46:36: 53:10 tree:std::sync::Arc<std::sync::Mutex<intrusive_collections::rbtree::RBTree<MyAdapter>>>]`

This appears to be because rbtree::Link implements Send but not Sync. I tried to work around this by using Arc<Mutex<MyData>> but then I got stuck on the implementation of MyAdapter: implementing get_link requires you to acquire the lock but there's no way to ensure that the lock is held (and then released) while the link is mutated.

I wonder if it's actually fine to implement Sync for rbtree::Link. As far as I can tell every safe method that modifies a Link requires a mutable reference to the tree first. I think it might require changing the is_linked method to be unsafe though.

Is there something I'm just missing to make this work?

@Amanieu
Copy link
Owner

Amanieu commented Mar 30, 2020

This is currently not possible because RBTree relies on is_linked for safety: it must check if an element is already linked before it can be inserted. Otherwise you could end up with two threads inserting the same element at the same time into different trees.

@jynnantonix
Copy link
Contributor Author

What about something like an AtomicLink? Then users who want multi-thread support can opt into it by using AtomicLink while everyone else keeps using the regular Link.

@Amanieu
Copy link
Owner

Amanieu commented Mar 30, 2020

Sure that would be possible. However you'll need to wait for #46 to land since custom Link types are not yet supported.

@glebpom
Copy link

glebpom commented Apr 9, 2020

Hi. It looks like Generic Reference Types are now merged. Would it be possible to share some example on how to make intrusive-collections thread-safe?

@Amanieu
Copy link
Owner

Amanieu commented Apr 9, 2020

You need to create an AtomicLink type and implement DefaultLinkOps for it. Then create an AtomicLinkOps empty struct and implement LinkOps and rbtree::LinkOps for it.

@glebpom
Copy link

glebpom commented Apr 9, 2020

I think I made it work. Please take a look here. Does it look correct? I'm not sure if it's ok to have two separate AtomicCells in AtomicLink, or they should be protected by a single Sync primitive? Can this approach introduce some sort of race-condition? Thanks.

@Amanieu
Copy link
Owner

Amanieu commented Apr 10, 2020

Ideally it should look like this:

struct AtomicLink {
    locked: AtomicBool,
    next: Cell<Option<NonNull<AtomicLink>>>,
    prev: Cell<Option<NonNull<AtomicLink>>>,
}
unsafe impl Sync for AtomicLink {}

The key point is that the link is locked when it is inserted into the intrusive collection and unlocked when it is removed. When the lock is held, accessing prev and next does not require any synchronization.

However our current LinkOps API isn't currently suited for that. I will make a PR later today to improve it.

@glebpom
Copy link

glebpom commented Apr 13, 2020

@Amanieu does it look correct now?

unsafe impl LinkOps for AtomicLinkOps {
    type LinkPtr = NonNull<AtomicLink>;

    #[inline]
    unsafe fn acquire_link(&mut self, ptr: Self::LinkPtr) -> bool {
        let r = ptr.as_ref();

        if r.locked.swap(true, Ordering::SeqCst) {
            // already locked
            return false;
        }

        assert!(!r.is_linked());

        r.next.set(None);

        true

    }

    #[inline]
    unsafe fn release_link(&mut self, ptr: Self::LinkPtr) {
        let r = ptr.as_ref();

        r.next.set(UNLINKED_MARKER);

        assert!(r.locked.swap(false, Ordering::SeqCst));
    }
}

@Amanieu
Copy link
Owner

Amanieu commented Apr 14, 2020

  • You can use Ordering::Acquire and Ordering::Release.
  • is_linked should return locked.load(Ordering::Relaxed).
  • You don't need to set next to None since you're already tracking the link state through locked.

@glebpom
Copy link

glebpom commented Apr 14, 2020

  • Release/Acquire becomes Relaxed for load in combined operations. Did I get it right, that you suggest using Acquire in swap in acquire_link, and Release in swap in release_link? Would AcqRel be the right option here?
  • Is there a real requirement to check if the link was actually locked in release_link. If not, can we just use store(false, Ordering::Relaxed)?
  • I'm wondering what are the consequences of using Relaxed in is_linked? I see that is_linked is used only in Debug implementation, and it's not the part of any trait implementation. Wouldn't it make sense to use SeqCst is is_linked and Relaxed load in Debug implementation?
  • Is it required to set UNLINKED_MARKER in release_link in this case? Seems like it's not required

@Amanieu
Copy link
Owner

Amanieu commented Apr 14, 2020

  • Release/Acquire becomes Relaxed for load in combined operations. Did I get it right, that you suggest using Acquire in swap in acquire_link, and Release in swap in release_link? Would AcqRel be the right option here?
  • Is there a real requirement to check if the link was actually locked in release_link. If not, can we just use store(false, Ordering::Relaxed)?

You should use swap(true, Acquire) in acquire_link and store(false, Release) in release_link.

  • I'm wondering what are the consequences of using Relaxed in is_linked? I see that is_linked is used only in Debug implementation, and it's not the part of any trait implementation. Wouldn't it make sense to use SeqCst is is_linked and Relaxed load in Debug implementation?

I guess you can leave is_linked as SeqCst since it is not used in the hot path.

  • Is it required to set UNLINKED_MARKER in release_link in this case? Seems like it's not required.

No it's not required since the link state is tracked separately in locked.

@glebpom
Copy link

glebpom commented Apr 14, 2020

Thanks. Updated the gist with your recommendations. Would you like the PR with AtomicLink implemented?

@Amanieu
Copy link
Owner

Amanieu commented Apr 14, 2020

Thanks but I don't think it is a good fit for the intrusive-collections crate. It is better for you to keep the implementation separate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants