
[strict provenance] Close The Gap Between AtomicUsize and AtomicPtr #95492

Closed
Gankra opened this issue Mar 30, 2022 · 13 comments · Fixed by #96935
Labels
A-strict-provenance Area: Strict provenance for raw pointers
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@Gankra
Contributor

Gankra commented Mar 30, 2022

This issue is part of the Strict Provenance Experiment - #95228

As of this writing, AtomicUsize has many more operations than AtomicPtr, like fetch_add, which is a problem if we want people to conform to strict provenance and admit when pointers are pointers.

All the cases I found in my initial pass over std and rustc seemed to work fine with CAS operations, so closing the gap is seemingly only needed for the wider ecosystem. For instance, parking_lot does many Cute Tricks with fetch_and and fetch_sub:

https://github.com/Amanieu/parking_lot/blob/035ecafda05588191e03874c1f21cd2b638e6149/core/src/word_lock.rs#L274

See also @jrtc27's comments on what CHERI does

Morally, most atomic operations can be defined in terms of wrapping_offset, so I think it should be fine to support them. However, it is an interesting question whether they should be defined to operate on "just the address" (as you would expect of wrapping_offset) or whether they are allowed to "interact" with the raw bit representation, and therefore with things that are "not" addresses (e.g. the CHERI metadata). At worst we can say "we can't stop you, but if you do, you will be sad", but it would be good to know if we can do "better".

(For those unaware, a pointer ("capability") on CHERI is 128 bits, but the address space is ~64-bit. The pointer is a complicated compressed thing that does lots of Tricks to pack at least an entire slice (so two more logically-64-bit addresses!) in there. If you do Sufficiently Bad things, the hardware will mark the memory as ~corrupt and fault you if you try to use the pointer to read/write.)
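(For concreteness, here is roughly what the CAS-based approach mentioned above looks like: an atomic "set tag bits" on an AtomicPtr written with compare_exchange_weak plus the strict-provenance map_addr API. This is an illustrative sketch, not an existing std method; the name fetch_or_tag, the tag, and the orderings are made up.)

use core::sync::atomic::{AtomicPtr, Ordering};

// Atomically set tag bits in the low bits of the stored pointer via a CAS loop.
// The pointer never round-trips through usize, so its provenance is preserved.
// (At the time of this issue, `map_addr` was gated behind #![feature(strict_provenance)].)
fn fetch_or_tag<T>(slot: &AtomicPtr<T>, tag: usize) -> *mut T {
    let mut cur = slot.load(Ordering::Relaxed);
    loop {
        // Only the address changes; the provenance of `cur` carries over to `new`.
        let new = cur.map_addr(|addr| addr | tag);
        match slot.compare_exchange_weak(cur, new, Ordering::AcqRel, Ordering::Acquire) {
            Ok(prev) => return prev,
            Err(actual) => cur = actual,
        }
    }
}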

@Gankra added the A-strict-provenance, T-libs, and T-compiler labels Mar 30, 2022
@Gankra
Contributor Author

Gankra commented Mar 30, 2022

(marked as both T-libs and T-compiler because I don't know whether this is "just" a matter of using existing intrinsics or whether the compiler needs additional work to support this)

@thomcc
Member

thomcc commented Mar 31, 2022

I have time and motivation to take this on this weekend, but I have a few questions.

I'll probably just go ahead and make Choices if I don't get any good answers, but I'll bring them up now for discussion.

I imagine the API will look something like this:

impl<T> AtomicPtr<T> {
    // These are useful, but is `*mut T` the right argument type? (question 1)
    pub fn fetch_or(&self, a: *mut T) -> *mut T;
    pub fn fetch_and(&self, a: *mut T) -> *mut T;

    // (This one is clearly cursed, but it is actually useful for doing a bit-flip
    // on one of the low bits of a pointer, but perhaps it will just tempt
    // people to make xor lists 😢).
    pub fn fetch_xor(&self, a: *mut T) -> *mut T;

    // Are these coherent? (question 2)
    pub fn fetch_nand(&self, a: *mut T) -> *mut T;
    pub fn fetch_min(&self, a: *mut T) -> *mut T;
    pub fn fetch_max(&self, a: *mut T) -> *mut T;

    // For these, what are the units of `a`? (question 3)
    pub fn fetch_wrapping_add(&self, a: usize) -> *mut T;
    pub fn fetch_wrapping_sub(&self, a: usize) -> *mut T;
    pub fn fetch_wrapping_offset(&self, a: isize) -> *mut T;
}

This brings up a few questions, which is why I said before that the design space here is nontrivial:

  1. Regarding fetch_or, it's clear to me that *mut T is the right argument type, because you may want to | in a pointer, and this allows preserving its provenance. But for fetch_and and fetch_xor it's trickier.

    1b: Note that parking_lot_core wants to use fetch_and for a bit-clear operation: https://github.com/Amanieu/parking_lot/blob/035ecafda05588191e03874c1f21cd2b638e6149/core/src/word_lock.rs#L274. How can we express this? Does it need a fetch_and_not[1]? (A sketch of this bit-clear pattern follows the footnote below.)

  2. Are fetch_nand, fetch_min, and fetch_max coherent operations?

  3. What should the units be for fetch_wrapping_{add,sub,offset}?

    Symmetry with ptr says "units of size_of::<T>()", but for essentially all uses I know of, it would be more useful for them to be byte offsets.

    Note that unlike with raw pointers, we can't tell people to change the type of the AtomicPtr easily. (I don't think we should encourage things like transmuting an AtomicPtr<T> to an AtomicPtr<u8> in order to perform a cast, but maybe we are okay with this code just using AtomicPtr<u8>.)

Footnotes

  1. Note that unless I'm off my rocker, this is not fetch_nand, since it would be a & !b rather than !(a & b).
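(To make question 1b concrete: the bit-clear ("and-not") pattern can be expressed with the existing fetch_update CAS loop plus map_addr. The constant QUEUE_LOCKED_BIT and the function unlock_queue are invented for this sketch, loosely mirroring parking_lot's word_lock; they are not an existing API.)

use core::sync::atomic::{AtomicPtr, Ordering};

// Hypothetical tag bit, loosely mirroring parking_lot's word_lock state.
const QUEUE_LOCKED_BIT: usize = 0b01;

// Bit-clear ("and-not"): new = old & !QUEUE_LOCKED_BIT. With a usize operand this
// would simply be fetch_and(!QUEUE_LOCKED_BIT); spelled out as a CAS loop it is:
fn unlock_queue(state: &AtomicPtr<u8>) -> *mut u8 {
    state
        .fetch_update(Ordering::Release, Ordering::Relaxed, |old| {
            Some(old.map_addr(|addr| addr & !QUEUE_LOCKED_BIT))
        })
        .unwrap() // the closure never returns None, so fetch_update cannot fail
}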

@Lokathor
Contributor

Lokathor commented Mar 31, 2022

Small note about the footnote: andnot is usually (!a) & b.

@thomcc
Member

thomcc commented Mar 31, 2022

@RalfJung
Member

parking_lot_core wants to use fetch_and for a bit-clear operation

That just clears a single bit -- at least for Miri there is no issue here.

No idea how to make any sense of any of this on CHERI though. Cc @jrtc27
What we would want is that this is equivalent to atomically doing cheri_address_set with the old address bit-combined with the operand.

@jrtc27

jrtc27 commented Mar 31, 2022

parking_lot_core wants to use fetch_and for a bit-clear operation

That just clears a single bit -- at least for Miri there is no issue here.

No idea how to make any sense of any of this on CHERI though. Cc @jrtc27 What we would want is that this is equivalent to atomically doing cheri_address_set with the old address bit-combined with the operand.

That's what you get on CHERI

@thomcc
Member

thomcc commented Mar 31, 2022

I think I'm probably going to have to ignore CHERI for this work, since we don't really have a pointer-sized integer except for usize. That said, it doesn't need to stabilize with urgency.

@workingjubilee
Contributor

If we adopt a more "usize is size_t or ptraddr_t" model then it would be fine using usize as the operand. If we don't, then uhhh, a lot of things become much more "funny" already. An appropriate "// if xyz { FIXME:" note should suffice.

@RalfJung
Member

RalfJung commented Apr 1, 2022

That's what you get on CHERI

So I can take a 64-bit integer and atomically bit-and it with a pointer (which is 128 bits), meaning "bit-and this with the 64-bit address part of the pointer, please"? Other people say CHERI atomics are pointer-sized, so I am confused now.

@RalfJung
Member

RalfJung commented Apr 1, 2022

In Rust syntax, we need an operation that atomically does

// ptr is `&AtomicPtr`, mask is `usize`
let v = *ptr;
*ptr = v.map_addr(|addr| addr & mask);

@jrtc27

jrtc27 commented Apr 1, 2022

In Rust syntax, we need an operation that atomically does

// ptr is `&AtomicPtr`, mask is `usize`
let v = *ptr;
*ptr = v.map_addr(|addr| addr & mask);

That's exactly what we give you, semantically

@Diggsey
Contributor

Diggsey commented Apr 5, 2022

pub fn fetch_or(&self, a: *mut T) -> *mut T;

I think this signature is a problem, because it's not clear what provenance the resulting pointer has. Does the resulting provenance come from self or does the provenance come from a?

Strictly speaking we'd need two variants of each binary atomic operation: one which takes provenance from the left, and one which takes provenance from the right.

@RalfJung
Member

RalfJung commented Apr 5, 2022

Indeed, the signatures will all be changed to take usize as the second operand.
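(With a usize operand the ambiguity disappears: the returned pointer can only carry the provenance of the pointer stored in the atomic, because the mask has none. A minimal usage sketch, assuming the usize-operand shape that eventually landed in #96935 behind #![feature(strict_provenance_atomic_ptr)]; treat it as illustrative of the shape rather than exact documentation.)

#![feature(strict_provenance_atomic_ptr)]
use core::sync::atomic::{AtomicPtr, Ordering};

fn main() {
    let mut x = 0u8;
    let tagged = AtomicPtr::new(&mut x);

    // Set a low tag bit: the mask contributes only bits, so the stored (and
    // returned) pointer keeps the provenance of the pointer already in the atomic.
    let old = tagged.fetch_or(0b01, Ordering::AcqRel);

    // Clear the bit again ("and-not"), recovering the original untagged pointer.
    let cleared = tagged.fetch_and(!0b01, Ordering::AcqRel);
    let _ = (old, cleared);
}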

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Jul 6, 2022

Allow arithmetic and certain bitwise ops on AtomicPtr

This is mainly to support migrating from `AtomicUsize`, for the strict provenance experiment.

This is a pretty dubious set of APIs, but it should be sufficient to allow code that's using `AtomicUsize` to manipulate a tagged pointer atomically. It's under a new feature gate, `#![feature(strict_provenance_atomic_ptr)]`, but I'm not sure if it needs its own tracking issue. I'm happy to make one, but it's not clear that it's needed.

I'm unsure if it needs changes in the various non-LLVM backends. Because we just cast things to integers anyway (and were already doing so), I doubt it.

API change proposal: rust-lang/libs-team#60

Fixes rust-lang#95492
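(A hedged before/after sketch of the migration this PR aims to support: the same "set a tag bit" operation on an AtomicUsize versus an AtomicPtr. TAG and the function names are invented for illustration, and the AtomicPtr side assumes the feature gate mentioned above.)

#![feature(strict_provenance_atomic_ptr)]
use core::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};

const TAG: usize = 0b01; // illustrative tag bit

// Before: the pointer is stored as an integer, so using the result as a pointer
// again requires an int-to-ptr cast, which strict provenance wants to avoid.
fn set_tag_int(state: &AtomicUsize) -> usize {
    state.fetch_or(TAG, Ordering::AcqRel)
}

// After: the pointer stays a pointer; only the tag bits are plain integers.
fn set_tag_ptr<T>(state: &AtomicPtr<T>) -> *mut T {
    state.fetch_or(TAG, Ordering::AcqRel)
}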
@bors bors closed this as completed in 2f872af Jul 6, 2022