Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] core::marker::Freeze in bounds #3633

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

p-avital
Copy link

@p-avital p-avital commented May 10, 2024

@RalfJung
Copy link
Member

IIRC @joshlf was also asking about exposing Freeze; maybe they can help provide some more motivating examples. Currently there's only really one. (I don't understand the "key of a map" note, and the RFC doesn't explain it in more than 5 words either.)

@clarfonthey
Copy link
Contributor

The main reason for not stabilising Freeze was because it would now add an additional burden for API contracts: adding interior mutability could be considered a breaking change for some APIs, and thus people might want to add "phantom cells" of UnsafeCell<()> inside their types to ensure that they have this option for the future.

I don't think that just stabilising it for bounds affects this concern. The issue isn't the trait itself being exposed in the library, but APIs having to worry about whether they are Freeze or not as an API contract.

@RalfJung
Copy link
Member

RalfJung commented May 10, 2024

I don't think that just stabilising it for bounds affects this concern. The issue isn't the trait itself being exposed in the library, but APIs having to worry about whether they are Freeze or not as an API contract.

A bound is exactly what one is worried about here? If one writes a function fn myfunc(x: impl Freeze), then if I pass a value I got some some other crate to myfunc this will work off that type is Freeze -- meaning it's a breaking change for that crate to add non-Freeze fields.

This is exactly the same as Send and Sync. How often to people defensively add PhantomData<*mut ()> to make their type !Semd + !Sync?

Fundamentally there are some things the compiler only lets you do with Freeze types, and people naturally want to do these things in generic code, and they need Freeze bounds for that. I'm honestly surprised that we went so long without allowing these bounds.^^

(Or did I misunderstand what you mean? It sounded a bit like you're saying just stabilizing the bound means we don't have to worry about the concern. But upon re-reading I am less sure.)

The main reason for not stabilising Freeze was because it would now add an additional burden for API contracts

Letting people write unsafe impl Freeze for Type is even worse, given its role in the operational semantics, so I think we disagree on what the main reason was. ;)

@joshlf
Copy link
Contributor

joshlf commented May 10, 2024

IIRC @joshlf was also asking about exposing Freeze; maybe they can help provide some more motivating examples. Currently there's only really one. (I don't understand the "key of a map" note, and the RFC doesn't explain it in more than 5 words either.)

Sure!

If you want to dive deeper, look at uses of Immutable in zerocopy's 0.8 alpha docs.

On zerocopy stable, we provide traits like FromBytes. T: FromBytes guarantees that transmute::<[u8; size_of::<T>()], T>(bytes) is sound regardless of the value of bytes. We also want to support reference transmutations (e.g., &[u8; N] to &T). Reference transmutations are necessarily more restrictive than value transmutations since they must ban interior mutation. For example, &[u8; N] to &Cell<U> is unsound because it would allow the referent bytes to be mutated while a &[u8; N] reference exists referencing the same memory. For this reason, T: FromBytes also requires that T contains no UnsafeCells.

However, this is overly-restrictive. The "no UnsafeCells" requirement is only relevant to reference transmutations, but not to value transmutations. E.g., transmute::<[u8; size_of::<T>()], UnsafeCell<T>>(bytes) can be sound (depending on T), but we can never implement FromBytes for UnsafeCell<T>. Our API supports value transmutations in places like FromBytes::read_from and transmute!. APIs like these are made less powerful because of this restriction.

Another restriction is that some authors want to use FromBytes as a bound to justify their own unsafe code blocks. For example, google/zerocopy#251 was motivated by a user who wanted to derive FromBytes on a type containing an UnsafeCell for this purpose (see the "Motivation" section of that issue; cc @korran).

Our solution in the upcoming zerocopy 0.8 is to add a separate Immutable trait that is semantically very similar to Freeze*. We require T: Immutable where reference transmutations are happening, but don't require it where mutable reference transmutations (e.g. &mut [u8; N] to &mut T) or value transmutations are happening.


* Currently, Immutable bans UnsafeCells recursively, even via indirection. We will likely lift this restriction, at which point Immutable will be semantically identical to Freeze.

@clarfonthey
Copy link
Contributor

Letting people write unsafe impl Freeze for Type is even worse, given its role in the operational semantics, so I think we disagree on what the main reason was. ;)

I should clarify, what I meant here was the adding of stuff like UnsafeCell<()> to make your type !Freeze, so that there's the option to add real interior mutability later without breaking existing APIs that rely on the type being Freeze.

Not the ability to make something Freeze despite interior mutability.

@ehuss ehuss added T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC. labels May 10, 2024
@zachs18
Copy link

zachs18 commented May 10, 2024

For another "motivating example", bytemuck is in basically the same position as zerocopy, in that some of the bounds are currently more strict than necessary (discussion). I have a branch with a sketch of bytemuck's API in the presence of core::mem::Freeze. (cc @Lokathor)


One possible alternative that would "work around" the semver issue would be to have Freeze be a "normal" (non-auto) trait that, like Copy, can only be implemented for types where all of their fields are also Freeze. This would require types to "opt-in" to guaranteeing they do not have interior mutability, just like types now have to "opt-in" to being Copyable. I don't necessarily think this option is better than the status quo1, but it is something to consider.

Footnotes

  1. since Freeze can be somewhat observed on stable via static-promotion, and this would either break that until libraries add impl Freeze for MyPODType, or require a second BikeshedRawFreeze actually-auto-trait as a supertrait of Freeze for the compiler to use for determining static-promotion etc.

@RalfJung
Copy link
Member

Yeah that could work -- as you say, the existing Freeze trait has to stay around; we'd also have to make static promotion accept a type that's either AutoFreeze or OptInFreeze for the motivating example to work (which is all about static promotion of data at a generic type).

@RalfJung RalfJung changed the title Stabilization RFC for core::marker::Freeze in bounds [RFC] core::marker::Freeze in bounds May 11, 2024
@BurntSushi
Copy link
Member

Thanks for writing this up. I appreciate the effort that went into it and the desire to push things forward.

But... I think this RFC needs a fair bit of work. After reading it, I'm still left with some very fundamental questions (and I expect these to be answered in the RFC):

  • What is the Freeze trait used for? How will Rust programmers use it?
  • What is an example of a typical use of the Freeze trait in Rust programs?
  • What would happen if we didn't stabilize Freeze?
  • The guide level explanation doesn't say anything about what Freeze is. The guide level section should not be a stub.
  • The reference level explanation should be reasonably self-contained and explain things in a fair bit of detail. I also have to say that I followed the links and they did not help me to understand what Freeze is.
  • There is zero mention of the fact that this is a new marker trait in the drawbacks section. Marker traits present significant annoyances. This needs more discussion and an exploration for why a new marker trait is really truly warranted.
  • I don't understand why we also aren't trying to stabilize unsafe impl Freeze. The RFC just says it's orthogonal, and while that may be true, that isn't a good enough reason to split them apart. The default position should be to stabilize both, and only after some compelling justification to do otherwise should we try to split it apart. If we're splitting them apart, for example, there should be an exploration of the drawbacks of doing that. For example, I sometimes need to write unsafe impl Send for MyType. What happens when I need to do that for Freeze but can't? (And if that can't happen or is unlikely to happen, then that is a very interesting difference from other marker traits that should have discussion about it in the RFC.)

@joshlf
Copy link
Contributor

joshlf commented May 11, 2024

  • I don't understand why we also aren't trying to stabilize unsafe impl Freeze.

IIUC, we couldn't do that today since Freeze promises that a type contains no UnsafeCells. That property is entirely visible to the compiler - either you have no UnsafeCells, in which case the compiler knows your type is Freeze, or you have some, in which case it would be unsound to implement Freeze for your type. We'd need to relax it to simply say "doesn't permit interior mutation" or something to that effect, in which case it'd be more about whether any of your public API permits interior mutation (so it's more of a runtime property and less of a type property).

If we decide to keep it a type property - ie, the absence of UnsafeCells - then we could do impl Freeze rather than unsafe impl Freeze per @zachs18:

One possible alternative that would "work around" the semver issue would be to have Freeze be a "normal" (non-auto) trait that, like Copy, can only be implemented for types where all of their fields are also Freeze. This would require types to "opt-in" to guaranteeing they do not have interior mutability, just like types now have to "opt-in" to being Copyable. I don't necessarily think this option is better than the status quo1, but it is something to consider.

Footnotes

  1. since Freeze can be somewhat observed on stable via static-promotion, and this would either break that until libraries add impl Freeze for MyPODType, or require a second BikeshedRawFreeze actually-auto-trait as a supertrait of Freeze for the compiler to use for determining static-promotion etc.

@RalfJung
Copy link
Member

RalfJung commented May 11, 2024

Freeze is used by the compiler (codegen, const-check) and Miri to determine whether a type allows shared mutation. We definitely don't want to allow people to disallow shared mutation on their UnsafeCell-carrying types -- I don't see when that would ever be useful, it opens so many questions, and it's not what anyone asked for. (That makes Freeze quite different from Send and Sync: Send and Sync are not used by codegen or Miri to do any optimizations, they are basically entirely library concepts with just a bit of involvement in the compiler to handle static. Freeze is very much a language/opsem concept.)

So if we want to expose the Freeze trait that exists, that decides for codegen and Miri and const-checking whether there's any interior mutability, then we can't allow impls.

@ChayimFriedman2
Copy link

@joshlf For the purpose of zerocopy (and bytemuck), what advantages does exposing Freeze brings over a custom trait? You will have to derive it anyway because you have other required traits, right?

@zachs18
Copy link

zachs18 commented May 11, 2024

For the purpose of zerocopy (and bytemuck), what advantages does exposing Freeze brings over a custom trait?

IMO the main advantage is that Freeze can be definitively checked by the compiler. Under the current Freeze as an auto trait, no derives or impls are needed (or possible) for it since it is automatically implemented by the compiler when it can be. 1

An additional advantage is that there would be only one trait to encode the invariant. bytemuck and zerocopy (and any other cratr) could each implement their own Immutable trait, but with Freeze in the stdlib this would not be necessary.

You will have to derive it anyway because you have other required traits, right?

For my sketch of bytemuck's API with Freeze, Freeze is simply added as an additional bound on the functions where it is needed, it is not a supertrait or subtrait of any bytemuck trait itself.

Footnotes

  1. Also, even a hypothetical Copy-like "opt-in-but-still-validated-impls" version of Freeze would still be easier to correctly implement since an incorrect implementation simply wouldn't compile (just like impl Copy for struct Wrapper(String) doesn't compile today).

@joshlf
Copy link
Contributor

joshlf commented May 11, 2024

@joshlf For the purpose of zerocopy (and bytemuck), what advantages does exposing Freeze brings over a custom trait? You will have to derive it anyway because you have other required traits, right?

The most important reason is that the compiler can see into the internals of types in the standard library. For example, I filed this extremely silly issue because there's technically no guarantee provided to code outside of the standard library that Box<T> doesn't contain any UnsafeCells. That's obviously true, but in order for zerocopy to uphold its soundness policy, we need an explicit guarantee. Such a guarantee would be useful to basically nobody but us. By contrast, Box already implements Freeze, so if we could use Freeze directly, we wouldn't have to quibble about things like that. Here's another PR that would be at least partially obviated by Freeze.

Besides that, here are some other reasons:

  • All of the normal advantages of not having to own something like this. For example, zerocopy and bytemuck - if we were both to implement the same idea - would duplicate code between the two of us.
  • We have to manually implement Immutable for a huge number of types, all of which is extra code we have to write and maintain (this file contains the keyword Immutable 159 times!). Worse, since it's an unsafe trait, it's also safety proofs we have to write and maintain. Maintaining safety proofs is a giant headache since there's no programmatic way to discover that your safety proof isn't valid anymore since it's just prose, so we do a lot of manual work to make sure our safety proofs are forwards-compatible (this relates to the soundness policy I mentioned before). Often a single safety comment will be blocked for weeks or months while we try to get guarantees landed in the language documentation.
  • The custom derives, while not complicated by custom derive standards, are much uglier than the equivalent in-compiler implementation since the latter has direct access to a more natural representation of a type's fields and their types. We have to hack it by adding clumsy where bounds and such.
  • Proc macros are slow to compile and slow to execute, while the compiler-supported implementation will presumably be much faster
  • The compiler might support fancier reasoning at some point, e.g. permitting UnsafeCell<()>. Supporting this natively in zerocopy would be very difficult. This particular example may be contrived, but my point is that, in general, the compiler can support much richer analysis that would require feats of code gymnastics to pull off outside the compiler.

@p-avital
Copy link
Author

Thanks to everyone taking an interest in this RFC.

First, a quick apology for it starting up so half-assed, I clearly underestimated the task. I started this RFC as a knee-jerk reaction to 1.78 merging the breaking change the RFC mentions without providing a way for const references to generics to exist.

I'm still committed to getting this to move forward, but I have limited availability for the next 2 weeks, so please bear with me as I rewrite this entire thing to the standard I should have started it with.

I would really appreciate getting some early feedback on the direction people around here would like to see this RFC take regarding:

  • Should this RFC target the existing core::marker::Freeze or propose an alternative (let's strawman it as Frieza) with similar purpose?
    • A personal priority I have for this RFC is allowing const references to generics in stable Rust again. Frieza: core::marker::Freeze is my highest priority.
  • Recursivity:
    • core::marker::Freeze only accounts for "local" immutability. This is good enough for certain purposes (such as static promotion), but not for stuff like maps that would like to ensure their keys are totally immutable.
    • Would your application benefit from Frieza being recursive? Would it suffer from it being so?
  • Auto-trait-iness: do you feel happy/neutral/unhappy about Frieza being an auto-trait?
    • If the consensus is that more auto-traits=bad, creating Frieza seems like the right approach to me.
    • Otherwise, we could just expose core::marker::Freeze, and a recursive equivalent could be built either inside or outside core later.

Sorry for my high latency for the beginnings of this RFC. I really appreciate all the feedback you can give, and would like to mention that PRs to this PR's branch are very welcome :)

@RalfJung
Copy link
Member

RalfJung commented May 13, 2024

@p-avital happy to hear that you're not discouraged by all the extra work we're throwing your way. :)

core::marker::Freeze only accounts for "local" immutability. This is good enough for certain purposes (such as static promotion), but not for stuff like maps that would like to ensure their keys are totally immutable.

Even if Freeze was recursive, types can use raw pointers or global static to implement shared mutable state. So I don't think this RFC should try to do anything in that direction -- between promotion of generic consts, zerocopy, and bytemuck we have three users that all only care about "shallow" immutability. A map can use an unsafe trait PartialEqPure or so if a proof is required that partial_eq (and/or other functions) are pure.

@BurntSushi
Copy link
Member

@p-avital It's hard for me to give you good feedback here because I think I lack a fair bit of context. Most of the discussion (and the RFC itself) seems to be very high context here, and it's context that I don't have.

With that said, I can say that adding a new auto/marker trait is a Very Big Deal. It's not just that "auto traits = bad," but that adding a new one comes with very significant trade offs. That doesn't mean that adding one is impossible, but it does mean, IMO, that adding one needs to come with significant benefits. Or that it's the only path forward for some feature that most everyone deems to be essential.

From the discussion, it sounds like there are alternatives. So those alternatives, at minimum, should be explored.

@RalfJung
Copy link
Member

@BurntSushi note that the Freeze marker trait, albeit being unstable, is already a semver hazard. Whenever I have a constant of type T, there are things the compiler will let me do only if T: Freeze. So, while new auto traits are a Very Big Deal, this is not an entirely new auto trait, and it merely extends the existing semver hazard to also affect crates that do not allow creating const values of their type. But any crate that has a const fn new() -> Self already needs to worry about Freeze, in today's Rust.

Here's a demonstration.

@BurntSushi
Copy link
Member

@RalfJung Yikes. OK. Thank you for pointing that out. That's exactly the kind of context I was missing.

@clarfonthey
Copy link
Contributor

Oh, fair point. I guess that Send and Sync also break with that as well.

@RalfJung
Copy link
Member

Doesn't UnsafeCell<()> have the same effect as PhantomNotFreeze, admittedly with less ergonomics? Or have I been suggesting that incorrectly so far?

It currently does, but with some core questions around UnsafeCell still being open, maybe we want to provide something more explicit.

@RalfJung
Copy link
Member

Reasonable optin for "disable all auto traits", I suppose.

And also disable inference of variance?
Can we then also allow unused type and lifetime parameters, please? ;)

But all of that seems like outside the scope of a Freeze RFC.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

[Should the trait be exposed under a different name?](https://github.com/rust-lang/rust/pull/121501#issuecomment-1962900148)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something that seems troublesome about Freeze is that it's a trait named using a verb, which is good in isolation, but that verb would usually be an operation you can perform on values of the implementing type. What's the “freeze” operation on T: Freeze? I suppose it's taking a & reference to T. That's reasonable enough, but there is another “freeze” operation that's both already named in LLVM and recently been proposed for addition to Rust: “freezing” uninitialized memory to enable safely reading the bytes of a value that contains padding, or reading a value that may be partially initialized. That's a different thing in the same problem domain (what you can or can't do with the bytes when you have a T or &T) which seems dangerously confusing. And outside of that name collision, “Freeze” doesn't convey much to the naïve reader.

Unfortunately, I don't have any good ideas for alternatives, especially not non-negated ones as @RalfJung suggested, but I think that stabilizing this under the name Freeze would create a frequent source of confusion among those encountering it for the first time. It might be better to have an awkward complex name than a cryptic simple one. Awkward suggestion: ShallowImmut?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about Stuck? that is unused afaict so people seeing it would think to look up what it means in Rust. It's also nice and short and reminds users of what it means.

Copy link
Member

@RalfJung RalfJung May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least on the theory side, a program being "stuck" has meaning already -- it means that execution can't progress further as the program has left the realm of well-defined executions. I would find this term very confusing.

"Frozen" is better than "Freeze" as it's not an action.

Unfortunately, I don't have any good ideas for alternatives, especially not rust-lang/rust#121501 (comment), but I think that stabilizing this under the name Freeze would create a frequent source of confusion among those encountering it for the first time.

Maybe we shouldn't call that "Freeze"? It's not a great name as what the operation really does is turn poison into non-deterministic regular bytes. Rust doesn't have undef-style "bytes that are unstable".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least on the theory side, a program being "stuck" has meaning already -- it means that execution can't progress further as the program has left the realm of well-defined executions. I would find this term very confusing.

I was thinking more of that the value in memory is stuck at whatever bytes are put into .rodata. because Stuck is a trait and applies to types, not program execution, it seems different enough to me that it won't be that confusing.

Maybe we shouldn't call that "Freeze"? It's not a great name as what the operation really does is turn poison into non-deterministic regular bytes. Rust doesn't have undef-style "bytes that are unstable".

since the LLVM IR op is already called freeze, they've effectively already laid claim to that name, so anyone who's used LLVM IR and tries to figure out rust will be confused if we use freeze to mean something completely different where it could be confused with an operation (since freeze is present-tense, unlike stuck where being past-tense and being applied to types makes it much less likely to be an operation).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking more of that the value in memory is stuck at whatever bytes are put into .rodata

That's far from it, though. Freeze isnt just about rodata, it's about all shared references without interior mutability.

I find "stuck" extremely unintuitive. It conveys negative feelings to me, "we're stuck somewhere or with something (and want to get unstuck again)". I don't think it is a good term for "this memory is immutable".

since the LLVM IR op is already called freeze, they've effectively already laid claim to that name, so anyone who's used LLVM IR and tries to figure out rust will be confused if we use freeze to mean something completely different where it could be confused with an operation (since freeze is present-tense, unlike stuck where being past-tense and being applied to types makes it much less likely to be an operation).

LLVM also uses poison/undef, neither of which we use in Rust (we call it "uninitialized memory" or just "uninit"). LLVM calls it getelementptr or ptraddr, we call it offset. LLVM calls it load/store, we call it read/write. The list can probably go on for a while. I don't think we should let LLVM dictate our terminology.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In zerocopy, we went with Immutable, which I quite like (credit to @jswrenn's coworker for suggesting the name). It is an adjective and accurately describes the semantics - T: Immutable means that T cannot be mutated behind a shared reference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but T can be mutated behind a shared reference if that mutation is hidden in some other block of memory: Box<Cell<u8>> does implement Freeze, but can be mutated behind a shared reference. therefore I think using Immutable would be highly misleading.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably any name that doesn't clarify that the property only applies shallowly/not-via-indirection is similarly misleading. If we're worried about that, we'd need to add an adjective like Shallow to the name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just invent another term that no one else used. Freeze itself is an invented term anyway.

Perhaps Solid (you "freeze" liquid to get "solid" 🤷).
Or Acellular (technically a negated word).

But given that the trait has existed for so long under the name Freeze perhaps it has already gotten traction

Though IMO it's fine to actually use a multi-word term like ShallowImmutable or CellFree? We already have got the the two-word UnwindSafe and three-word RefUnwindSafe auto traits in std. As long as this trait bound is not needed very often (unlike Send and Sync and Sized) using a longer but more precise name shouldn't be a problem.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me set a record for the most negative name: UnindirectedUnUnsafeCelled 😄

I like ShallowImmutable a lot, as it's rather descriptive and attracts curiosity thanks to the Shallow term.

Do we have any opposition to ShallowImmutable?

@RalfJung
Copy link
Member

RalfJung commented May 17, 2024

(moved into the right thread)

text/0000-stabilize-marker-freeze.md Outdated Show resolved Hide resolved
text/0000-stabilize-marker-freeze.md Outdated Show resolved Hide resolved
@nikomatsakis
Copy link
Contributor

nikomatsakis commented May 21, 2024 via email

text/0000-stabilize-marker-freeze.md Outdated Show resolved Hide resolved
text/0000-stabilize-marker-freeze.md Show resolved Hide resolved
text/0000-stabilize-marker-freeze.md Outdated Show resolved Hide resolved
text/0000-stabilize-marker-freeze.md Outdated Show resolved Hide resolved
# Future possibilities
[future-possibilities]: #future-possibilities

- One might later consider whether `core::mem::Freeze` should be allowed to be `unsafe impl`'d like `Send` and `Sync` are, possibly allowing wrappers around interiorly mutable data to hide this interior mutability from constructs that require `Freeze` if the logic surrounding it guarantees that the interior mutability will never be used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a plausible possibility. If you never use interior mutability, just don't use a type with UnsafeCell. What is the possible use-case for having an UnsafeCell that you will definitely never use for interior mutability? (That would be the only situation where unsafe impl Freeze would be sound.)

Copy link
Contributor

@joshlf joshlf May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've considered wanting this for zerocopy. The use case would be to add a wrapper type Frozen<T> (joking - no clue what we'd name it) that is Freeze even if T isn't. IIUC this would require Freeze to be a runtime property rather than a type property (ie, interior mutation never happens), and would require being able to write unsafe impl<T> Freeze for Frozen<T>.

It's not something we've put a ton of thought into, so I wouldn't necessarily advocate for it being supported right now, but IMO it's at least worth keeping this listed as a potential future direction and trying to keep the possibility open of moving to that future w/ whatever design we choose.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the possible use-case for having an UnsafeCell that you will definitely never use for interior mutability?

Speculation: Perhaps a data structure whose contents are computed using interior mutability, then put in a wrapper struct which disables all mutation operations and implements Freeze. The reason to do this using a wrapper rather than transmuting to a non-interior-mutable type would be to avoid layout mismatches due to UnsafeCell<T> not having niches that T might have.

This is only relevant if such a type also wants to support the functionality enabled by implementing Freeze, of course.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've considered wanting this for zerocopy. The use case would be to add a wrapper type Frozen (joking - no clue what we'd name it) that is Freeze even if T isn't.

What would that type be good for?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've considered wanting this for zerocopy. The use case would be to add a wrapper type Frozen (joking - no clue what we'd name it) that is Freeze even if T isn't.

What would that type be good for?

It's been a while since we considered this design, so the details are hazy in my memory, but roughly we wanted to be able to validate that a &[u8] contained the bytes of a bit-valid &T. We wanted to encode in the type system that we'd already validated size and alignment, so we wanted a type that represented "this has the size and alignment of T and all of its bytes are initialized, but might not be a bit-valid T." To do this, we experimented with a MaybeValid<T> wrapper.

When you're only considering by-value operations, you can just do #[repr(transparent)] struct MaybeValid<T>(MaybeUninit<T>); (MaybeUninit itself isn't good enough because it doesn't promise that its bytes are initialized - the newtype lets you add internal invariants). The problem is that this doesn't work by reference - if you're using Freeze as a bound to prove that &[u8] -> &MaybeValid<T> is sound, it won't work if T: !Freeze.

As I said, we didn't end up going that route - instead of constructing a &MaybeValid<T>, we construct (roughly) a NonNull<T> (okay, actually a Ptr) and just do the operations manually. The code in question starts at this API if you're curious.

Copy link
Contributor

@joshlf joshlf May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're wrapping T inside a private field, why does it matter whether it has interior mutability or not?

Because we want to be able to use MaybeValid<T> with APIs that are safe and use trait bounds (like Freeze) to prove their soundness internally. One of our golden rules for developing zerocopy is to be extremely anal about internal abstraction boundaries [1]. While we could just do what you're suggesting using unsafe (namely, use unsafe to directly do the &[u8] -> &MaybeValid<T> cast), we have existing internal APIs such as this one [2] that permit this conversion safely (ie, the method itself is safe to call), using trait bounds to enforce soundness.

In general, we've found that, as we teach our APIs to handle more and more cases, it quickly becomes very unwieldy to program "directly" using one big unsafe block (or a sequence of unsafe blocks) in each API. Decomposition using abstractions like Ptr has been crucial to us being confident that the code we're writing is actually sound.

[1] See "Abstraction boundaries and unsafe code" in our contributing guidelines

[2] This API is a bit convoluted to follow, but basically the AliasingSafe bound bottoms out at Immutable, which is our Freeze polyfill

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid it is not clear to me how Frozen helps build up internal abstraction barriers, and your existing codebase is way too big for me to be able to extract that kind of information from it with reasonable effort.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TLDR: We can only make this code compile if MaybeValid<T>: Freeze despite T: ?Freeze. I'm arguing that, in order to support this use case, we should design the RFC to be forwards-compatible with permitting unsafe impl<T> Freeze for MaybeValid<T> even if we don't support it out of the gate.

Here's a minification of the sort of code I'm describing. It has three components that, in reality, would live in unrelated parts of our codebase (they wouldn't be right next to each other like they are here, so we'd want to be able to reason about their behavior orthogonally):

  • MaybeValid<T>
  • try_cast_into<T>(bytes: &[u8]) -> Option<&MaybeValid<T>> where MaybeValid<T>: Freeze
    • This is responsible for checking size and alignment, but not validity; in our codebase, it is used both by FromBytes (which needs no validity check) and by TryFromBytes (which performs a validity check after casting)
    • In our codebase (but not in this example), this does some complicated math to compute pointer metadata for slice DST; it can't easily be inlined into its caller
  • TryFromBytes, with:
    • try_ref_from(bytes: &[u8]) -> Option<&Self> where Self: Freeze
    • try_mut_from(bytes: &mut [u8]) -> Option<&mut Self>

Notice the MaybeValid<T>: Freeze bound on try_cast_into (and the safety comments inside that function). That bound permits us to make the function safe to call (without that bound, we'd need a safety precondition about whether interior mutation ever happens using the argument/returned references).

try_ref_from can satisfy that bound because it already requires Self: Freeze, which means that MaybeValid<Self>: Freeze. However, try_mut_from does not require Self: Freeze. Today, the linked code fails to compile thanks to line 114 in try_mut_from:

let maybe_self = try_cast_into::<Self>(bytes)?;

What we'd like to do is be able to guarantee that interior mutation will never be exercised via &MaybeValid<T>, and thus be able to write unsafe impl<T> Freeze for MaybeValid<T>.

Copy link
Member

@RalfJung RalfJung May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice the MaybeValid: Freeze bound on try_cast_into (and the safety comments inside that function). That bound permits us to make the function safe to call (without that bound, we'd need a safety precondition about whether interior mutation ever happens using the argument/returned references).

What goes wrong if it's not Freeze? There can't even be a proof obligation attached to this since you want it to be trivially true. I think all you'd lose is some optimization potential.

If MaybeValid was always Freeze, you'd have to extend the safety comments on deref_unchecked to say "also you must never ever mutate via this shared reference, even if T is e.g. Cell<T>". An unsafe impl Freeze for MyType type has the responsibility of ensuring that its interior is never mutated through any pointer derived from an &MyType. That's what Freeze means: no mutation through shared references to this type, including all pointers ever derived from any such shared reference.

So henceforth I will assume that MaybeValid has added this requirement to its public API. You can do that without having unsafe impl Freeze, it's just a safety invariant thing. Now you can drop the Freeze side-condition on try_cast_into, since no interior mutability is exposed by exposing a MabeValid. And then your problem goes away, no?

This requires try_cast_into to rely on a property of MaybeValid, but since MaybeValid appears in try_cast_into's signature, there's no new coupling here -- you already depend on the fact that MaybeValid can be soundly constructed for invalid data.

So I still don't see a motivation for impl Freeze here, except if you somehow want more optimizations on MaybeValid.

What we'd like to do is be able to guarantee that interior mutation will never be exercised via &MaybeValid

tl;dr you can already do that, just put these requirements into the public API of MaybeValid. And you'd have to put these requirements in the public API of MaybeValid even if we allowed you to unsafe impl Freeze, so you'd not gain anything from that as far as I can see.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point, and I think that's a compelling argument. I can imagine there being cases where you'd want to pass MaybeValid to a generic API that accepted T: Freeze, in which case you'd still need this, but I don't have a concrete example of this off the top of my head.

I'd still advocate for keeping this listed as a future possibility since doing so doesn't have any effect on the current proposal (ie, the current proposal is already forwards-compatible with this). Obviously we'd need to adjudicate it much more thoroughly if it was ever proposed to actually go through with permitting unsafe impl Freeze, and I assume you'd push back if that ever happened just as you are now, but I think it's at least worth mentioning that it's something that we've considered and that there might be some desire for.

text/0000-stabilize-marker-freeze.md Outdated Show resolved Hide resolved
@p-avital
Copy link
Author

p-avital commented May 22, 2024

Hello again everyone,

I'm back and ready to keep pushing on this. I've already committed the result of every suggestion that seemed immediately actionable to me.

It appears to me that 2 remaining roadblocks are:

  • Should we rename Freeze? (while I'm at it, dear maintainers, how much of the RFC should use the new name if we settle on a new one?)
  • Should we provide PhantomNotFreeze in the same stride?

As an aside to maintainers around here, is there a chance we can slip this in 1.79 if we're quick enough about resolving those questions, or is the release cycle already too far gone?

@joshlf
Copy link
Contributor

joshlf commented May 22, 2024

One remaining question in my mind. I'm not sure whether this would affect the design, but it might.

Do we want to leave open the possibility of implementing Freeze conditional on how many bytes are covered by UnsafeCell, even via generis? Ie:

struct Foo<T>(UnsafeCell<T>);

static_assertions::assert_not_impl_any!(Foo<u8>: Freeze);
static_assertions::assert_impl_all!(Foo<()>: Freeze);

This would make ZST-ness (or "zestiness", as @jswrenn prefers to pronounce it) a semver-visible property, but that's already true for some types (e.g. () is guaranteed to be a ZST). Do we at least want to hold open the possibility of doing this for types whose zestiness is already guaranteed?

Obviously a lot of thorny questions to work out - I'm not talking about working them out here, but just making sure we don't foreclose on the future possibility. Maybe add it to the "future possibilities" section of the RFC.

@jswrenn
Copy link
Member

jswrenn commented May 22, 2024

@joshlf, I believe that possibility is permitted. See the reference level explanation's proposal of Freeze's documentation: https://github.com/p-avital/rfcs/blob/stabilize-marker-freeze/text/0000-stabilize-marker-freeze.md#reference-level-explanation

Freeze marks all types that do not contain any un-indirected interior mutability.
This means that their byte representation cannot change as long as a reference to them exists.

Note that T: Freeze is a shallow property: T is still allowed to contain interior mutability,
provided that it is behind an indirection (such as Box<UnsafeCell<U>>).
Notable !Freeze types are UnsafeCell and its safe wrappers
such as the types in the cell module, Mutex, and atomics.
Any type which contains any one of these without indirection is also !Freeze.

This explanation is phrased in terms of mutability-without-indirection, rather than the presence of absence of Freeze.

That said, the current implementation of Freeze is quite conservative. For instance, it [T; 0] is only Freeze if T: Freeze.

@joshlf
Copy link
Contributor

joshlf commented May 22, 2024

Yeah, so I think we'd just want to update the proposed doc comment to be more explicit about the fact that T: Freeze does not promise that T contains no UnsafeCell internally, only that it cannot be used to exercise interior mutation. That's the sort of thing that would be very easy to accidentally rely upon if we don't disclaim it loudly.

@RalfJung
Copy link
Member

RalfJung commented May 22, 2024

As an aside to maintainers around here, is there a chance we can slip this in 1.79 if we're quick enough about resolving those questions, or is the release cycle already too far gone?

1.79 is in beta and not going to receive any new features. If an implementation of this got merged before June 7th, it would be part of 1.80. But given t-lang capacity and the 10-day FCP period for the RFC, I don't want to raise any false expectations -- that's very unlikely. If it makes it into 1.81 (which branches on July 19th) that would be extremely fast by RFC standards, but we can possibly used the fact that this fixes a regression as an argument for prioritizing this in t-lang discussions. (I can't make that call, I am not on the lang team.)

Rust is moving too fast for some people's liking, but not that fast.

@p-avital
Copy link
Author

One remaining question in my mind. I'm not sure whether this would affect the design, but it might.

Do we want to leave open the possibility of implementing Freeze conditional on how many bytes are covered by UnsafeCell, even via generis? Ie:

struct Foo<T>(UnsafeCell<T>);

static_assertions::assert_not_impl_any!(Foo<u8>: Freeze);
static_assertions::assert_impl_all!(Foo<()>: Freeze);

I think it does. I think a nice way to allow for this to fit the RFC nicely would be to:

  • Highlight the RFC's definition that Freeze is about byte-fiddling, and therefore does not promise that T: Freeze doesn't guarantee that T doesn't have some ZST UnsafeCell.
  • Propose the opt-out mechanism at the same time, such that maintainers can opt out of Freeze "upon release". To me, this could come in a second time since they didn't have access to such a mechanism for previous releases either way, but I do think it's a valuable addition, even if it just starts as a wrapper around UnsafeCell<()>.

@p-avital
Copy link
Author

Rust is moving too fast for some people's liking, but not that fast.

Bunch of grumps! 🥲

If it makes it into 1.81 (which branches on July 19th) that would be extremely fast by RFC standards

So at least 2 more months of sad stabby. Imma push as hard as I can for this to happen without putting a sour taste in t-lang's mouth :)

@RalfJung
Copy link
Member

I'm afraid it's going to be more than 2 months -- 1.81 branches on July 19th, i.e. that's when the 6-week beta period starts. It is going to be released on Sep 5th.

I hope this is not too much of a downer, your help here is much appreciated! It's just generally not good idea to rush irreversible decisions that the entire Rust community will have to live with forever.

@p-avital
Copy link
Author

I hope this is not too much of a downer, your help here is much appreciated! It's just generally not good idea to rush irreversible decisions that the entire Rust community will have to live with forever.

Honestly, the downer was 1.78; now I'm just fired up to get that situation resolved!

I fully understand wanting to do due diligence on these things, I've been hurt by both ends of the stabilization stick before (looking at you, forever ABI-unstable core::task::Waker because the constructor takes extern "Rust" fns).

But just to keep a sense of urgency: not making it to 1.81 would be a downer 😛

As an aside: I've updated the proposal to include PhantomNotFreeze for semver nerds :)

@p-avital
Copy link
Author

p-avital commented May 25, 2024

Hiya @RalfJung (sorry if the ping is a bit cavalier), do you see any work this RFC still needs before it can enter FCP? :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet