Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ABI conversion #1746

Merged
merged 11 commits into from
May 11, 2022
Merged

Improve ABI conversion #1746

merged 11 commits into from
May 11, 2022

Conversation

rylev
Copy link
Contributor

@rylev rylev commented May 9, 2022

note: since codegen was run for this PR, it is best to review commit by commit

This improves ABI conversions for COM interfaces:

  • Add some docs to the Abi trait and clean it up a bit
  • Use ::core::mem::MaybeUninit::zeroed instead of ::core::mem::zeroed (more on this below)

::core::mem::zeroed is soft deprecated because it's hard to use correctly, and ::core::mem::MaybeUninit::zeroed is preferred. I think this is reason enough to make the switch, but there's more here to try to straighten out.

We set all out params to zeros initially. With this change, these out params are considered to be in an uninitialized state when they are passed to the COM method. This is strictly more correct than what we had before where we were using ManuallyDrop to control dropping. MaybeUninit also does not run the inner type's drop implementation when it is dropped.

With this change, there is an implicit agreement (which we must make explicit) that if the implementor of the COM interface returns an HRESULT which is 0 or above, the out param will be considered initialized. Note that HRESULTs of 0 or above can still be set even if the implementor returns an Err (e.g., Err(Error::OK)). Because of this, I've added an additional invariant to Abi that all implementors of Abi must be able to be in a valid state if they are all zeros or they must manually implement from_abi and do a check for all zeros and return an Err if they are. This check is exactly what T: Interface already does.

Lastly, as stated above, MaybeUninit also does not run the inner type's drop implementation when it is dropped. This means we don't need to wrap blittable types in ManuallyDrop since this behavior is now controlled by MaybeUninit. (we still need to wrap non-blittable types in ManuallyDrop because the conversion from type to ABI leads to the type being dropped across FFI boundaries).

@rylev rylev requested review from eholk and kennykerr May 9, 2022 10:59
eholk
eholk previously approved these changes May 10, 2022
Copy link
Collaborator

@eholk eholk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I had a lot of comments around which things are unsafe in the Abi trait, but I think they are mostly things to think about longer term. I'm fine with merging this as-is.

/// # Safety
///
/// This function is safe to call if `abi` can safely be trivial transmuted to `Self`
unsafe fn from_abi(abi: Self::Abi) -> Result<Self> {
Ok(core::mem::transmute_copy(&abi))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This default implementation makes me kind of nervous, since transmute won't always be the right thing, making it easy to accidentally get UB. That said, I understand that it would get rather verbose and also error-prone to have to copy this everywhere it's needed. Maybe it's worth looking into making something like #[derive(Abi)] working, which might even be able to detect some common cases where the transmute version isn't appropriate.

A lot of times when I implement a trait now, I just use rust-analyzer to generate the skeleton for methods without defaults. That means I'd probably fail to do this correctly a lot of times. It helps that the trait comment calls out when we need to implement from_abi, but I wonder if we can make that more obvious in the rust-analyzer completion scenario?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's always safe to convert from a valid ABI type via transmute. Keep in mind that this is only intended for the windows crate to use internally and scoped to Windows types that follow a certain ABI convention that makes this safe.

While a derive macro would solve so many problems, the derive macros are insanely slow which is why I've scrubbed all but the essentials (PartialEq and Eq required for constant patterns) out of the windows crate.

Of course, if a certain compiler dev wants to speed up the derive macro then I'm all for it. 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's always safe to convert from a valid ABI type via transmute.

This isn't really true. Converting a null pointer to a T: Interface is incorrect (and thus why the implementation of Abi for T where T: Interface explicitly checks for null), so there's a good reason for there to be an explicit from_abi method.

The reverse (i.e., going from T to T::abi) is currently always done via transmute. I'd personally be in favor of adding a more explicit to_abi method though. This would make the role of the Abi trait much clear. It would make verifying safety much more straightforward, and it would eliminate some calls to std::mem::transmute which I think is a goal we should always have.

So, I agree that a derive macro probably isn't worth implementation complexity and resulting compile time slow down, but I do think it would be worth writing a little bit more code that makes all of this much more straightforward to understand.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really true.

Sorry if that was misleading. I was giving the short-hand C++ explanation. The Rust explanation is a little more nuanced only because we distinguish between nullable types and non-nullable types which we don't do in C++. You'll see this in RuntimeType::DefaultType. So in Rust, it's always safe to convert from a valid ABI type via transmute to a corresponding "default type". I do need to merge/harmonize the Abi trait and the RuntimeType trait so let's avoid overhauling the Abi trait without keeping that in mind. Happy to consider pairing to_abi and from_abi - the confusion just comes from the split between the Abi and RuntimeType but I think it will be a lot simpler to understand once those are combined.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my own understanding, what is RuntimeType used for now? What do you envision things looking like after merging/harmonizing Abi and RuntimeType?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RuntimeType was originally required beyond just Abi because WinRT includes generic COM interfaces where certain method signatures may have different types and semantics based on the generic type parameters and if those type parameters are nullable then need to know what the default type ends up being in order to build the projection correctly. It's all rather convoluted and more so in Rust because of Option/NonNull. Then throw ABI safety with nullable types and the need for a memory layout compatible borrow type it makes sense to harmonize this, which is what I've been meaning to do for some time. So let's hold off on any major Abi trait changes until I've got that sorted out and then I can get you guys to look it over.

/// # Safety
///
/// This function is safe to call if `abi` can safely be trivial transmuted to `Self`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This safety comment seems to apply to the default given here. I think it'd be better to document what the caller of from_abi needs to do to hold this correctly, and also what implementors of this method need to uphold so the caller's assumptions are valid. Once we've established that, then I think it makes sense to say something like this comment, like "if your Self::Abi can be trivially transmuted to Self, then the default implementation can be used."

/// # Safety
///
/// * The associated type `Abi` must be safe to transfer over FFI boundaries (e.g., it must have a stable layout)
/// * `from_abi` must be implemented if there are some in-memory representations of `Abi` that are not valid representations of `Self`
#[doc(hidden)]
pub unsafe trait Abi: Sized {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced this trait is actually unsafe. It's basically saying how to convert between a Rust type and a matching foreign type, so in that sense it doesn't seem any more unsafe than the From or Into traits.

The part that is unsafe (i.e. has requirements the compiler can't check) is that Self::Abi is safe to transfer across FFI boundaries. It seems like maybe another way to factor this would be something like:

unsafe trait AbiSafe {}

trait Abi {
    type Abi: AbiSafe;

    ...
}

What are the requirements for being FFI-safe? Is #[repr(C)] sufficient or necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea a lot. This also has the potential to help us address the issue of what is and isn't allowed across FFI boundaries. Right now, for example, when users declare COM interfaces, they are just expected to use FFI safe types - we don't check for them. We could, however, consider putting in type checks that any type used in a COM interface method signature implements AbiSafe which could make implementing COM interfaces harder to mess up.

This is probably outside the scope of this PR however.

What are the requirements for being FFI-safe? Is #[repr(C)] sufficient or necessary?

FFI-safe in this context is somewhat nebulous. A type that is #[repr(C)] or some other stable representation (or #[repr(transparent)] over such a representation) is necessary but not sufficient. It also needs to not have a Drop impl as we often pass such values by value over an FFI boundary but we don't want their destructors to run. For example, HSTRING has a stable representation, but it also has a Drop impl that decrements a ref count (and potentially frees memory). So HSTRING isn't really FFI-safe for our purposes, but ManuallyDrop<HSTRING> is.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably outside the scope of this PR however.

Yeah, for sure!

Thanks for the explanation about what we mean by FFI-safe. If we move towards having an AbiSafe trait in the future, that'd be good to include in the documentation for that trait.

/// # Safety
///
/// This function is safe to call if `abi` can safely be trivial transmuted to `Self`
unsafe fn from_abi(abi: Self::Abi) -> Result<Self> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also worth asking whether from_abi is actually unsafe. The answer would be no if we could determine whether abi could be accurately represented as Self. In this case, I don't think we can. I imagine many times we'll have Self::Abi = *const (), so we have know way of knowing whether that pointer points to a Self or some other type that shares the same ABI representation. So, keeping this as unsafe seems reasonable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, unfortunately this is a case where from_abi is often times safe (especially when Self::Abi == Self, but not always (as you've pointed out in the case of Abi being a pointer - and especially a void pointer).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should always be safe once we harmonize Abi and RuntimeType.

if abi.is_null() {
Err(Error::OK)
} else {
// TODO: shouldn't `AddRef` be called here?
Ok(core::mem::transmute_copy(&abi))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to use transmute_copy rather than transmute here? It seems like we are consuming the old value anyway.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ABI type is always blittable (e.g. no Drop) so that should be fine.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this case both transmute and transmute_copy end up doing the same thing, so either one is fine. I personally think transmute is better stylistically because we do not need or even want to use abi after transmuting it.

crates/libs/windows/src/core/abi.rs Outdated Show resolved Hide resolved
crates/libs/windows/src/core/abi.rs Show resolved Hide resolved
@rylev rylev merged commit 8281cc9 into microsoft:master May 11, 2022
@rylev rylev deleted the abi-docs2 branch May 11, 2022 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants