Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Semantics of bulk memory operations #225

Open
syg opened this issue May 6, 2024 · 5 comments
Open

Semantics of bulk memory operations #225

syg opened this issue May 6, 2024 · 5 comments

Comments

@syg
Copy link

syg commented May 6, 2024

The semantics of bulk memory operations on shared linear memory seems to be currently underspecified. If the memory model is mostly the same as JS's, I assume these should be unordered atomic.

AFAICT V8's current implementation is also unsafe in that it just calls out to C memmove and memcpy under the hood, which are non-atomic.

@conrad-watt should know best.

@conrad-watt
Copy link
Collaborator

conrad-watt commented May 7, 2024

Currently the specification is just unordered byte writes (it would be equivalent to have one big tearing unordered write). This is a deliberate choice which means that any racing read may independently non-deterministically see an arbitrary mixture of "before" bytes and "after" bytes, effectively blessing [almost?] any reasonable implementation as conformant (including optimised strategies involving vectors etc). Because we have a full eager bounds check on the entire range of the bulk instruction before any writes take place, this non-determinism is only visible to racing reads (analogy - C undefined behaviour), not to any correctly synchronized code, or any code that synchronously inspects the memory after an OOB trap. We did previously agree that if any future feature is able to interrupt a bulk instruction midway after the initial bounds check (e.g. dynamic page write protection), the specified result would be a memory state where each byte write of the bulk instruction may have independently non-deterministically happened or not happened - we haven't had to put this into practice yet!

It's true that calling out to a fully abstract C memmove is technically C-level UB, but so long as some basic reasoning is done about the underlying implementation (and my potentially stale understanding was that V8 controls its memmove implementation rather than delegating straight to the system), things should be fine. Delegating Wasm memory accesses to C libraries in general needs to be done with care even in single-threaded code because of different UB expectations around aliasing etc.

@syg
Copy link
Author

syg commented May 7, 2024

Currently the specification is just unordered byte writes

Thanks for clarifying.

It's true that calling out to a fully abstract C memmove is technically C-level UB, but so long as some basic reasoning is done about the underlying implementation (and my potentially stale understanding was that V8 controls its memmove implementation rather than delegating straight to the system), things should be fine. Delegating Wasm memory accesses to C libraries in general needs to be done with care even in single-threaded code because of different UB expectations around aliasing etc.

So the intended implementer guidance is "reason at the level of the ISA"?

There's currently a discrepancy between SharedArrayBuffer builtins and bulk memory operations in V8. For SAB builtins that do the moral equivalent of bulk memory operations (like set), there are manual relaxed equivalents (e.g. Relaxed_Memmove). This I think was done somewhat out of an abundance of caution but also done to quell TSAN. The Wasm equivalent just calls out to std::memmove, which gave me pause.

@conrad-watt
Copy link
Collaborator

So the intended implementer guidance is "reason at the level of the ISA"?

Yes, for maximum performance. Relaxed_Memmove would also be a conformant but potentially less ambitious way to implement bulk instructions. I would potentially be a little suspicious if std::memmove is literally going directly to the user's dynamically-linked system library - but only because it's permitted by the C/C++ spec in theory for such a library to be "maliciously conformant" - in practice it would be hard for a natural implementation to do something that violates the current Wasm spec.

@Djuffin
Copy link

Djuffin commented May 20, 2024

To me it looks like Relaxed_Memmove itself contains an UB from C++ point of view.

Does the C++ standard guaranty that atomic and T are the same thing and can be casted to each other like this?

reinterpret_cast<volatile AtomicWord*>(dst)

https://source.chromium.org/chromium/chromium/src/+/main:v8/src/base/atomicops.h;drc=728231103e6768cd9387b68b25b52e970f499065;l=415

@syg
Copy link
Author

syg commented May 20, 2024

Does the C++ standard guaranty that atomic and T are the same thing and can be casted to each other like this?

Yep, good eye. This is UB, though V8 does depend on it everywhere. I think since C++20's atomic_ref there's at least a compliant feature to transition to.

(In my mind, and this isn't a very strong argument, this is like how everyone depends on type punning.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants