Does the wasm memory model have tear-free non-atomic loads? #199

sunfishcode · 2023-05-17T18:30:44Z

We recently learned that Musl (used in both wasi-libc and Emscripten) is using non-atomic 32-bit aligned loads from memory locations which can be stored to by other threads without synchronization, apparently on the assumption that compilers will always codegen these to a single instruction, effectively intending something like a very relaxed ordering. On Wasm, LLVM compiles these to plain i32.load wasm instructions.

Are there any guarantees in the proposed Wasm memory model that a plain aligned i32.load of memory that can be simultaneously stored to by aligned i32 stores in other threads won't tear?

The text was updated successfully, but these errors were encountered:

conrad-watt · 2023-05-17T18:42:03Z

Are there any guarantees in the proposed Wasm memory model that a plain aligned i32.load of memory that can be simultaneously stored to by aligned i32 stores in other threads won't tear?

Yes, naturally aligned i32 loads and stores won't tear. I sketched out this table a while ago, which should still be good.

dschuff · 2023-05-17T18:48:33Z

Relatedly, are non-atomic stores guaranteed to be observed by "later" non-atomic loads on another thread? (or by atomic loads?)

conrad-watt · 2023-05-17T19:15:16Z

It depends what you mean by "later". There is a concept of happens-before in the specification. If a non-atomic store (transitively) happens-before a non-atomic load it is guaranteed to be observed.

All operations taking place sequentially in the same thread are happens-before each other (so the single thread semantics works as you expect). In addition, synchronizing atomic_sc store-load pairs create a cross-thread happens-before relation between the two accesses, as do a few other atomic operations such as wait/notify.

So if you have a non-atomic store to x in one thread, followed by an atomic store to y in the same thread, and in a second thread do an atomic load of y which observes the value stored in the first thread, all subsequent loads of x in the second thread are guaranteed to observe the first-thread's non-atomic store to x (intuition - you know the load in the second thread is happening "later").

Thread 1                                                     Thread 2
(a) store x
(b) store_sc y  ---- reads from, creating happens-before---> (c) load_sc y
                                                             (d) load x      // observes the first thread's store

(a) happens-before (b) happens-before (c) happens-before (d) in this example

However, if all the atomic_sc accesses in the above example are changed to non-atomics, this is not guaranteed (in the model or in real hardware).

tlively · 2023-05-17T19:50:49Z

IIUC, if there are no atomics forming that happens-before relationship across threads A and B, then a non-atomic store on thread A might in principle never be observed on thread B, even if it's sitting in a loop doing non-atomic loads of the same address.

conrad-watt · 2023-05-17T20:01:41Z

Yes, that's right.

EDIT: I originally wrote the below, but in light of the point about hoisting at the bottom, I don't think it's advisable to use this code 🙃

It can still be useful to (very carefully) do a loop of non-atomic loads in certain circumstances, so long as the read value is then "confirmed" using an atomic load or CAS. For example, the following isn't an unreasonable spin-lock implementation (assuming Wasm semantics for non-atomics and CAS - in C11 you need a relaxed atomic load instead of a non-atomic load to avoid UB).

while (true) {
  lock_status = non_atomic_load
  if (lock_status = held by someone else) {
    continue
  } else {
    try to acquire the lock with a CAS
    break if successful
  }
}

Technically the memory model says that it's never guaranteed that the non_atomic_load will ever observe another thread unlocking the lock, but if it does observe the unlocked state, the subsequent CAS makes everything synchronise, so the only gotcha is that the memory model allows the possibility of an infinite loop.

In practice the infinite loop never happens with any reasonable compilation scheme, and if you know the loop has terminated, then you're ok even from the theoretical memory model point of view, since the CAS synchronised everything up.

Actually, thinking things through, it might be permissible for a Wasm compiler to hoist the non_atomic_load out of the loop, and then you'd be screwed :P

dschuff · 2023-05-17T22:45:26Z

IIUC, if there are no atomics forming that happens-before relationship across threads A and B, then a non-atomic store on thread A might in principle never be observed on thread B, even if it's sitting in a loop doing non-atomic loads of the same address.

Yeah, this is basically what I was getting at. Non-atomics never make happens-before relationships with anything on other threads (other than indirectly via a program order relationship with an atomic on the same thread).

All that is to say, maybe we should add some cheaper atomics (acq/rel and maybe relaxed?) to wasm soon :)

conrad-watt · 2023-05-17T23:37:09Z

All that is to say, maybe we should add some cheaper atomics (acq/rel and maybe relaxed?) to wasm soon :)

My personal vision is acq/rel, and then a "load-store-ordered" atomic which is marginally stronger than relaxed but avoids a lot of the mess associated with it (probably implemented using the "bogus conditional branch" strategy benchmarked in sections 4.22 and 5.2 of this paper).

I have to say though, currently I'm pushing hard to get current threads phase 4 landed, so I'd rather not spend too much time thinking about extensions before this is finished.

wingo mentioned this issue Jul 7, 2023

Multibyte access to (array i8) WebAssembly/gc#395

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Does the wasm memory model have tear-free non-atomic loads? #199

Does the wasm memory model have tear-free non-atomic loads? #199

sunfishcode commented May 17, 2023

conrad-watt commented May 17, 2023 •

edited

dschuff commented May 17, 2023

conrad-watt commented May 17, 2023 •

edited

tlively commented May 17, 2023

conrad-watt commented May 17, 2023 •

edited

dschuff commented May 17, 2023 •

edited

conrad-watt commented May 17, 2023 •

edited

Does the wasm memory model have tear-free non-atomic loads? #199

Does the wasm memory model have tear-free non-atomic loads? #199

Comments

sunfishcode commented May 17, 2023

conrad-watt commented May 17, 2023 • edited

dschuff commented May 17, 2023

conrad-watt commented May 17, 2023 • edited

tlively commented May 17, 2023

conrad-watt commented May 17, 2023 • edited

dschuff commented May 17, 2023 • edited

conrad-watt commented May 17, 2023 • edited

conrad-watt commented May 17, 2023 •

edited

conrad-watt commented May 17, 2023 •

edited

conrad-watt commented May 17, 2023 •

edited

dschuff commented May 17, 2023 •

edited

conrad-watt commented May 17, 2023 •

edited