Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Questions about non-atomic instructions like i32.store #197

Open
yamt opened this issue Mar 22, 2023 · 8 comments
Open

Questions about non-atomic instructions like i32.store #197

yamt opened this issue Mar 22, 2023 · 8 comments

Comments

@yamt
Copy link

yamt commented Mar 22, 2023

when i32.store writes data to a shared memory, it's performed with an wr action. (https://webassembly.github.io/threads/core/exec/instructions.html#t-mathsf-xref-syntax-instructions-syntax-instr-memory-mathsf-store-n-xref-syntax-instructions-syntax-memarg-mathit-memarg)

those events are atomically performed according to https://webassembly.github.io/threads/core/exec/runtime.html#events.

thus, if a runtime implements atomic instructions like i32.atomic.rmw.cmpxchg via a lock, non-atomic instructions like i32.store should take the lock too, at least when operating on a shared memory.
is it the correct reading of the spec?

background:
some applications (eg. musl) implements a mutex with atomic cmpxchg for lock and ordinary store + barrier for unlock. as far as i know, it's fine for eg. x86. however, a naive porting to wasm might or might not cause problems.

@conrad-watt
Copy link
Collaborator

Sorry, the use of the word "atomic" in that link is probably misleading and I'll make a note to change it.

The intention is that, if the atomic operations (in the sense of X.atomic.XXX) are implemented using locks, non-atomic operations can still be implemented without using locks, so long as aligned accesses don't tear (which is true on most "regular" architectures, but may not be true in some special environments).

@yamt
Copy link
Author

yamt commented Mar 22, 2023

Sorry, the use of the word "atomic" in that link is probably misleading and I'll make a note to change it.

The intention is that, if the atomic operations (in the sense of X.atomic.XXX) are implemented using locks, non-atomic operations can still be implemented without using locks, so long as aligned accesses don't tear (which is true on most "regular" architectures, but may not be true in some special environments).

ok.
then, in wasm, the way to implement a mutex mentioned in "background" in the description is not expected to work?

@conrad-watt
Copy link
Collaborator

conrad-watt commented Mar 22, 2023

Are you referring to this one here? At a rough glance it seems to me that it's correct. It would be incorrect if any of the accesses involved in its implementation were non-atomic.

Note that, if atomics are implemented using locks, any wait and notify instructions would also need to acquire those locks.

@yamt
Copy link
Author

yamt commented Mar 22, 2023

Are you referring to this one here? At a rough glance it seems to me that it's correct. It would be incorrect if any of the accesses involved in its implementation were non-atomic.

no. i meant the one in the description of this issue:

background:
some applications (eg. musl) implements a mutex with atomic cmpxchg for lock and ordinary store + barrier for unlock. as far as i know, it's fine for eg. x86. however, a naive porting to wasm might or might not cause problems.

Note that, if atomics are implemented using locks, any wait and notify instructions would also need to acquire those locks.

sure.

@conrad-watt
Copy link
Collaborator

conrad-watt commented Mar 22, 2023

some applications (eg. musl) implements a mutex with atomic cmpxchg for lock and ordinary store + barrier for unlock. as far as i know, it's fine for eg. x86. however, a naive porting to wasm might or might not cause problems.

Ah, I'm sorry for looking in the wrong place. Yes, this wouldn't work in Wasm. Wasm non-atomics and fences should be thought of as more C/C++-style than x86-style, and it wouldn't be safe in C/C++ to implement a mutex unlock with non-atomic store + fence.

Just an additional quick note, since I've been reading the conversations on the other thread (WebAssembly/wasi-libc#403)

but i noticed that it's actually more about the cmpxchg implementation than i32.store.
when cmpxchg is implemented with a lock as it is in wamr interpreter, i32.store can be executed in the middle of cmpxchg and effectively break the lock as you say.

It is our intention to allow this behaviour in Wasm. If a non-atomic store races with an atomic rmw/cas, it's not guaranteed that the rmw/cas will observably "act" like an atomic swap (C/C++ analogy - this is a data race with undefined behaviour). The correct thing to do is perform an atomic store instead of a non-atomic store (as I think you already observed)

@yamt
Copy link
Author

yamt commented Mar 22, 2023

some applications (eg. musl) implements a mutex with atomic cmpxchg for lock and ordinary store + barrier for unlock. as far as i know, it's fine for eg. x86. however, a naive porting to wasm might or might not cause problems.

Ah, I'm sorry for looking in the wrong place. Yes, this wouldn't work in Wasm. Wasm non-atomics and fences should be thought of as more C/C++-style than x86-style, and it wouldn't be safe in C/C++ to implement a mutex unlock with non-atomic store + fence.

ok.

(a bit off-topic: while i'm not familiar with C/C++ style atomics, i suspect it compiles to x86 atomics on x86 and thus has basically compatible semantics, doesn't it?)

Just an additional quick note, since I've been reading the conversations on the other thread (WebAssembly/wasi-libc#403)

but i noticed that it's actually more about the cmpxchg implementation than i32.store.
when cmpxchg is implemented with a lock as it is in wamr interpreter, i32.store can be executed in the middle of cmpxchg and effectively break the lock as you say.

It is our intention to allow this behaviour in Wasm. If a non-atomic store races with an atomic rmw/cas, it's not guaranteed that the rmw/cas will observably "act" like an atomic swap (C/C++ analogy - this is a data race with undefined behaviour). The correct thing to do is perform an atomic store instead of a non-atomic store (as I think you already observed)

ok. i got the intention.
the current wording in the spec is very misleading as it's somehow clearly stating that the only differences between atomic and non-atomic ops are memory ordering and alignment check:
https://webassembly.github.io/threads/core/exec/instructions.html#exec-atomic-store

@conrad-watt
Copy link
Collaborator

(a bit off-topic: while i'm not familiar with C/C++ style atomics, i suspect it compiles to x86 atomics on x86 and thus has basically compatible semantics, doesn't it?)

C/C++ atomics (and Wasm atomics!) need to abstract over a bunch of different possible implementations, including ones using locks, hence the "weaker" guarantees at the spec level. If one knows that one's C/C++ is definitely compiling to x86, one can try to make additional assumptions based on this, but this is hazardous because of possible compiler optimisations etc.

the current wording in the spec is very misleading as it's somehow clearly stating that the only differences between atomic and non-atomic ops are memory ordering and alignment check:

Iterating on the spec is definitely on my immediate radar (and will be a requirement for us to get this proposal over the line into the W3C standard). Note though that the differences we've been discussing above are captured purely by the difference in memory ordering, so this part of the spec wouldn't change too much - the implications of different memory orders should be explained in more detail in this section, once it's finished.

Just to try and point out other resources, the EMCAScript memory model is essentially identical for the language fragment we've been talking about. I don't know how helpful it is to read though.

@yamt
Copy link
Author

yamt commented Mar 22, 2023

(a bit off-topic: while i'm not familiar with C/C++ style atomics, i suspect it compiles to x86 atomics on x86 and thus has basically compatible semantics, doesn't it?)

C/C++ atomics (and Wasm atomics!) need to abstract over a bunch of different possible implementations, including ones using locks, hence the "weaker" guarantees at the spec level. If one knows that one's C/C++ is definitely compiling to x86, one can try to make additional assumptions based on this, but this is hazardous because of possible compiler optimisations etc.

ok.

the current wording in the spec is very misleading as it's somehow clearly stating that the only differences between atomic and non-atomic ops are memory ordering and alignment check:

Iterating on the spec is definitely on my immediate radar (and will be a requirement for us to get this proposal over the line into the W3C standard). Note though that the differences we've been discussing above are captured purely by the difference in memory ordering, so this part of the spec wouldn't change too much - the implications of different memory orders should be explained in more detail in this section, once it's finished.

i usually consider that the interlock behavior is a separate topic from memory ordering. but ok.

Just to try and point out other resources, the EMCAScript memory model is essentially identical for the language fragment we've been talking about. I don't know how helpful it is to read though.

thank you for the link.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants