-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix data race on Windows file descriptors #11304
Conversation
This defeats the purpose of caching the result of
Indeed, it might be safer and easier to put a But you can also get it right with only atomic instructions + some busy-waiting: the thread that needs to call |
34e2061
to
3ab0489
Compare
I propose such a busy-waiting implementation in the updated version. It wasn't clear to me how to implement it with a Again, I haven't been able to test this, not having a Windows system at the moment. |
I took the liberty to push directly to your branch:
I tested the code on the Mingw-64 system used for the Jenkins CI. It builds and passes the test suite. Code reviews welcome, as always. |
0e605aa
to
0f78a2c
Compare
Thank you for this. I rebased the PR on trunk after the recent merge of #10926 and ensured that our functions follow the new naming policy. Your improvements and tests look good to me. |
@dra27 @damiendoligez @sadiqj @ctk21 : could we get a quick review for this code? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay - partly because of faffing around with other things, but mainly because there was something I didn't like about this and I hadn't been able to put my finger on it. I now have - I think this would be fine, but I wonder if we can avoid the introduction of a new function?
Any uses of CRT_fd_val
as an rvalue in user code I believe start to get warnings about _Atomic
- or, if not, should we actually remove the macro, on the basis that it's no longer a safe way to access that field?
Entire review is a suggestion only - the PR could be merged as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @dra27 that it's weird to have functions and the macros. I take the opposite view on the solution though, it might be nicer to have functions to retrieve/set which deal with the atomics and eliminate the macro entirely? At the moment the atomics are strewn across every use and may be forgotten in future uses of CRT_fd_val
.
otherlibs/unix/unixsupport_win32.c
Outdated
@@ -55,7 +55,8 @@ value caml_win32_alloc_handle(HANDLE h) | |||
caml_alloc_custom(&handle_ops, sizeof(struct filedescr), 0, 1); | |||
Handle_val(res) = h; | |||
Descr_kind_val(res) = KIND_HANDLE; | |||
CRT_fd_val(res) = NO_CRT_FD; | |||
atomic_store_explicit(&CRT_fd_val(res), (intnat)NO_CRT_FD, | |||
memory_order_relaxed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This applies to all the relaxed stores in this change)
What's the rationale behind relaxed atomics? I see at least here we're doing some non-atomic stores before which makes me wonder.
I was mistaken, I thought that for initializing writes we still needed relaxed atomics, but it appears that plain stores are fine. Regarding macros vs functions: I tend to like more the idea of removing the |
Like @OlivierNicole, I'd rather unexport the Re "non-atomic" stores, you mean syntactic assignments Re relaxed atomics, I think what we need here is sequential consistency for each |
See discussion in ocaml#11304.
I unexported the Regarding relaxed atomics, in my understanding since these are initializing writes, they will go through a fence before another thread has a chance to access the custom block, so there should be no difference between them and plain stores. |
I wasn't aware of this. In that case, I think I should revert to using |
Do you really think the performance difference matters? If not, let's keep the code as simple as possible, i.e. with SC atomics everywhere. If so, I think |
On Windows, the function `win_CRT_fd_of_filedescr` is called by several functions of the Unix module, to wit: - `Unix.in_channel_of_descr` - `Unix.out_channel_of_descr` - `Unix.fsync` - `Unix.dup2` - `Unix.isatty` This function can cause races on the `crt_fd` field of the `filedescr` structure. This removes the data race itself. However, it does not guarantee that the Windows API function `_open_osfhandle` will not be called several times on the same handle. In addition, concurrent calls to `win_CRT_fd_of_filedescr` and `Unix.close` can still lead to fd leaks.
See discussion in ocaml#11304.
For compatibility with 3rd-party code.
33c0496
to
6ad7161
Compare
Rebased to fix the conflict and restart CI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good to me in its current form. It's likely that the SC atomics are too strong, but do not really hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to go
Thanks @OlivierNicole |
Fix data race on Windows file descriptors (cherry picked from commit 663e8d2)
Cherry-picked to 5.0 along with a commit fixing the mingw-w64 build. |
On Windows, the function
win_CRT_fd_of_filedescr
is called by several functions of the Unix module, to wit:Unix.in_channel_of_descr
Unix.out_channel_of_descr
Unix.fsync
Unix.dup2
Unix.isatty
This function can cause races on the
crt_fd
field of thefiledescr
structure. This PR removes the data race itself by making the field atomic.Currently not equipped with a Windows machine, I have not been able to test these changes.
Not fixed by this PR
This PR does not change the fact that concurrent calls to to this function can call the Windows API function
_open_osfhandle
several times on the same handle (as doing so would require some form of locking). Microsoft's documentation is unclear about the possible consequences of this (no consequences, file descriptor leaks, or error).In addition, concurrent calls to
win_CRT_fd_of_filedescr
andUnix.close
can still lead to fd leaks.