-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make global state domain-local in Random, Hashtbl and Filename #582
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR fixes the Stdlib issues which are pressing.
Down the line we want to have a uniform efficient way for user level DLS variables vs the special fast locations for Stdlib variables introduced in this PR. We should probably file an issue for that after we get this one done.
I had a one place where we should avoid causing an allocation in DLS.get
and minor naming/documentation nits.
(I ignored the diff coming from #579).
This enables constant-time access to any field in the domain-local state. Hence, there is no longer a need to specialise the domain-local state for local states in Stdlib. This simplifies the code massively compared to the previous design.
I've significantly simplified the PR thanks to making the domain-local state an array allowing constant-time access. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have reviewed the new runtime code and the use of Random
in other parts of the stdlib (Filename.temp_file_name
and randomized hashtables), and I believe that those are correct.
However, I am not sure about the behavior we get for the Random
module itself.
For users that want self-initialization, Domain.DLS.new_key Random.State.make_self_init
is a good approach (if a bit low-level), it gives the expected behavior of an independent self-initialized seed for all domains. They can also use Random.int
and other global-state functions to the same effect, as long as they remember to call Random.self_init
at the creation of each domain. (Are they going to remember this? I'm a bit worried.)
However, for users that want to initialize their generator with a fixed seed, I don't think that the current state is very satisfying. If they don't do anything, all domains are going to be initialized with the same seed, so all domains are going to draw the same random sequence. This is pretty bad, creating unwanted correlations in the program.
What they probably want is to either:
- Set the seed of new domains by consuming random state from the initial domain (to spawn 9 new domains, draw 9 random integers in the initial domain and set the seed of each new domain by the corresponding random integer using
Random.init
). This requires non-trivial logic, I don't think that most Random users would be careful enough to do this. - Or call
Random.full_init : int array -> unit
with a seed of their choice and the domain identifier. This is simple, but I am not sure it is a good idea: if we stop a domain and start another one right after, are we going to get a "fresh" domain identifier, or the same identifier as some previous domain? (The latter would be my guess from skimming the runtime code.)
I think that a multi-domain program and a global-state random generator are fundamentally incompatible and we are never going to get a stdlib design that feels nice anyway. (In particular, we can't hope that a fixed seed will make random-generation deterministic, as long as random-using program fragments are scheduled non-deterministically on the domains.) Trying to do magic things for user convenience is not a good idea, the model must be simple so that users can easily reason about the sort of randomness they get. But I think it's worth giving some thoughts to what the basic needs are, and how to meet them correctly; they should be documented, and, ideally, on the path of least effort.
For me the basic needs are as follows:
- The user wants self-initialization for a multi-domain program. Currently this requires specific thinking when spawning new domains, but predictable and easy to achieve. This should be documented carefully in the Random documentation. I considered the idea of remembering that
self_init
has been called on the main domain and doing it implicitly on other domains, but that would be too magic. - The user wants fixed-seed initialization for the program, but different domains should draw different values. I don't know what is a simple and correct approach to do this currently.
Remark: if we had a random generator supporting splitting (for example @xavierleroy's pringo), I would expect spawning a new domain to split the random generator state between the spawning domain and the spawned domain. Notice that this does "exactly the right thing" in both the self-initialized and fixed-seed scenarios. But this requires a much tighter coupling between the Random module and the runtime, Random cannot be explained anymore as a user-side library with global state.
@@ -599,6 +599,7 @@ CAMLprim value caml_sys_random_seed (value unit) | |||
data[n++] = getpid(); | |||
data[n++] = getppid(); | |||
#endif | |||
data[n++] = (intnat)Caml_state->unique_token_root; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation is caml_sys_random_seed
is not very nice (I think); in the current state you also need to add this logic to caml_win32_random_seed
in runtime/win32.c
.
Another approach would be to add your write outside the #ifdef WIN32 ... #endif
block, written as if (n < 16) data[n++] ...
. Then it would happen on both systems, which reduces duplication.
Reading this code made we want to refactor it, so maybe I'll send a PR upstream directly -- without your change.
@gasche Thanks for the detailed reviews. I have a feeling that the question of whether In order to make progress, may I suggest splitting the question of |
Yes, I think the split is very reasonable.
I strategically pinged @xavierleroy above, because I think he is the right person to ask about this. |
Great. I've updated |
Does this suggest a missing feature of domain local state? Perhaps when a new domain is created it should be able to create its new domain local states from the values of the same states in its parent domain? You could probably implement this fairly efficiently by remembering the tree structure on the domains and copying the state array of the parents when they create their children. |
I'm catching up on the discussion. Generally speaking, the usual approaches to pseudo-random number generation in multithreaded programs are
The latter gives you reproducibility (from a fixed initial state) if all the threads are created before they start generating random numbers. However, this reproducibility is lost if the state is attached to a domain and lightweight threads are then interleaved in this domain. The former sounds expensive in terms of synchronization, but with SplitMix for example you only need to update one 64-bit integer, which you could do with a CAS. @gasche already pointed to my PRINGO library (https://github.com/xavierleroy/pringo). It's a preliminary investigation of splittable PRNGs that could one day replace OCaml's Random module. |
Thanks @xavierleroy. I've opened a separate issue to keep track of multi-thread compatible PRNGs #585. |
Sounds reasonable. I've made an issue to track this: #587. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The update to this PR removes my concern about 'stdlib specific slots' and gives us O(1) lookups for user keys, which is much nicer.
I had some minor nits: one place where we weren't guarding exceptions and I think we should retain the existing random behaviour in two of the testsuite changes.
I think we should document what users should expect when using Random
with domains. For example, I think the following short comment in Random.mli
might be enough:
- By default the
Random
seed is static and identical across all domains. This means that callingRandom.int N
repeatedly in two domains will give the same sequence of numbers in each domain. - If you wish each domain to have uncorrelated sequences of random numbers then one way to achieve that is to call
Random.self_init ()
on each domain before requesting random numbers.
We have #587 for parent-child DLS and #585 for splittable PRNG but I'm keen in the meantime we document what is there.
Thanks @ctk21. I believe I have addressed all of your comments. |
Yep. CI failure is unrelated and I am working on a fix for it. |
@@ -586,7 +586,7 @@ CAMLprim value caml_sys_random_seed (value unit) | |||
/* If the read from /dev/urandom fully succeeded, we now have 96 bits | |||
of good random data and can stop here. Otherwise, complement | |||
whatever we got (probably nothing) with some not-very-random data. */ | |||
if (n < 12) { | |||
if (n < 11) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: after thinking harder about this, I think that changing n < 12
to n < 11
is not the best way to do this change. (If we get exactly 11 bytes from /dev/urandom, we do want to add some non-random seeds; currently the after-PR implementation will lack a bit of entropy in this weird corner case that will probably not occur in practice.)
I proposed a refactoring of the upstream function in ocaml/ocaml#10472 that would have made it easier to correctly extend the function.
…ticore/kayceesrk/stdlib_dls Make global state domain-local in Random, Hashtbl and Filename
…ticore/kayceesrk/stdlib_dls Make global state domain-local in Random, Hashtbl and Filename
Fixes issues #564, #565, #566.
Depends on PR #579.