Random library: add functions bits32, bits64, nativebits #10526
@@ -176,6 +176,22 @@ module State = struct

  let bool s = (bits s land 1 = 0)

  let bits32 s =
    let b1 = Int32.(shift_right_logical (of_int (bits s)) 14) in (* 16 bits *)
    let b2 = Int32.(shift_right_logical (of_int (bits s)) 14) in (* 16 bits *)
    Int32.(logor b1 (shift_left b2 16))

  let bits64 s =
    let b1 = Int64.(shift_right_logical (of_int (bits s)) 9) in (* 21 bits *)
    let b2 = Int64.(shift_right_logical (of_int (bits s)) 9) in (* 21 bits *)
    let b3 = Int64.(shift_right_logical (of_int (bits s)) 8) in (* 22 bits *)
    Int64.(logor b1 (logor (shift_left b2 21) (shift_left b3 42)))

Is it possible to just shift each part to the right place once and xor everything? The entropy is not lost, same it seems for the uniformity, even if the

Same explanation as above (#10526 (comment)): I trust the high bits more than the low bits. Maybe it's just superstition.

Could you add this bit of wisdom as a comment? Thanks for the explanation!

  let nativebits =
    if Nativeint.size = 32
    then fun s -> Nativeint.of_int32 (bits32 s)
    else fun s -> Int64.to_nativeint (bits64 s)

end

(* This is the state you get with [init 27182818] and then applying

@@ -204,6 +220,9 @@ let nativeint bound = State.nativeint default bound
let int64 bound = State.int64 default bound
let float scale = State.float default scale
let bool () = State.bool default
let bits32 () = State.bits32 default
let bits64 () = State.bits64 default
let nativebits () = State.nativebits default

let full_init seed = State.full_init default seed
let init seed = State.full_init default [| seed |]
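
For orientation, here is a minimal usage sketch of the three top-level functions added in the second hunk. It assumes an OCaml release that includes this change; the helper name `sample` and the seed are made up for the example, and the printed values depend on the generator in use, so none are shown.

```ocaml
(* Draw one value from each new function after seeding the default state.
   [sample] is a hypothetical helper, not part of the patch. *)
let sample seed =
  Random.init seed;
  let a : int32 = Random.bits32 () in
  let b : int64 = Random.bits64 () in
  let c : nativeint = Random.nativebits () in
  (a, b, c)

let () =
  let (a, b, c) = sample 42 in
  (* %lx, %Lx and %nx print int32, int64 and nativeint in hexadecimal. *)
  Printf.printf "bits32 = %lx, bits64 = %Lx, nativebits = %nx\n" a b c
```

Re-seeding with the same value reproduces the same triple, which is convenient for tests.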

Out of curiosity: other splits are possible (for example 30, 30, 4), and in particular using 30 bits could avoid some (probably negligible) shift-right-logical operations. The current split uses as few "lower bits" from `bits` as possible. Is there a particular reason for this choice?

There's a rumor that among the 30 random bits returned by `bits ()`, the high bits are "more random" than the low bits. I don't have my TAOCP vol 2 with me to check. But that's why the top 21/22 bits are used here: if you draw three sets of 30 random bits, use the "more random" bits of each set.
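
To make the two splits concrete, here is a rough sketch of both. `compose_top` repeats the 21/21/22 layout from the diff; `compose_30_30_4` is one possible reading of the 30/30/4 idea (which 4 bits of the third draw to keep is an assumption; the top 4 are used here). Both take a `draw` function standing in for `bits s`, which is an illustration device and not part of the patch.

```ocaml
(* [draw] stands in for [bits s]: it must return a non-negative 30-bit int. *)

(* Layout used in the diff: the top 21, 21 and 22 bits of three draws. *)
let compose_top (draw : unit -> int) : int64 =
  let b1 = Int64.(shift_right_logical (of_int (draw ())) 9) in  (* 21 bits *)
  let b2 = Int64.(shift_right_logical (of_int (draw ())) 9) in  (* 21 bits *)
  let b3 = Int64.(shift_right_logical (of_int (draw ())) 8) in  (* 22 bits *)
  Int64.(logor b1 (logor (shift_left b2 21) (shift_left b3 42)))

(* Hypothetical 30/30/4 split: two draws used whole, including their low
   bits, plus the top 4 bits of a third draw. *)
let compose_30_30_4 (draw : unit -> int) : int64 =
  let b1 = Int64.of_int (draw ()) in            (* 30 bits *)
  let b2 = Int64.of_int (draw ()) in            (* 30 bits *)
  let b3 = Int64.of_int (draw () lsr 26) in     (* top 4 bits *)
  Int64.(logor b1 (logor (shift_left b2 30) (shift_left b3 60)))
```

The second version saves two shifts but lets the low-order bits of two draws through unchanged, which is exactly what the current split avoids.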

Out of curiosity, I looked it up:

In the book the analysis is done for a standard linear congruential generator, but the same argument applies to other generators using a power-of-two modulus, such as `2^30` in the Fibonacci generator used in `Random`.
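
The effect is easy to see on a toy linear congruential generator. The sketch below is not the generator in `Random`; the multiplier `69069` and increment `12345` are arbitrary illustrative constants. With a modulus of `2^30` and an odd multiplier, the low `k` bits form a generator modulo `2^k` on their own, so they repeat after at most `2^k` steps.

```ocaml
(* A toy LCG modulo 2^30. The constants are illustrative only and are
   not taken from stdlib/random.ml. *)
let mask30 = 0x3FFFFFFF
let lcg_next x = (69069 * x + 12345) land mask30

(* [lcg_seq seed n] returns the first [n] outputs following [seed]. *)
let lcg_seq seed n =
  let rec go x i acc =
    if i = n then List.rev acc
    else
      let x' = lcg_next x in
      go x' (i + 1) (x' :: acc)
  in
  go seed 0 []

(* Check that the low [k] bits repeat with a period dividing 2^k over the
   first [n] outputs. *)
let low_bits_periodic seed k n =
  let period = 1 lsl k in
  let xs = Array.of_list (lcg_seq seed (n + period)) in
  let m = period - 1 in
  let ok = ref true in
  for i = 0 to n - 1 do
    if xs.(i) land m <> xs.(i + period) land m then ok := false
  done;
  !ok
```

For example, `List.map (fun x -> x land 1) (lcg_seq 1 6)` is `[0; 1; 0; 1; 0; 1]`: the lowest bit simply alternates, while the high bits go through the full period.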

The `(curval lxor ((curval lsr 25) land 0x1F))` in `Random` seems to fold some high-order bits into the low-order bits. Does that partly counter this problem?
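
In isolation, the quoted expression behaves as follows. This sketch covers only that expression, wrapped in a function whose name `fold_high_into_low` is made up here; the rest of the `bits` function is not reproduced.

```ocaml
(* XOR the top 5 bits (bits 25..29) of a 30-bit value into its low 5 bits.
   Bits 5..29 are left unchanged. *)
let fold_high_into_low curval =
  curval lxor ((curval lsr 25) land 0x1F)

let () =
  (* Top 5 bits set, low bits clear: the low 5 bits become set. *)
  assert (fold_high_into_low 0x3E000000 = 0x3E00001F);
  (* Bit 25 set and bit 0 set: bit 0 is flipped back to 0. *)
  assert (fold_high_into_low 0x02000001 = 0x02000000);
  (* No high bits set: the value is unchanged. *)
  assert (fold_high_into_low 1 = 1)
```

So each of the five lowest bits of the result depends on one of the five highest bits of the input.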

Very likely. But that's @damiendoligez's wizardry, so I'll let him comment.

Also see the comment at the top of `stdlib/random.ml`.

I'm not Knuth and I didn't try to do the theory, but experimenting with DieHarder seemed to show that this is indeed better. IIRC one or two of the DieHarder tests did fail on sequences of the one or two low-order bits of the generator before this patch. Even with the patch, I approve of favoring the high-order bits.
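
For anyone who wants a quick look at the low-order bit themselves, here is a very crude sketch. It only counts how often the lowest bit is set, which is far weaker than the DieHarder battery and is not how the experiments above were run; the helper name `count_low_ones`, the seed and the sample size are all made up for the example.

```ocaml
(* Count how often the lowest bit is set over [n] draws from [draw].
   [draw] is a stand-in for any 30-bit source such as [Random.bits]. *)
let count_low_ones (draw : unit -> int) (n : int) : int =
  let ones = ref 0 in
  for _i = 1 to n do
    if draw () land 1 = 1 then incr ones
  done;
  !ones

let () =
  Random.init 2024;
  let n = 100_000 in
  Printf.printf "low bit set in %d of %d draws\n"
    (count_low_ones Random.bits n) n
```

A count far from `n / 2` would be a red flag, but a count close to it says little on its own; a pattern as regular as strict alternation passes this check.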