Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random library: add functions bits32, bits64, nativebits #10526

Merged
merged 1 commit into from
Jul 20, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
3 changes: 3 additions & 0 deletions Changes
Expand Up @@ -20,6 +20,9 @@ Working version
are now provided by the camlp-streams library.
(Xavier Leroy, review by Nicolás Ojeda Bär)

- #10526: add Random.bits32, Random.bits64, Random.nativebits
(Xavier Leroy, review by Gabriel Scherer and François Bobot)

### Other libraries:

- #10192: Add support for Unix domain sockets on Windows and use them
Expand Down
19 changes: 19 additions & 0 deletions stdlib/random.ml
Expand Up @@ -176,6 +176,22 @@ module State = struct

let bool s = (bits s land 1 = 0)

let bits32 s =
let b1 = Int32.(shift_right_logical (of_int (bits s)) 14) in (* 16 bits *)
let b2 = Int32.(shift_right_logical (of_int (bits s)) 14) in (* 16 bits *)
Int32.(logor b1 (shift_left b2 16))

let bits64 s =
let b1 = Int64.(shift_right_logical (of_int (bits s)) 9) in (* 21 bits *)
let b2 = Int64.(shift_right_logical (of_int (bits s)) 9) in (* 21 bits *)
let b3 = Int64.(shift_right_logical (of_int (bits s)) 8) in (* 22 bits *)
Int64.(logor b1 (logor (shift_left b2 21) (shift_left b3 42)))
Copy link
Member

@gasche gasche Jul 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity: other splits are possible (for example 30, 30, 4), and in particular using 30 bits could avoid some (probably neglectible) shift-right-logical operations. The current split uses as few "lower bits" from bits as possible. Is there a particular reason for this choice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a rumor that among the 30 random bits returned by bits (), the high bits are "more random" than the low bits. I don't have my TAOCP vol 2 with me to check. But that's why the top 21/22 bits are used here: if you draw three sets of 30 random bits, use the "more random" bits of each set.

Copy link
Contributor

@nojb nojb Jul 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, I looked it up:

image

In the book the analysis is done for a standard linear congruence generator, but the same argument applies to other generators using a power-of-two modulus, such as 2^30 in the Fibonacci generator used in Random.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The (curval lxor ((curval lsr 25) land 0x1F)) in Random seems to take some high-order bits to the low-order bits. Does that counter in part this problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The (curval lxor ((curval lsr 25) land 0x1F)) in Random seems to take some high-order bits to the low-order bits. Does that counter in part this problem?

Very likely. But that's @damiendoligez 's wizardry, so I'll let him comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also see the comment at top of stdlib/random.ml.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The (curval lxor ((curval lsr 25) land 0x1F)) in Random seems to take some high-order bits to the low-order bits. Does that counter in part this problem?

I'm not Knuth and I didn't try to do the theory, but experimenting with DieHarder seemed to show that this is indeed better. IIRC one or two of DieHarder tests did fail on sequences of the one or two low-order bits of the generator before this patch. Even with the patch, I approve of favoring the high-order bits.


Copy link
Contributor

@bobot bobot Jul 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to just shift each part to the right place once and xor everything? The entropy is not lost, same it seems for the uniformity, even if the bits overlaps. But it is perhaps like for floating point numbers "better to do the simple thing than the faster wrong thing"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same explanation as above (#10526 (comment)): I trust the high bits more than the low bits. Maybe it's just superstition.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add this bit of wisdom as comment? Thanks for the explanation!

let nativebits =
if Nativeint.size = 32
then fun s -> Nativeint.of_int32 (bits32 s)
else fun s -> Int64.to_nativeint (bits64 s)

end

(* This is the state you get with [init 27182818] and then applying
Expand Down Expand Up @@ -204,6 +220,9 @@ let nativeint bound = State.nativeint default bound
let int64 bound = State.int64 default bound
let float scale = State.float default scale
let bool () = State.bool default
let bits32 () = State.bits32 default
let bits64 () = State.bits64 default
let nativebits () = State.nativebits default

let full_init seed = State.full_init default seed
let init seed = State.full_init default [| seed |]
Expand Down
18 changes: 18 additions & 0 deletions stdlib/random.mli
Expand Up @@ -74,6 +74,21 @@ val float : float -> float
val bool : unit -> bool
(** [Random.bool ()] returns [true] or [false] with probability 0.5 each. *)

val bits32 : unit -> Int32.t
(** [Random.bits32 ()] returns 32 random bits as an integer between
{!Int32.min_int} and {!Int32.max_int}.
@since 4.14.0 *)

val bits64 : unit -> Int64.t
(** [Random.bits64 ()] returns 64 random bits as an integer between
{!Int64.min_int} and {!Int64.max_int}.
@since 4.14.0 *)

val nativebits : unit -> Nativeint.t
(** [Random.nativebits ()] returns 32 or 64 random bits (depending on
the bit width of the platform) as an integer between
{!Nativeint.min_int} and {!Nativeint.max_int}.
@since 4.14.0 *)

(** {1 Advanced functions} *)

Expand Down Expand Up @@ -106,6 +121,9 @@ module State : sig
val int64 : t -> Int64.t -> Int64.t
val float : t -> float -> float
val bool : t -> bool
val bits32 : t -> Int32.t
val bits64 : t -> Int64.t
val nativebits : t -> Nativeint.t
(** These functions are the same as the basic functions, except that they
use (and update) the given PRNG state instead of the default one.
*)
Expand Down
10 changes: 10 additions & 0 deletions testsuite/tests/lib-random/chi2.ml
Expand Up @@ -48,13 +48,23 @@ let _ =
(fun () -> int_of_float (Random.float 1.0 *. 256.0));
test "Random.float 1.0 (next 8 bits)"
(fun () -> int_of_float (Random.float 1.0 *. 65536.0));
test "Random.bits32 (bits 0-7)"
(fun () -> Int32.to_int (Random.bits32()));
test "Random.bits32 (bits 20-27)"
(fun () -> Int32.(to_int (shift_right (Random.bits32()) 20)));
test "Random.int32 2^30 (bits 0-7)"
(fun () -> Int32.to_int (Random.int32 0x40000000l));
test "Random.int32 2^30 (bits 20-27)"
(fun () -> Int32.(to_int (shift_right (Random.int32 0x40000000l) 20)));
test "Random.int32 (256 * p) / p"
(let p = 7048673l in
fun () -> Int32.(to_int (div (Random.int32 (mul 256l p)) p)));
test "Random.bits64 (bits 0-7)"
(fun () -> Int64.to_int (Random.bits64()));
test "Random.bits64 (bits 30-37)"
(fun () -> Int64.(to_int (shift_right (Random.bits64()) 30)));
test "Random.bits64 (bits 52-59)"
(fun () -> Int64.(to_int (shift_right (Random.bits64()) 52)));
test "Random.int64 2^60 (bits 0-7)"
(fun () -> Int64.to_int (Random.int64 0x1000000000000000L));
test "Random.int64 2^60 (bits 30-37)"
Expand Down