New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Get full precision for 32 bit floating point random values. #20314
Conversation
The formula to convert a 32 bit random integer to a random float32, (next_uint32(bitgen_state) >> 9) * (1.0f / 8388608.0f) shifts by one bit too many, resulting in uniform float32 samples always having a 0 in the least significant bit. The formula is corrected to (next_uint32(bitgen_state) >> 8) * (1.0f / 16777216.0f) Occurrences of the incorrect formula in numpy/random/tests/test_direct.py were also corrected. Closes numpygh-17478.
d754442
to
4b9e569
Compare
Change is safe. Whether it requires a release not only matters if you think the ls bit in a 32 bit float is worth one. Probably best to be safe. |
I added a release note about the change. |
Here's a script that demonstrates the changes in the variates that can occur, and shows why a release note is warranted. Script to print random samplesimport numpy as np
print(f"numpy version {np.__version__}")
print()
seed = 98765432109
print(f"seed: {seed}")
print()
print("rng.random")
rng = np.random.default_rng(seed)
x = rng.random(12, dtype=np.float32)
print("first 12 samples:")
print(x)
print()
n = 500_000_000
print("rng.standard_exponential")
rng = np.random.default_rng(seed)
x = rng.standard_exponential(size=n, dtype=np.float32)
x1 = x[-1]
x = rng.standard_exponential(size=n, dtype=np.float32)
x2 = x[-1]
print(f"last sample of {n:10d}:", x1)
print(f"last sample of {2*n:10d}:", x2)
print()
k = 2.5
print(f"rng.standard_gamma, k={k}")
rng = np.random.default_rng(seed)
x = rng.standard_gamma(2.5, size=n, dtype=np.float32)
print("first 14 samples:")
print(x[:14])
print(f"last sample of {n}:", x[-1]) The output for the current main development branch:
The output for this pull request:
You can see the small variation in the ULP of the output of The outputs for |
Thanks @WarrenWeckesser! As far as I see, this only affects the new API, since this is However, since the streams do change I am removing the backport candidate label. Please just re-add if you disagree! |
This seems to be nearly an enhancement, even though is also a bug. While the intent was clearly to provide the maximum number of random bits, the nature of the bug only resulted in slightly less random values in most plausible scenarios. |
The formula to convert a 32 bit random integer to a random float32, (((rng)->next_uint32((rng)->state) >> 9) * (1.0f / 8388608.0f)) shifts by one bit too many, resulting in uniform float32 samples always having a 0 in the least significant bit. The formula is corrected to (((rng)->next_uint32((rng)->state) >> 8) * (1.0f / 16777216.0f)) See numpy/numpy#20314 for more details.
The formula to convert a 32 bit random integer to a random float32, (((rng)->next_uint32((rng)->state) >> 9) * (1.0f / 8388608.0f)) shifts by one bit too many, resulting in uniform float32 samples always having a 0 in the least significant bit. The formula is corrected to (((rng)->next_uint32((rng)->state) >> 8) * (1.0f / 16777216.0f)) See numpy/numpy#20314 for more details.
The formula to convert a 32 bit random integer to a random float32,
shifts by one bit too many, resulting in uniform float32 samples always
having a 0 in the least significant bit. The formula is corrected to
Occurrences of the incorrect formula in numpy/random/tests/test_direct.py
were also corrected.
Closes gh-17478.