DOC: (Undefined) behavior of float to integer casts only really explained in release notes #22852

FirefoxMetzger · 2022-12-21T10:19:52Z

Describe the issue:

When running the example test case below the assertion works fine on Windows, but fails on Linux with the message below.

This is a cross-post from: https://stackoverflow.com/questions/74872407/why-does-numpy-handle-overflows-inconsistently

Related PRs: #21437 , #21123

Reproduce the code example:

import numpy as np

print(f"Numpy Version: {np.__version__}")

# version A
versionA = np.eye(2)
versionA[0, 0] = 2**32 + 1
versionA = versionA.astype(np.uint32)

# version B
versionB = np.eye(2, dtype=np.uint32)
versionB[0, 0] = np.asarray(2**32 + 1)

np.testing.assert_array_equal(versionA, versionB)

Error message:

Numpy Version: 1.24.0
foo.py:8: RuntimeWarning: invalid value encountered in cast
  versionA = versionA.astype(np.uint32)
Traceback (most recent call last):
  File "foo.py", line 14, in <module>
    np.testing.assert_array_equal(versionA, versionB)
  File "/home/firefoxmetzger/.local/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 983, in assert_array_equal
    assert_array_compare(operator.__eq__, x, y, err_msg=err_msg,
  File "/usr/lib/python3.8/contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "/home/firefoxmetzger/.local/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 862, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not equal

Mismatched elements: 1 / 4 (25%)
Max absolute difference: 1
Max relative difference: 1.
 x: array([[0, 0],
       [0, 1]], dtype=uint32)
 y: array([[1, 0],
       [0, 1]], dtype=uint32)

NumPy/Python version information:

1.24.0 3.8.10 (default, Jun 2 2021, 10:49:15)
[GCC 9.4.0]

Context for the issue:

I've tried to reduce the example further, but any further modifications change the behavior to make it work as expected. For example, changing from np.eye(2) to np.empty((1, 1)) or np.eye(1) changes the behavior on linux and the test passes.

The text was updated successfully, but these errors were encountered:

seberg · 2022-12-21T11:39:05Z

C defines this as undefined behavior. NumPy has never guaranteed any specific overflow behavior here. As C is undefined, CPUs are pretty random about what they do (plus compilers). For example for uint32 non-vectorized the compiler may spit out instructions casting to int64 (rather than uint32 directly) which then is "well defined".

The linked PRs, one adds the warning which tells you are getting undefined behavior (but in one case it is implicitly well defined after all, so no warning). The other PR would just make the situation even worse probably, by using more vectorization leading to hitting more UB.

I am not really convinced that we should worry about doing better than C compilers here. Is there a particular use-case?

FirefoxMetzger · 2022-12-22T11:02:29Z

My scenario is that I am writing a test suite for some linear algebra routines with the signature fn(ArrayLike, *, dtype=None).

This test suite uses hypothesis to generate input arrays (dtype=float) and a dtype for the output. I use this to test the function and then compare the output against a (easy, but slow) reference implementation. Currently this fails because it encounters the inconsistent behavior reported here. I can work around this by only generating arrays with a compatible datatype though (which users likely won't but we can add documentation to mitigate).

A nice addition to numpy regarding this would be a place in the docs where I could discover that this platform-specific behavior exists. The warning on linux in v1.24 already goes a long way, but maybe it could also show up on Windows (the behavior is still undefined, even if the actual behavior turns out to be consistent). Perhaps we could also provide some more detail on this in the docs and add a link to further information into the warning?

FirefoxMetzger · 2022-12-22T12:18:47Z

I also just found a smaller example that reproduces this behavior:

>>> import numpy as np
>>> foo = 2 ** 32 + 1
>>> bar = np.asarray(foo, dtype=float)
>>> np.asarray((foo, foo), dtype="i1")
<stdin>:1: DeprecationWarning: NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays.  The conversion of 4294967297 to int8 will fail in the future.
For the old behavior, usually:
    np.array(value).astype(dtype)`
will give the desired result (the cast overflows).
array([1, 1], dtype=int8)
>>> np.asarray((bar, bar), dtype="i1")
<stdin>:1: RuntimeWarning: invalid value encountered in cast
array([0, 0], dtype=int8)

Interestingly, this appears to be sensitive to the value of foo, i.e., a smaller overflow is handled consistently on the machine I am using right now (ChromeOS with an Intel Core m3):

>>> foo = 256 + 1
>>> bar = np.asarray(foo, dtype=float)
>>> np.asarray((foo, foo), dtype="i1")
<stdin>:1: DeprecationWarning: NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays.  The conversion of 257 to int8 will fail in the future.
For the old behavior, usually:
    np.array(value).astype(dtype)`
will give the desired result (the cast overflows).
array([1, 1], dtype=int8)
>>> np.asarray((bar, bar), dtype="i1")
array([1, 1], dtype=int8)

seberg · 2022-12-22T15:43:37Z

Right, the docs could probably mention this somewhere (not quite sure where). Runtime-wise, I doubt there is an easy win available, the new warning is probably about as good as it gets. (Not that I would mind making it more well defined if it isn't a big performance issue, and that may well be possible.)

FirefoxMetzger · 2022-12-22T18:17:30Z

Hm, the first place I thought off was inside the docs for astype. Seems like the main place where you might bump into this issue.

FirefoxMetzger added the 00 - Bug label Dec 21, 2022

seberg removed the 00 - Bug label Dec 21, 2022

seberg added the 04 - Documentation label Dec 22, 2022

seberg changed the title ~~BUG: Inconsistend overflow behavior when casting float64 to uint32 on Linux~~ DOC: (Undefined) behavior of float to integer casts only really explained in release notes Dec 22, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DOC: (Undefined) behavior of float to integer casts only really explained in release notes #22852

DOC: (Undefined) behavior of float to integer casts only really explained in release notes #22852

FirefoxMetzger commented Dec 21, 2022

seberg commented Dec 21, 2022

FirefoxMetzger commented Dec 22, 2022

FirefoxMetzger commented Dec 22, 2022

seberg commented Dec 22, 2022

FirefoxMetzger commented Dec 22, 2022

DOC: (Undefined) behavior of float to integer casts only really explained in release notes #22852

DOC: (Undefined) behavior of float to integer casts only really explained in release notes #22852

Comments

FirefoxMetzger commented Dec 21, 2022

Describe the issue:

Reproduce the code example:

Error message:

NumPy/Python version information:

Context for the issue:

seberg commented Dec 21, 2022

FirefoxMetzger commented Dec 22, 2022

FirefoxMetzger commented Dec 22, 2022

seberg commented Dec 22, 2022

FirefoxMetzger commented Dec 22, 2022