BUG: Conversion of numpy.nan to int gives inconsistent results #21166

Slagt · 2022-03-07T19:22:17Z

Describe the issue:

When converting a numpy array containing numpy.nan to type int, the numpy.nan are replaced by either -9223372036854775808 or 0 depending on the computer.

numpy.nan is replaced by -9223372036854775808 on a Mac Pro (2019).
numpy.nan is replaced by 0 on a MacBook Air (M1, 2020).
Computers have the same version of macOS (12.2.1), Python (3.9.10) and numpy (1.22.2).

Expected output

Both computers should behave the same. Either throw an error, or return the same value.

Reproduce the code example:

import numpy
print(numpy.array(numpy.nan).astype(int))

Error message:

No response

NumPy/Python version information:

1.22.2 3.9.10 (main, Jan 15 2022, 11:40:53)
[Clang 13.0.0 (clang-1300.0.29.3)]

zephyr111 · 2022-03-07T20:26:30Z

Hello,

Thank you for reporting this issue.

I have the same problem (-9223372036854775808 without error) on Linux using Numpy 1.20.3 and CPython 3.9.9 (GCC 11.2.0). I can also reproduce the same behavior with Numpy at the commit 08248aa (25 february) compiled with GCC 11.2.0-13.

The main NumPy casting function defined here does not check for special floating-point values such as NaN, -Inf, and +Inf (nor for out-of-bounds values). This is because the function performs a plain double-to-int cast, which is undefined behavior in C for such values. For more information about this behavior in the C standard, please read this.

This problem is closely related to the undefined behavior found in #21123.

While fixing this is relatively straightforward, the actual question is: what is the expected behavior in this case in Numpy?
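
For illustration, a caller-side guard that rejects such values before the cast could look like the sketch below. The helper name checked_astype_int is hypothetical and not part of NumPy, and raising ValueError is only one possible answer to the question above.

```python
import numpy as np


def checked_astype_int(values, dtype=np.int64):
    """Cast floats to an integer dtype, raising instead of relying on an undefined cast.

    Hypothetical helper for illustration; not part of NumPy.
    """
    arr = np.asarray(values, dtype=np.float64)
    info = np.iinfo(dtype)

    # NaN, +Inf and -Inf have no integer representation.
    if not np.all(np.isfinite(arr)):
        raise ValueError("cannot convert NaN or infinity to an integer")

    # Casting truncates toward zero, so range-check the truncated values.
    truncated = np.trunc(arr)
    lower = float(info.min)  # 0 or a negative power of two, exact in float64
    sign_bits = 1 if info.min < 0 else 0
    # One past the largest integer, written as a power of two so it is exact in
    # float64 (float(info.max) would round up for 64-bit types).
    upper_exclusive = 2.0 ** (info.bits - sign_bits)
    if np.any(truncated < lower) or np.any(truncated >= upper_exclusive):
        raise ValueError(f"value out of range for {np.dtype(dtype).name}")

    return truncated.astype(dtype)
```

Calling checked_astype_int([numpy.nan]) raises ValueError on every platform, at the cost of extra passes over the data, which is the speed trade-off such a check brings.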

Slagt · 2022-03-07T20:48:24Z

Personally, I would expect an error to be thrown. It does not really make sense to convert an entity to an integer if the entity is "not a number".

seberg · 2022-03-07T20:58:23Z

Indeed, NumPy is bad about producing floating point warnings for these kinds of casts; it basically never even tried. Note that CPUs/compilers should not be so lazy: this should be a NumPy issue, although I would not be surprised if some compilers misbehave.

I.e. it should be fairly straightforward to ensure that a warning is reliably given here.

About the general undefined behaviour: It is observed fairly regularly. It would seem great to define a "correct" result, like 0 or the minimum value. But probably only if it has no major impact on speed. Unfortunately, that seems unlikely.

I am optimistically marking this as a "project", but only for the part about checking floating point warnings for casts. (There should be duplicate issues open, so we may end up consolidating and closing this one though.)

EDIT: If anyone wants to dive into this: we need to copy some of the logic that ufuncs use (in umath/ufunc_object.c) for floating point handling to the cast functions.

Prakhar-mehta20 · 2022-03-08T10:33:27Z

I am getting the same error.
Running it on Kali Linux with NumPy version 1.21.5.

landonrodgers · 2022-03-23T18:46:59Z

I'm interested in working on this issue. I'll start looking into it right now.

seberg · 2022-04-01T18:14:55Z

Removing the "Project" label. I started working on this (I really need at least the part that np.array([1., 2.], dtype=np.float32) + 1e300 should give an overflow warning in the future, where the result would be float32 and not the current float64).
And I had to realize that this is threaded through quite a few more helpers and places (e.g. indexing), so it probably would have been a very hard project. But mainly, I need to make progress on this soon.

adeak · 2022-04-19T18:40:43Z

The results also seem to depend on the target type, see this question on Stack Overflow, which is probably also related:

import numpy as np

npnan = np.float64(np.nan)
print(npnan.astype('int8'))   # Output: 0
print(npnan.astype('int16'))  # Output: 0
print(npnan.astype('int32'))  # Output:          -2147483648 (i.e. -2 ** 31)
print(npnan.astype('int64'))  # Output: -9223372036854775808 (i.e. -2 ** 63)

(These are from numpy 1.21.6 or 1.22.3 on debian.)

Edit: see #21364

seberg · 2022-06-14T17:30:55Z

Going to close this. This now will give:

RuntimeWarning: invalid value encountered in cast

Which will honor the np.errstate(invalid=mode) setting. One could argue for a more extreme measure of always raising an error. But I think this is a pretty big step, so I would prefer opening a new issue on it (which is very welcome!).

xref gh-21437

Slagt added the 00 - Bug label Mar 7, 2022

seberg added the Project Possible project, may require specific skills and long commitment label Mar 7, 2022

seberg removed the Project Possible project, may require specific skills and long commitment label Apr 1, 2022

seberg closed this as completed Jun 14, 2022

jihaekor mentioned this issue Sep 5, 2023

RuntimeWarning: invalid value encountered in cast Rambatino/CHAID#137

Closed

schroedk mentioned this issue May 6, 2024

Inconsistent test behavior aai-institute/pyDVL#474

Closed
