
NaNs are converted to int with slicing #4592

Closed

sinhrks opened this issue Apr 6, 2014 · 12 comments

sinhrks commented Apr 6, 2014

Related to #1578: NaNs are still converted to -maxint (INT64_MIN) when assigned via slicing.

>>> np.__version__
'1.9.0.dev-6857173'
>>> i = np.array([1, 2, 3, 4, 5])
>>> i[3] = np.nan
ValueError: cannot convert float NaN to integer
>>> i[0:2] = np.nan
>>> i
array([-9223372036854775808, -9223372036854775808,                    3,
                          4,                    5])
gerritholl (Contributor) commented

Not only when slicing:

In [251]: np.array([np.nan, 0.0]).astype(np.int32)
Out[251]: array([-2147483648,           0], dtype=int32)

charris (Member) commented Jan 10, 2017

@gerritholl What would you expect?

gerritholl (Contributor) commented

@charris I'm not sure what I'd expect, but I would want a warning or error configurable with seterr/errstate.
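
Until such a setting exists, one can guard on the user side. A minimal sketch (the check-and-raise policy here is just an illustration, not anything NumPy does):

import numpy as np

a = np.array([np.nan, 0.0])
# Check for NaN explicitly before the lossy cast, since the cast itself
# neither warns nor raises on these versions.
if np.isnan(a).any():
    raise ValueError("array contains NaN; refusing cast to int32")
b = a.astype(np.int32)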

yungyuc added a commit to yungyuc/solvcon that referenced this issue Nov 27, 2017
A slice of an integer ndarray allows setting numpy.nan while ndarray.__setitem__() disallows it. See numpy/numpy#4592.
yungyuc added a commit to solvcon/solvcon that referenced this issue Nov 27, 2017
Avoid a segfault caused by a numpy bug: numpy/numpy#4592
figiel commented Nov 23, 2018

Another way to stumble on this issue:

>>> np.intp(np.floor(np.nan))
-9223372036854775808

FYI on ARMv8:

>>> np.intp(np.floor(np.nan))
0

tylerjereddy (Contributor) commented

ARMv8 hardware is used in our CI matrix, so if there is a fix or some other action to be taken, we should be able to detect regressions once a test is added.
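
A regression test could look roughly like this; a sketch that assumes the eventual fix pins the result to INT64_MIN on all platforms (the hardware-dependent results above show this is not currently guaranteed):

import numpy as np

def test_nan_to_int_cast_is_consistent():
    # Hypothetical expectation: NaN -> int64 always yields INT64_MIN,
    # matching the x86 behaviour rather than the ARMv8 result above.
    result = np.array([np.nan]).astype(np.int64)[0]
    assert result == np.iinfo(np.int64).min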

mhvk (Contributor) commented Nov 23, 2018

@figiel's example seems very surprising. Looking a little further:

np.intp(np.nan)
# ValueError: cannot convert float NaN to integer
np.intp(np.float64(np.nan))
# -9223372036854775808
type(np.nan)
# float

So there is an odd type dependence. Possibly, in the original issue at the top, the difference is that in the slice case the constant np.nan gets converted to an int before the assignment is attempted.

rth (Contributor) commented Sep 3, 2019

Related to #6109

siddhesh (Contributor) commented Oct 9, 2019

It seems the only case where NumPy needs a consistent definition for this otherwise undefined conversion is NaN -> NaT. I've got tests running for a patch that fixes this on aarch64, since x86 just happens to perform the conversion to INT64_MIN.
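
For context, NaT is stored as the int64 sentinel INT64_MIN, which is why this conversion needs a fixed result there. A small illustration (the NaT output assumes a platform where NaN casts to INT64_MIN, as on x86):

import numpy as np

# NaT shares the int64 sentinel INT64_MIN, so a NaN -> INT64_MIN cast
# round-trips to NaT when going float -> datetime64.
print(np.iinfo(np.int64).min)                      # -9223372036854775808
print(np.array([np.nan]).astype("datetime64[s]"))  # ['NaT'] on such platforms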

javidcf (Contributor) commented Jan 30, 2020

This still happens in 1.18.1. As an additional comment, note that advanced indexing (not just single-element indexing) also produces an error, as summarized below. As I see it, the semantics of assigning NaN to an integer array should be consistently defined: either always convert to the minimum value or always raise an error. The current behaviour can be quite surprising.
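
To summarize the inconsistency as observed on 1.18.1 (behaviour is version-dependent, so the outcomes are noted as comments rather than run):

import numpy as np

i = np.array([1, 2, 3, 4, 5])
# i[3] = np.nan        # ValueError (single-element indexing)
# i[[0, 1]] = np.nan   # ValueError (advanced indexing)
# i[0:2] = np.nan      # silently wrote INT64_MIN (basic slicing)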

WarrenWeckesser (Member) commented

Update: The original issue was that i[3] = np.nan generated an error, but i[0:2] = np.nan did not. I don't know when, but this appears to have been fixed:

In [1]: import numpy as np

In [2]: np.__version__
Out[2]: '1.22.0.dev0+1696.g5cc7ef066'

In [3]: i = np.array([1, 2, 3, 4, 5])

In [4]: i[0:2] = np.nan
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-6feac306eca4> in <module>
----> 1 i[0:2] = np.nan

ValueError: cannot convert float NaN to integer

However (probably because of the type dependence that @mhvk pointed out above), this does not produce an error:

In [5]: i[0:2] = np.array([np.nan, np.nan])

In [6]: i
Out[6]: 
array([-9223372036854775808, -9223372036854775808,                    3,
                          4,                    5])
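
One way to make that silent path explicit on the user side; a minimal sketch (the zero fill value is an arbitrary choice for illustration):

import numpy as np

i = np.array([1, 2, 3, 4, 5])
src = np.array([np.nan, np.nan])
# Replace NaNs with an explicit fill value before assigning into the
# integer array, instead of relying on the silent cast.
i[0:2] = np.nan_to_num(src, nan=0).astype(i.dtype)
print(i)   # [0 0 3 4 5]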

seberg (Member) commented Nov 8, 2021

Also xref gh-17495; this is almost a duplicate of gh-6109, although with more of a focus on the differences caused by casting vs. element setting.

The main reason is that we mix casting (float64 to int64) with setting a single element from a scalar. The first tries not to error and fails to warn (this is a bug, see gh-14412). It gets even worse: switching to the casting behaviour for certain edge cases broke pandas, which IIRC is why it is stricter now up there (making it less strict would have had more effect on pandas, and it was even more random before).

The one problem here (currently): item setting doesn't know about cast safety, etc., so it cannot imitate normal casting. It effectively applies its own notion of cast safety that the casting machinery doesn't know about: assignments are always unsafe, but it chooses to error on particularly nonsensical conversions.
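
The two code paths can be seen side by side; a sketch (the INT64_MIN result in the cast case is platform-dependent, as noted earlier in the thread):

import numpy as np

i = np.array([1, 2, 3], dtype=np.int64)

# Path 1: setting a single element from a scalar rejects NaN outright.
try:
    i[0] = np.nan
except ValueError as exc:
    print(exc)            # cannot convert float NaN to integer

# Path 2: a cast takes the (unsafe) casting route and, on most platforms,
# silently produces INT64_MIN instead of raising.
print(np.array([np.nan]).astype(np.int64))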

seberg (Member) commented Jun 14, 2022

NumPy will now give a warning on the main branch (settable using np.errstate). We could do more, or sanitize the output, though. I have left gh-14412 open to track that possibility (please comment there if you feel it is important).
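
A sketch of what that looks like, assuming a NumPy version that includes the change (the warning landed after this comment, in the 1.24 release):

import numpy as np

a = np.array([np.nan, 0.0])
# By default the invalid cast now emits a RuntimeWarning; np.errstate can
# escalate it to an exception (or silence it entirely).
with np.errstate(invalid="raise"):
    a.astype(np.int64)    # raises FloatingPointError under this errstate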

Otherwise, closing this issue, since I think the warning is a good step in the right direction (and I am not sure whether more will be feasible, especially in the foreseeable future).

seberg closed this as completed Jun 14, 2022