Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: .item() on a 0-dimensional datetime64[ns] array yields an integer #7619

Open
shoyer opened this issue May 11, 2016 · 10 comments · May be fixed by #23205
Open

BUG: .item() on a 0-dimensional datetime64[ns] array yields an integer #7619

shoyer opened this issue May 11, 2016 · 10 comments · May be fixed by #23205

Comments

@shoyer
Copy link
Member

shoyer commented May 11, 2016

This is obviously not useful:

In [4]: np.array(np.datetime64('2000-01-01', 'us')).item()
Out[4]: datetime.datetime(2000, 1, 1, 0, 0)

In [5]: np.array(np.datetime64('2000-01-01', 's')).item()
Out[5]: datetime.datetime(2000, 1, 1, 0, 0)

In [6]: np.array(np.datetime64('2000-01-01', 'ns')).item()
Out[6]: 946684800000000000

The underlying issue is that not all datetime64 types can be represented in datetime.datetime objects, which have fixed us precision.

We should either:

  1. Raise TypeError in all cases when calling .item() on a datetime64[ns] array.
  2. Convert to datetime.datetime if it can be done safely, raise ValueError otherwise.

My preference is for 2.

@gerritholl
Copy link
Contributor

The behaviour on float128 is also peculiar.

@OmerJog
Copy link

OmerJog commented Oct 7, 2019

Is there any progress with fixing this issue?

@eric-wieser
Copy link
Member

eric-wieser commented Oct 7, 2019

@OmerJog: We can't "fix" it until we can come to a decision about what the correct behavior is. I think my preference is for @shoyer's option 1, using the same principle that operator.index(f) is forbidden for all floats, not just non-integral ones.

@h-vetinari
Copy link
Contributor

@eric-wieser: We can't "fix" it until we can come to a decision about what the correct behavior is.

Another option in addition to the OP:
3. don't return datetime.datetime, but rather a np.datetime64

I know this is a large breaking change, but it is still the best behaviour to follow IMO. datetime.datetime are not well-embedded nor compatible in many different (performance-relevant) ways with numpy/pandas etc. Just rip off the old crusty bandaid (which is not to say that the change cannot be carefully planned/managed/executed).

@eric-wieser
Copy link
Member

@h-vetinari's suggestion is not a terrible one, and is at least somewhat consistent with how longdouble behaves. On the other hand, there's little point calling item in the first place if you're happy with a numpy scalar.

Which of the following does the suggestion apply to?

  • All datetime types
  • All datetime types with higher precision than datetime.datetime
  • Any datetime value that falls out of range of a datetime.datetime

@h-vetinari
Copy link
Contributor

My suggestion applies to all datetime types**. And the ramifications (intentionally) don't stop at .item cf. #12550.

** probably timedeltas too? Actually I tend to think that .item should probably not cast in general, but e.g. for ints / floats this is probably biting off more than we want to chew.

@eric-wieser
Copy link
Member

eric-wieser commented Dec 2, 2019

Actually I tend to think that .item should probably not cast in general

I'm starting to think the answer is "then don't use item". You can get the semantics you want with:

def scalar_from(arr):
    return arr.squeeze(axis=range(axis.ndim))[()]

or if you don't care for the weird semantics we're trying to deprecate in #10615,

def scalar_from(arr):
    assert arr.ndim == 0
    arr[()]

@h-vetinari
Copy link
Contributor

I'm starting to think the answer is "then don't use item".

Fair, but apparently .astype (which I do need to use) follows .item-semantics. And how far I'd personally turn the dial of "not casting" shouldn't change the fact that the proposal makes sense for datetimes, no?

@eric-wieser
Copy link
Member

That perhaps then is the change to make - changing the meaning of astype(object). Although I can perhaps see that opens a can of worms.

@h-vetinari
Copy link
Contributor

Also fair, but probably better discussed in #12550.

In this sense, it's good to have separate issues, because even if astype(object) were fixed to follow different semantics, the problem of the OP here would persist (so not exactly duplicate).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants