Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG,DEP: Allow (arg-)partition to accept uint64 indices #20000

Merged
merged 6 commits into from Oct 3, 2021

Conversation

BvB93
Copy link
Member

@BvB93 BvB93 commented Sep 29, 2021

Closes #19832

This PR introduces two changes:

  • It fixes an issue wherein the partition functions would previously not accept uint64 indices, an issue caused by what was effectively a np.can_cast(kth, np.intp, casting="same_kind") check.
  • The same casting-check would previously allow booleans to pass through, converting them to np.intp and thus treating them as normal numerical indices. The passing of booleans to the kth parameter has now been deprecated.

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, thanks! I am happy with deprecating bools as well, since the new behaviour would mirror indexing.
(Just in case someone wonders: The casting can lead to integer overflows during the cast in theory, this is something we have to live with right now unfortunately, we do the same in indexing.)

return NULL;
}
}
else if (!PyArray_ISINTEGER(ktharray)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think we may have to switch hat over to a same-kind casting (possibly with a time exception). But it is what indexing currently uses, so I think it is fine. (This would not allow custom user integer DTypes.)

Copy link
Member Author

@BvB93 BvB93 Sep 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future refernce: I wonder what would be the best way of dealing with custom user integer DTypes?
Considering PyArray_ISINTEGER is too strict and np.can_cast(dtype, np.intp, casting="same_kind") won't even accept all the standard numpy integer dtypes (i.e. np.uint64).

EDIT: Updated the can_cast example

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All NumPy integers should pass fine? The main problem is that bools and datetimes may also. If same-kind casting doesn't do the trick, we could also consider adding an abstract Indexer DType that users have to register or implement promotion with.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All NumPy integers should pass fine?

Ah, to clarify: I was talking specifically about np.uint64 in the can_cast example.

If same-kind casting doesn't do the trick, we could also consider adding an abstract Indexer DType that users have to register or implement promotion with.

I could see this potentially being usefull, as this would also make it easier to differentiate between (user-defined) bool dtypes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In [1]: np.can_cast(np.uint64, np.int32, casting="same_kind")
Out[1]: True

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, you're completelly right.
I may have accidently mixed up "safe" and "same_kind" in my head...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But you are right, an abstract IndexingDType may be a good way to add some useful formalism and better generalization.

numpy/core/src/multiarray/item_selection.c Outdated Show resolved Hide resolved
… warning

Co-Authored-By: Sebastian Berg <sebastian@sipsolutions.net>
@seberg
Copy link
Member

seberg commented Oct 3, 2021

Thanks Bas, if no-one else has any concerns, lets put this in! (I suppose we should probably ping the mailing list very briefly about this, but I feel it is just a formality.)

@seberg seberg merged commit 96d5993 into numpy:main Oct 3, 2021
@BvB93 BvB93 deleted the partition branch October 18, 2021 21:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: partition/argpartition do not accept uint64 indices
2 participants