Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Adding keepdims to np.argmin,np.argmax #19211

Merged
merged 38 commits into from Jul 7, 2021
Merged

Conversation

czgdp1807
Copy link
Member

Closes #8710

Comments

The approach is too straightforward/trivial. Please let me know if the changes should be made at C-level, somewhere in PyArray_ArgMax.

ping @eric-wieser @shoyer
cc @rgommers

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @czgdp1807.

This indeed looks deceptively simple - if it works, then it'd be fine with me to keep this PR in Python. Support in the C API could be added as a follow-up step.

The axis=None, keepdims=True case doesn't seem tested though, and it looks like that'll need some special handling.

numpy/core/fromnumeric.py Outdated Show resolved Hide resolved
@czgdp1807
Copy link
Member Author

This indeed looks deceptively simple

Agreed.

The axis=None, keepdims=True case doesn't seem tested though

Ah! Yes. Missed it. Thanks for pointing out. I am on it.

numpy/core/fromnumeric.py Outdated Show resolved Hide resolved
numpy/core/fromnumeric.py Outdated Show resolved Hide resolved
Copy link
Member

@BvB93 BvB93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the argmax/argmin annotation as well (see fromnumeric.pyi)?
It should be fairly straightforward by the looks of things.

@overload
def argmax(
    a: ArrayLike,
    axis: None = ...,
    out: Optional[ndarray] = ...,
+   keepdims: Literal[False] = ...,
) -> intp: ...
@overload
def argmax(
    a: ArrayLike,
    axis: Optional[int] = ...,
    out: Optional[ndarray] = ...,
+   keepdims: bool = ...,
) -> Any: ...

@czgdp1807
Copy link
Member Author

czgdp1807 commented Jun 10, 2021

Please let me know if there are any cases where the current approach fails or is inefficient.

@rgommers
Copy link
Member

Ah, other issue: this introduces a mismatch between np.argmax and np.ndarray.argmax. That is undesirable. I think that means the changes can't be Python-only after all.

@czgdp1807
Copy link
Member Author

Yeah. I was thinking so, while making the change. Noticed that, np.nadarray.max has keepdims argument. So, for overall consistency np.ndarray.argmax should also have keepdims. I will investigate where the changes might be needed to make things work.

@czgdp1807
Copy link
Member Author

I have made some updates to the interface of argmax and argmin. Let me know if any changes are required.

@czgdp1807 czgdp1807 changed the title ENH: Adding keepdims to np.argmin,np.argmax [WIP] ENH: Adding keepdims to np.argmin,np.argmax Jun 10, 2021
@rgommers
Copy link
Member

I have made some updates to the interface of argmax and argmin. Let me know if any changes are required.

Did you mean to touch MaskedArray methods? For ndarray methods I think you need to change array_argmax and PyArray_ArgMax.

@czgdp1807
Copy link
Member Author

czgdp1807 commented Jun 10, 2021

Did you mean to touch MaskedArray methods?

I thought that, MaskedArray.max has keepdims, so added it to MaskedArray.argmax. Do we have to avoid making changes in it? I just looked for places where max has keepdims but argmax doesn't and for consistency made the changes to argmax.

Do we need to use, np._NoValue for keepdims in MaskedArray methods?

@rgommers
Copy link
Member

Ah no, if it was intentional then it's all good! Updating numpy.ma at the same time as the functionality in the main namespace is great.

@rgommers
Copy link
Member

Do we need to use, np._NoValue for keepdims in MaskedArray methods?

Yes, it looks like that's necessary. It looks odd; the reason for it is to support (partial) subclasses. Say .method1(..., keepdims=False) would internally call self.method2(keepdims=keepdims) and the subclass overrides method2 but not method1, then an update to the newer numpy version which adds the keepdims keyword would break the subclass.

numpy/__init__.pxd Outdated Show resolved Hide resolved
@BvB93 BvB93 added the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Jun 11, 2021
@czgdp1807
Copy link
Member Author

@BvB93 I have added a release notes file to the PR. Are there any improvements to be made to that?

@czgdp1807
Copy link
Member Author

czgdp1807 commented Jun 21, 2021

Why were all the annotations removed?

keepdims is now a keyword only argument (refer #19211 (comment)) and hence I removed annotations for it. I checked some other functions (like, any) which are using keyword only arguments and saw that they were not having annotations for keyword arguments. Let me know the format to add annotations for these, and I will add them. Thanks.

Edit - Apologies for resolving the conversations here. Actually, I do it just keep track of stuff which is yet to be addressed. Let me know if I should un-resolve them.

@BvB93
Copy link
Member

BvB93 commented Jun 21, 2021

keepdims is now a keyword only argument (refer #19211 (comment)) and hence I removed annotations for it. I checked some other functions (like, any) which are using keyword only arguments and saw that they were not having annotations for keyword arguments. Let me know the format to add annotations for these, and I will add them. Thanks.

I suspect you might be confusing keyword-only (func(a, *, b=...)) with variadic keyword arguments (func(a, **kwargs)). The latter is often annotated as Any, simply due to a lack of suitable and/or convenient alternatives. On the contrary, the former is (arguably) even easier to annotate than normal keyword-or-positional arguments.

In any case, compared to the annotations prior to 9260d40 only a single change will have to be made:
a * must be added preceding the keepdims keyword in order to mark the argument as keyword-only.

@overload
def argmin(
    a: ArrayLike,
    axis: None = ...,
    out: Optional[ndarray] = ...,
    out: Optional[ndarray] = ...,
+    *,
+    keepdims: Literal[False] = ...,
) -> intp: ...
@overload
def argmin(
    a: ArrayLike,
    axis: Optional[int] = ...,
    out: Optional[ndarray] = ...,
+    *,
+    keepdims: bool = ...,
) -> Any: ...

@czgdp1807
Copy link
Member Author

I have restored the annotations. Let me know if anything else is required. Thanks.

@czgdp1807 czgdp1807 requested a review from seberg June 23, 2021 03:42
Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, a few nitpicks, but most of them are "take it or leave it". The test looks very thorough. Unfortunately it is a bit unwieldy, but I am OK with leaving it as it is. Thanks for bearing with me trying to find a minimal solution!

One awesome thing (happy with followup or skipping), would be to unify the two functions completely. They seem 100% copy-paste of each other with 2 variations (the method, and the "argmax" vs. "argmin" in error messages). And with PyErr_Format these days, both of those should be straight forward to unify.

It might be good to ping the mailing list briefly, also about the C-API addition (which I honestly don't care about personally). Those should maybe also have a very brief (just a bullet point is enough) additional release note under the C-API section.

Anyway, this looks good to me, thanks Gagandeep!

doc/release/upcoming_changes/19211.new_feature.rst Outdated Show resolved Hide resolved
doc/release/upcoming_changes/19211.new_feature.rst Outdated Show resolved Hide resolved
doc/release/upcoming_changes/19211.new_feature.rst Outdated Show resolved Hide resolved
numpy/core/fromnumeric.py Outdated Show resolved Hide resolved
numpy/core/src/multiarray/calculation.c Outdated Show resolved Hide resolved
numpy/core/src/multiarray/calculation.c Outdated Show resolved Hide resolved
numpy/core/src/multiarray/calculation.c Outdated Show resolved Hide resolved
numpy/core/tests/test_multiarray.py Show resolved Hide resolved
@czgdp1807
Copy link
Member Author

One awesome thing (happy with followup or skipping), would be to unify the two functions completely. They seem 100% copy-paste of each other with 2 variations (the method, and the "argmax" vs. "argmin" in error messages). And with PyErr_Format these days, both of those should be straight forward to unify.

I am thinking to do bit of refactoring (in C and test files) after this PR gets in. Let me know if that works.

It might be good to ping the mailing list briefly, also about the C-API addition (which I honestly don't care about personally). Those should maybe also have a very brief (just a bullet point is enough) additional release note under the C-API section.

I have added a release notes entry for C-API too. Let me know if it is okay.

Thanks for bearing with me trying to find a minimal solution!

In my opinion the solution is really elegant, simple and abstract. Thanks for guiding me to finally arrive at it. :-)

@czgdp1807 czgdp1807 removed the 63 - C API Changes or additions to the C API. Mailing list should usually be notified. label Jul 2, 2021
@czgdp1807
Copy link
Member Author

I have removed the C-API label as well. Thanks.

Co-authored-by: Eric Wieser <wieser.eric@gmail.com>
Co-authored-by: Eric Wieser <wieser.eric@gmail.com>
@seberg
Copy link
Member

seberg commented Jul 7, 2021

OK, lets put this in. Thanks Gagandeep! @czgdp1807, I think you did not apply all style fixes in both places always. I assume you will followup with a PR to merge the two functions into a single one that will clean up the styles where it is nicer?

@seberg seberg merged commit 6568c6b into numpy:main Jul 7, 2021
@czgdp1807
Copy link
Member Author

I assume you will followup with a PR to merge the two functions into a single one that will clean up the styles where it is nicer?

Yes sure. On the way. I will try to factor out the common parts in argmin, argmax and any other functions in the main code and tests. Thanks for merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
01 - Enhancement 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified. component: numpy._core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add keepdims argument to argmin and argmax
5 participants