Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: Ensure np.vectorize outputs can be subclasses. #19356

Merged
merged 1 commit into from Jul 6, 2021

Conversation

mhvk
Copy link
Contributor

@mhvk mhvk commented Jun 26, 2021

As is, this is true for the ufunc case, but not for the gufunc case,
even if the underlying function does produce a subclass. Given the
care taken to ensure inputs are kept as subclasses, this is almost
certainly an oversight, which is here corrected.

p.s. Not quite sure whether to count this as a bugfix or API change, given that the element-wise part of vectorize already correctly dealt with subclasses.

xref: astropy/astropy#11893

As is, this is true for the ufunc case, but not for the gufunc case,
even if the underlying function does produce a subclass.  Given the
care taken to ensure inputs are kept as subclasses, this is almost
certainly an oversight, which is here corrected.
@mhvk mhvk force-pushed the functionbase-vectorize-refactor branch from 96b8df0 to 3c892cd Compare June 26, 2021 22:59
@seberg
Copy link
Member

seberg commented Jun 27, 2021

Hmm, I wonder if this is chasing the unreachable, and the correct thing would be to properly implement gh-14020... I.e. we don't yet call __array_prepare__ nor __array_wrap__ quite correctly, and also do not really support __array_ufunc__.

On the other hand: it seems fine to just do this as a step in the right direction. But if that seems better, I could also clock a bit of time on gh-14020, since I don't think it should be particularly difficult.

@mhvk
Copy link
Contributor Author

mhvk commented Jun 27, 2021

Even I was wondering about chasing too far, but in this case the solution is both very small and seemed so obviously consistent with what the code that was there was trying to do anyway, that I thought I might as well push it up.

p.s. An advantage over the approach to make the function a gufunc is that if the function already deals with subclasses, this implementation deals with it correctly on its own - for Quantity at least, the route via a ufunc implies more work, since now the ufunc has to be recognized in Quantity.__array_ufunc__...

Anyway, long story short, let's not let the perfect be the enemy of the good (enough)!

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code changes look good. As far as I can tell:

  1. This will fail if broadcast_shape = (0,) since you don't have any results available.
  2. It means you explicitly diverge further from normal gufuncs (in a way that gufuncs can probably never support).

I am happy with this if you think it is practical, though. We probably cannot make np.vectorize(..., signature=...) a proper ufuncs in any case, and even getting it much closer will be hard.

I.e. I think we may be able to grow signature and maybe proper dtype support to np.frompyfunc (if the user is explicit), but that won't really help with making np.vectorize more of a ufunc...

arrays = tuple(np.empty(shape, dtype=dtype)
for shape, dtype in zip(shapes, dtypes))
if dtypes is None:
dtypes = [None] * len(shapes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch does not make sense if results is None, maybe just move it into that branch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that in practice it is not currently possible to get here, since it either gets called with results or with explicit dtypes, but in the context of this helper function it seems clearer to just allow dtypes=None for both cases (and I need a list to be able to iterate over them).

If you think it better, though, since the function is now effectively different for the two paths insize vectorize, I could just inline it there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't matter, I just noticed it. What made me pause was just that in that path dtype=None effectively gives dtype=np.dtype(None) which is float64.

@mhvk
Copy link
Contributor Author

mhvk commented Jul 1, 2021

On the larger points: for broadcast=(0,) it will indeed not return a subclass, since it cannot know what that class would be. I thought this was a sufficient corner case (especially since the dtypes need to be given as well) that it was not quite worth worrying about...

I'm not sure, though, that this is diverging from gufuncs - with those, the subclasses could catch execution in __array_ufunc__ and thus do what they please. And I'd think that generally they'd produce a subclass of the same type, which would be covered here too. Of course, for my practical purpose of Quantity, this will likely be the case too, but the unit might well be different - and that is nicely captured here.

@charris
Copy link
Member

charris commented Jul 3, 2021

I'm inclined to put this in. Comments? @mhvk What about the dtype=None comment above?

@mhvk
Copy link
Contributor Author

mhvk commented Jul 3, 2021

@charris - thanks, I think @seberg and I converged there, that in the context of that routine it was fine. EDIT: In practice, the default of None will only happen in the empty_like branch, where it means the default will be take from the results array. [The following was incorrect The default of float will only happen for the zero-size array case, when there is no other information anyway (and it is unchanged from what it was before)]

@seberg
Copy link
Member

seberg commented Jul 6, 2021

Yeah, its fine. It was just a "take it or leave it" comment. If you are also inclined to put it in, lets give it a shot. Thanks Marten!

@seberg seberg merged commit 3d83143 into numpy:main Jul 6, 2021
@mhvk mhvk deleted the functionbase-vectorize-refactor branch July 6, 2021 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants