
API: Ensure that casting does not affect ufunc loop #18880

Merged

Conversation

seberg (Member) commented Apr 30, 2021

This ensures that casting does not affect ufunc loops, which means
that the following will give the same result:

```
>>> np.not_equal(None, False, dtype=bool)
True

>>> np.not_equal(None, False, dtype=bool, casting="unsafe")
False
```

This is absolutely necessary to make the new promotion and type resolution
sane.

In some cases this causes problems with existing, weirder dtype usage.
The workaround is that we allow the promoter (in current terms the
type resolver) to resolve more strictly (potentially ignoring input
types), i.e. a promoter, after not finding a match, can use the fixed
(signature) dtypes to infer the loop.

Because this makes things more strict, `nextafter` in particular
causes problems. Code like:
```
np.nextafter(0., np.inf, dtype=np.float32)
```
looks harmless enough, but requires such preferential treatment. NumPy
currently has this capability for the homogeneous type resolver, so
switching to that fixes it (SciPy uses the above quite a lot).

SciPy also has code like (in the tests):
```
np.nextafter(0, 1)
```
which will now FAIL. However, this is for the better, since the above code
is actually buggy: it will return different values on Windows and 32-bit Linux,
because it will find the float32 instead of the float64 loop. That is almost
certainly not the expected result.


This is a bit tricky, but on the other hand fairly straightforward in the sense that I think we have to give it a shot right now if we want sane future behaviour... I am planning to add a few tests (and a release note), but I think it is ready for review.

seberg added the "62 - Python API" label Apr 30, 2021
seberg (Member, Author) commented Apr 30, 2021

I forgot to explain what this code "actually does", so to speak.

When the user specifies an exact DType for a loop, that dtype must be used and match exactly. So in that case, we do not need to check whether the (input) array can cast safely to pick the loop (the loop still has to match the casting safety, but that defaults to "same_kind"!).

However, here I enforce that any operand that has not been specified must match "safely" (this matters for inputs when picking a loop; outputs really don't/shouldn't matter...).
This is slightly more strict. In practice, the current logic can pick the "single reasonable loop", so to speak (by picking the first match, which leads to the above-mentioned casting="unsafe" bug). In a world where we want users to add loops freely, that isn't really a good option, though. Any such logic has to be handled in the "promotion" step, which is what I effectively do here for nextafter and divide.

@seberg seberg added this to the 1.21.0 release milestone Apr 30, 2021
seberg (Member, Author) commented Apr 30, 2021

Well, floor_divide doesn't have a ??->? loop, or any ??->something loop at all, but works by using the bb->b (int8) loop. I guess we need to just special-case that?
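
For reference, a quick way to check which loops a ufunc actually registers is its `.types` attribute (shown purely as an illustration):
```
import numpy as np

# floor_divide registers no boolean ('??->?') loop, so boolean inputs
# currently fall through to the int8 ('bb->b') loop.
print(np.floor_divide.types)
```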

seberg added the "56 - Needs Release Note" label Apr 30, 2021
seberg (Member, Author) commented Apr 30, 2021

A perhaps noteworthy example that hopefully clarifies the behaviour rather than confusing it:

```
np.add(arr*4, arr, dtype=np.float32) - np.add(arr*4, arr).astype(np.float32)
# array([0.000000e+00, 0.000000e+00, 9.536743e-07], dtype=float32)
```

is not 0 currently, because the first version downcasts the inputs, while the second downcasts the output. (Generally, I think downcasting the output would be better.)

This PR "refuses the temptation to guess", so to speak; i.e. if np.add were not a homogeneous operation, an error would be raised, forcing the user to pass signature=(np.float32,) * 3 to ignore the input array dtypes. (But here, NumPy explicitly chooses to do the downcast when dtype=np.float32 is given.)

The other complexity is what dtype=type exactly means compared with signature=(type, type, type). This was the change I made previously already: before, it usually meant (type, type, type) but often would choose (None, None, type) instead, while now it usually means (None, None, type), but "homogeneous" resolvers choose to do the same as for (type, type, type).

(My problem there is that I have to make a choice about what to convert dtype= into as a signature, and converting it to (type, type, type) always would be more disruptive than the alternative – and frankly, in my opinion, less easy to understand.)
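
As a small illustrative sketch of the two interpretations above (downcast the inputs versus downcast only the output; array values are arbitrary):
```
import numpy as np

arr = np.linspace(0, 1, 3)  # float64 inputs, just for illustration

# Fix every dtype in the signature: the float64 inputs are cast to
# float32 before the addition happens.
in_downcast = np.add(arr * 4, arr, signature=(np.float32,) * 3)

# Compute in float64 and only downcast the result.
out_downcast = np.add(arr * 4, arr).astype(np.float32)

# The two can differ by a rounding error, as in the example above.
print(in_downcast - out_downcast)
```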

seberg (Member, Author) commented May 1, 2021

I suppose we could special-case nextafter for integers to always go to doubles, if anyone thinks it helps with the transition.

I just realized that int32 cannot cast safely to float32, so np.nextafter(1, 2) will always return the double version, while np.nextafter(np.int16(1), np.int16(2)) will currently go to float32...
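
The safe-cast checks behind this observation can be verified directly (a small illustrative check):
```
import numpy as np

# int16 can be represented exactly in float32, int32 cannot, which is
# what decides whether the float32 nextafter loop is reachable.
print(np.can_cast(np.int16, np.float32))  # True
print(np.can_cast(np.int32, np.float32))  # False
```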

seberg (Member, Author) commented May 1, 2021

I checked astropy, and astropy.nddata.bitmask has two problems, but one is at the module level (trivial to fix, and the fix is even nicer code, but still).

I have to think a bit more. But one solution could be to make the meaning of signature=(None, None, dtype) identical to dtype=dtype, BUT if the first sweep using signature=(None, None, dtype) fails, we then still always also try signature=(dtype, dtype, dtype). In the future certain ufuncs could opt out, but most of the time that should be unproblematic.

That might even allow undoing the nextafter and division changes, although I think that's probably the right thing to do anyway; maybe it's easier to do the right thing in the future with new promoters...

EDIT: There are ufuncs (in SciPy) that have signatures like dl->d and dd->d. For such functions we might pick the dd->d loop with dtype="d" and certain positions of the stars, when the dl->d loop could be slightly better (but we have no way of matching that one). That is, probably for ufunc(float_arr, int_arr, dtype="d", casting="unsafe") the "less good" loop is chosen, unless SciPy adds a promoter (in the future) to customize the behaviour.
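
For reference, `np.ldexp` is a NumPy ufunc with the same kind of mixed-type loops; its registered signatures can be inspected the same way (illustrative only):
```
import numpy as np

# ldexp takes a floating-point mantissa and an integer exponent, so its
# loops are non-homogeneous (entries along the lines of 'di->d').
print(np.ldexp.types)
```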

mhvk (Contributor) commented May 1, 2021

@seberg - I read through it and I'll try to look at the code later. But to understand better, the current documentation for dtype reads:

dtype

    New in version 1.6.

    Overrides the dtype of the calculation and output arrays. Similar to signature.

This seems quite explicit, and points in the direction of being equivalent to signature=(type, type, type) (though clearly inaccurate!). How should it read with your changes?

seberg (Member, Author) commented May 1, 2021

Thanks. In the signature-normalization PR, I had actually modified it to signature=(None, None, dtype), except for certain ufuncs. Previously, it was "overrides the dtype of the calculation except for binary comparisons".

The old behaviour is roughly:

  • dtype=dtype behaves like signature=(dtype, dtype, dtype), except for binary comparisons.

Right now I made it:

  • dtype=dtype behaves like signature=(None, None, dtype), except for homogeneous-signature ufuncs (those using the type resolver), which turn signature=(None, None, dtype) back into signature=(dtype, dtype, dtype).

So my latest thought was to try signature=(None, None, dtype) first, but then also try signature=(dtype, dtype, dtype). That way, dtype forces the output(s) dtype and additionally tries forcing the input dtypes if no loop for the input dtypes is found.

Maybe I thought about it wrong though. I prefer the "output" interpretation (and, documentation or not, I think that is likely what most would guess).

If I revisit the previous change, the main issue is that binary comparisons behave slightly differently from other ufuncs. If we special-case them (for a bit), signature=(dtype, dtype, dtype) could be kept around, and dtype=object would be fully deprecated for binary comparisons.

The one thing I don't really like would be to keep the special case that dtype=dtype cannot be normalized into signature=something.

EDIT: Sorry for being so confusing; the fact that the old code paths vary quite a bit from ufunc to ufunc is confusing to me as well. One further note: a reason I liked the change to dtype meaning signature=(None, None, dtype) (outputs fixed) is that it doesn't collide with ufuncs like np.ldexp (and some SciPy ones) that include type signatures like dl->d.
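
To make the two spellings concrete, a minimal illustrative comparison (values chosen arbitrarily):
```
import numpy as np

x = np.arange(3)          # integer input
y = np.linspace(0, 1, 3)  # float64 input

# Fix only the output dtype; the inputs still take part in resolution.
r1 = np.add(x, y, dtype=np.float64)

# Fix every operand explicitly; the integer input is cast to float64.
r2 = np.add(x, y, signature='dd->d')
```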

mhvk (Contributor) commented May 1, 2021

OK, that helps at least get my thoughts a bit in order!

But let me try to think it through a bit more, and see what strictly following the docstring (which says dtype is the dtype of the output and of the calculation) would imply. For comparisons, the dtype argument then can only be bool or object, where the latter means you get an array of Python bools. I agree that the loop then logically has to be looked up on the inputs, and that the casting argument should not matter for your example of np.not_equal(None, False); if one wanted to cast the inputs to bool, one would have to pass in signature='??->?', casting='unsafe'.
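
In other words, the fully explicit spelling of "cast the inputs" would be something like the following sketch (whether the object-to-bool cast is accepted depends on the casting rules of the NumPy version at hand):
```
import numpy as np

# Explicitly request the boolean loop and allow the inputs to be cast
# to bool; this is the opt-in version of the old implicit behaviour.
np.not_equal(None, False, signature='??->?', casting='unsafe')
```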

But with this in mind, I am confused by your example in #18880 (comment):

```
np.add(arr*4, arr, dtype=np.float32) - np.add(arr*4, arr).astype(np.float32)
# array([0.000000e+00, 0.000000e+00, 9.536743e-07], dtype=float32)
```

This seems expected: in one case the user explicitly asks for float32 output (and calculation), and given that there is no 'dd->f' loop, I think that implies the inputs need to be cast. I would tend not to try to fix that, mostly because I think the opposite case is more important. For instance, if a user does:

```
np.multiply(a, b, dtype=float64)
```

I think the intent would be clear: I want my calculation done in float64 even if a and b are both float32 (maybe because I expect infinities otherwise).

p.s. I think np.nextafter is a special case: essentially, one should allow unsafe casting of the second argument to that of the first, and the output dtype has to be equal to the input dtype. Definitely not worth changing the regular rules for!
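
A small sketch of the np.multiply intent described above (assuming float32 values large enough to overflow; purely illustrative):
```
import numpy as np

a = np.float32(3e38)
b = np.float32(3e38)

print(np.multiply(a, b))                    # inf: the float32 loop overflows
print(np.multiply(a, b, dtype=np.float64))  # ~9e76: computed in float64
```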

seberg (Member, Author) commented May 1, 2021

@mhvk Yeah, I have to think about it more and maybe map out the space of possibilities a bit more clearly. One thing about the add example: I am considering a situation where we have additional loops:

```
ff->f
dd->f
dd->d
ff->d
```

If the user says dtype=f, we could in theory pick the dd->f loop or the ff->f loop depending on the input dtypes. It is trickier in practice, since you need to define a clear rule, which could be either: use the common dtype for all unspecified dtypes, OR use the specified dtype for all loop dtypes (or both, in a given order, preferring either ff->f or dd->f when the input is df->f! – with the exclamation mark meaning that the last f is truly fixed).
EDIT: This rule only needs to be applied if the inputs are not already dd or ff, of course.

In either case, the np.multiply(a, b, dtype=float64) example can never choose the float32 loop, because the output dtype must strictly be float64 (unless you also provide out=float32_arr).
It could choose the hypothetical ff->d loop, but if someone adds such a loop, they should guarantee that the operation happens in float64.

(Not sure it helps, but the main issue is about cases where it seems like it should be obvious which loop to choose, but I am not sure how to choose it without relying on the fact that loops are currently ordered.)

If only output dtypes are provided, we add the additional step
in the ufunc type resolver of converting a signature of the format::

    (None,) * nin + (DType,) * nout

into::

    (DType,) * (nin + nout)

And try searching for a matching loop again.  The advantage of this
is that IF the ufunc has mixed-type inputs, we have a fighting
chance of finding a good loop for `dtype=DType`.
Further, future `promoters` could refine such logic, while at the
same time, we do not need to use completely different code paths
for `signature=...` and `dtype=...`.

The rare downside is that we might pick a `dd->d` loop when a
`dl->d` loop would do (SciPy has such ufuncs).
The upside is that, before, using `dtype=...` was not even possible
for functions with non-homogeneous loops like `np.ldexp`.

The future will hopefully see more precise promotion logic, and some
of the more surprising paths simply deprecated...
seberg (Member, Author) commented May 3, 2021

OK, I think I have mostly come around to the solution of modifying the meaning of dtype=DType, but falling back to the old behaviour, partly because that is sometimes useful and mostly for backward compatibility.

That means that if only output dtypes are provided in the signature, we add an additional step in the ufunc type resolver that converts a signature of the form:

```
(None,) * nin + (DType,) * nout
```

into:

```
(DType,) * (nin + nout)
```

and we try searching for a matching loop again. This falls back to the old behaviour unless a loop is found, which presumably should return a correct result. (After all, all inputs can safely cast to the loop.)

The main advantage of this is that if the ufunc has mixed-type inputs, we have a fighting chance of finding a good loop for dtype=DType. And the way this is done could be customized in the future.

This is in contrast to the main alternative, which is translating dtype=DType directly into signature=(DType,) * (nin+nout). If we do that, the dtype argument is completely useless for ufuncs without homogeneous type signatures. (Such ufuncs are fairly rare; e.g. comparisons qualify but are special, division does for timedelta (although timedelta is a bit special), and np.ldexp and some other functions mix integer arguments with floats.)

There are some disadvantages, of course:

  • The loop that gets picked becomes less obvious, since we might pick the homogeneous loop when a different loop arguably "fits better".
  • In the future, this logic will have to be duplicated by the promoters of the ufunc. Those can also refine the behaviour, but will likely need to deprecate other parts.
  • It slightly stretches the trust we put into "additional loops". We trust that nobody will add an ff->d loop that does the computation in single precision. (A dd->f loop that does the computation in double and then downcasts is interesting, as it would improve precision.) I think this is a gamble I am willing to make; quite frankly, less precise loops are bound to crop up if we make it easier to replace/augment NumPy loops (e.g. fastmath flags). Hopefully we can channel such problems into new ufuncs rather than into "augmentation" of existing ones...

I can be convinced to go the signature=(DType,) * (nin+nout) route, but once we are there, it might be fairly tricky to relax things again.
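
The resolution order being discussed can be summarized with a rough Python sketch (illustrative pseudocode, not the actual C implementation; `find_loop` is a hypothetical stand-in for the real loop lookup):
```
def resolve_with_dtype(ufunc, requested_dtype, find_loop):
    """Sketch of the two-step lookup for ``dtype=DType``."""
    # Step 1: fix only the outputs and let the inputs drive the match.
    partial = (None,) * ufunc.nin + (requested_dtype,) * ufunc.nout
    loop = find_loop(partial)
    if loop is None:
        # Step 2 (fallback): fix every operand to the requested DType,
        # which recovers the old homogeneous behaviour.
        homogeneous = (requested_dtype,) * (ufunc.nin + ufunc.nout)
        loop = find_loop(homogeneous)
    return loop
```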

Also, for now enforce *all* outputs. That is more conservative, and
obviously won't work for certain functions in scipy that have multiple
outputs of different types, but it seems a bit safer to me right now.

Maybe at some point we have an idea of how to "modify" what the `dtype`
means.  I am not sure how!  By allowing the ufunc itself to interpret
how to translate it to a `signature`?  By giving it a distinct meaning?

Yes, this might narrow us down a bit... but...
seberg (Member, Author) commented May 3, 2021

I am sorry about adding a "slight" new change: I now enforce all outputs for dtype=DType (in the case of multiple output arrays).

In theory, SciPy has some ufuncs in scipy.special which have multiple outputs of different DTypes, so those functions would still be blocked from using dtype= for anything useful. But this seemed like the most reasonable trade-off to me right now, unless we had some idea how to decouple dtype from signature entirely, and I don't have such an idea.
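
For a NumPy example of the multiple-output situation, `np.frexp` returns a mantissa and an integer exponent, so its outputs necessarily have different dtypes (shown purely as an illustration of why enforcing all outputs can rule out a loop):
```
import numpy as np

# frexp's loops have the form float -> (float, int), e.g. 'd->di', so
# no single dtype can match both outputs at once.
print(np.frexp.types)
```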

seberg removed the "56 - Needs Release Note" label May 3, 2021
@rossbar rossbar self-requested a review May 3, 2021 21:53
rossbar (Contributor) left a comment
Having read through the discussion, I went through the changes to see how well I understood the intent and to what degree the documentation updates made that intent clear. I had a couple of questions/suggestions for sentences that could be polished, but generally I think the documentation describing the change looks good!

Comment on lines 14 to 15
Since NumPy special cases if only outputs (or ``dtype``) is provided,
this should affect very few users.
rossbar (Contributor):

I interpreted this to mean "The behavior change only affects partial signatures where at least one input type is provided, so the change should affect very few users". I'm not sure that my interpretation is correct though.

seberg (Member, Author):

Yeah, that is what I meant, and I think it is probably true :). It definitely is the mental model I want users to use for now (which will immediately tell them, that practically nobody will notice).

rossbar (Contributor):

👍 in that case it might be worth rewording this sentence since IMO it requires a bit of extra context to correctly interpret the "special casing" here. It could just be me though :)

doc/source/reference/ufuncs.rst (outdated; resolved)
@@ -553,6 +553,11 @@ PyUFunc_SimpleUniformOperationTypeResolver(
* This is a fast-path, since all descriptors will be identical, mainly
* when only a single descriptor was passed (which would set the out
* one in the tuple), there is no need to check all loops.
* Note that this also allows (None, None, float64) to resolve to
* (float64, float64, float64), even when the inputs do not match,
* i.e. fixing a single part of the signature can fix all of them.
rossbar (Contributor):

Should this say "i.e. fixing the output of the signature can determine the dtype(s) for the rest of the signature"? Not that this would ever happen, but the current wording makes it seem like (None, float64, None) would also fall back to (float64, float64, float64).

seberg (Member, Author):

Yeah, you are right here. For the other changes, this is not what happens, but for the simple-uniform one, that is actually what I coded... I don't mind that behaviour, but there is also no real reason for it. Let me modify the code to align with the rest and only apply the "uniform" expansion to outputs.

@mattip mattip added this to In Progress in Ufunc refactor May 5, 2021
mhvk (Contributor) left a comment

I looked again and think the code and description are clear now. Only some totally nitpicky comments about docstrings and code comments.

to no loop being found in some cases, NumPy will normally also search
for the loop::

signature("float64", "float64, "float64")
mhvk (Contributor):

Missing end double quote on second item.


This argument allows the user to specify exact DTypes to be used for the
calculation. Casting will be used as necessary. The input DType
is not considered unless ``signature`` is ``None`` for that input.
mhvk (Contributor):

Maybe "The actual DType of the input arguments is not considered ..." (otherwise, it is a bit unclear what "input DType" means)

seberg (Member, Author):

Adopted (replaced "argument" with "array" also).

operation. It does not specify the ``datetime64`` time-unit or the
``float64`` byte-order.

For backwards compatibility this argument can also be provided as *sig*,
mhvk (Contributor):

Out of scope here, but time to remove sig, I'd think!

seberg (Member, Author):

Unfortunately, we never deprecated it with a warning. But maybe that is about time...

out_dtype, types, NULL);
out_dtype, orig_types, NULL);
/*
* In principle, we only need to validate the
mhvk (Contributor):

Does this mean that in future we could simplify this? If so maybe add "TODO" in front?

seberg (Member, Author):

My TODO here is "replace practically everything in this file and then delete it". It is a bit unfortunate, but to get there, I have to fix it first...

mhvk (Contributor):

😄 Yes, better to evolve using small steps...

/* Error */
case -1:
return -1;
/* Found matching loop */
mhvk (Contributor):

Align same as /* Error */

for (i = 0; i < self->ntypes; ++i) {
char *orig_types = self->types + i*self->nargs;

/* Check specified types and copy into an int array for matching */
mhvk (Contributor):

Maybe note this code appears twice and could be put into a small function?

seberg (Member, Author):

Will add a note that it's duplicated. It's not quite trivial to refactor, because orig_types is char * in one case and int * in the other.

types[j] = orig_types[j];
}
set_ufunc_loop_data_types(self, op, out_dtype, types, NULL);
/* In principle, we only need to validate the NPY_NOTYPE ones */
mhvk (Contributor):

As above, is this meant to be a TODO?

Comment on lines 3 to 4
The behaviour for ``signature=...`` and ``dtype=...`` can differ in
some cases to the previous behaviour.
mattip (Member):
Suggested change
The behaviour for ``signature=...`` and ``dtype=...`` can differ in
some cases to the previous behaviour.
The behaviour for ``np.ufunc(1.0, 1.0, signature=...)`` and ``np.ufunc(1.0, 1.0, dtype=...)``
can yield different loops in 1.21 than in 1.20 because of changes in promotion.

seberg (Member, Author):

Adopted as:

The behaviour for np.ufunc(1.0, 1.0, signature=...) or
np.ufunc(1.0, 1.0, dtype=...) can now yield different loops in 1.21 compared
to 1.20 because of changes in promotion.

@mattip mattip moved this from In Progress to Ready for final review in Ufunc refactor May 6, 2021
In that case, it is necessary to provide the complete signature
to enforce casting the inputs.
If ``dtype="float64"`` is used or only outputs are set (e.g.
``signature=(None, None, "float64")`` the behaviour should remain
mattip (Member):

Suggested change
``signature=(None, None, "float64")`` the behaviour should remain
``signature=(None, None, "float64")`` the is

seberg (Member, Author):

adopted, thanks

mattip (Member) left a comment

A few nits, but otherwise LGTM.

Co-authored-by: Matti Picus <matti.picus@gmail.com>
charris (Member) commented May 7, 2021

close/reopen

@charris charris closed this May 7, 2021
@charris charris reopened this May 7, 2021
@seberg seberg requested a review from mattip May 9, 2021 18:18
seberg (Member, Author) commented May 13, 2021

Can we put this in so that I can do the ugly deed of squash merging gh-18905?

@mattip mattip merged commit 5eb5802 into numpy:main May 14, 2021
mattip (Member) commented May 14, 2021

Thanks @seberg

@seberg seberg deleted the make-signature-resolve-more-strict-and-fix-nextafter branch May 14, 2021 14:32
@mattip mattip removed this from Ready for final review in Ufunc refactor May 20, 2021