
API: Ensure that casting does not affect ufunc loop #18880

Merged

Conversation

seberg (Member) commented Apr 30, 2021

This ensures that casting does not affect ufunc loops, which means
that the following will give the same result:

```
>>> np.not_equal(None, False, dtype=bool)
True

>>> np.not_equal(None, False, dtype=bool, casting="unsafe")
False
```

This is absolutely necessary to make the new promotion and type resolution
sane.

In some cases this causes problems with existing, weirder dtype usage.
The workaround is that we allow the promoter (in current terms the
type resolver) to resolve more strictly (potentially ignoring input
types), i.e. a promoter, after not finding a match, can use the fixed
(signature) dtypes to infer the loop.

Because this makes things more strict, `nextafter` in particular
causes problems. Code like:
```
np.nextafter(0., np.inf, dtype=np.float32)
```
looks harmless enough, but requires such preferential treatment. NumPy
currently has this capability for the homogeneous type resolver, so
switching to that fixes it (SciPy uses the above quite a lot).

SciPy also has code like (in the tests):
```
np.nextafter(0, 1)
```
which will now FAIL. However, this is for the better, since the above code
is actually buggy: it will return different values on Windows and 32-bit Linux,
because it will find the float32 instead of the float64 loop. That is almost
certainly not the expected result.


This is a bit tricky, but on the other hand fairly straightforward in the sense that I think we have to give it a shot right now if we want sane future behaviour... I am planning to add a few tests (and a release note), but I think it is ready for review.

seberg added the "62 - Python API" label Apr 30, 2021
seberg (Member, Author) commented Apr 30, 2021

I forgot to explain what this code "actually does", so to speak.

When the user specifies an exact DType for a loop, that dtype must be used and match exactly. So in that case, we do not need to check whether the (input) array can cast safely to pick the loop (the loop still has to match the casting safety, but that defaults to "same_kind"!).

However, here I enforce that any operand that has not been specified must match "safely" (this matters for inputs when picking a loop; outputs really don't/shouldn't matter...).
This is slightly more strict. In practice, the current logic can pick the "single reasonable loop", so to speak (by picking the first match, which leads to the above-mentioned casting="unsafe" bug). In a world where we want users to add loops freely, that isn't really a good option, though. Any such logic has to be handled in the "promotion" step, which is what I effectively do here for nextafter and divide.

@seberg seberg added this to the 1.21.0 release milestone Apr 30, 2021
seberg (Member, Author) commented Apr 30, 2021

Well, floor_divide doesn't have a ??->? loop, or any ??->something loop at all, but works by using the bb->b (int8) loop. I guess we need to just special-case that?
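
For reference, a quick way to check which loops a ufunc actually registers is its `.types` attribute (shown purely as an illustration):
```
import numpy as np

# floor_divide registers no boolean ('??->?') loop, so boolean inputs
# currently fall through to the int8 ('bb->b') loop.
print(np.floor_divide.types)
```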

seberg added the "56 - Needs Release Note" label Apr 30, 2021
seberg (Member, Author) commented Apr 30, 2021

A perhaps noteworthy example that hopefully clarifies the behaviour rather than confusing it:

```
np.add(arr*4, arr, dtype=np.float32) - np.add(arr*4, arr).astype(np.float32)
# array([0.000000e+00, 0.000000e+00, 9.536743e-07], dtype=float32)
```

is not 0 currently, because the first version downcasts the inputs, while the second downcasts the output. (Generally, I think downcasting the output would be better.)

This PR "refuses the temptation to guess", so to speak; i.e. if np.add were not a homogeneous operation, an error would be raised, forcing the user to pass signature=(np.float32,) * 3 to ignore the input array dtypes. (But here, NumPy explicitly chooses to do the downcast when dtype=np.float32 is given.)

The other complexity is what dtype=type exactly means compared with signature=(type, type, type). This was the change I made previously already: before, it usually meant (type, type, type) but often would choose (None, None, type) instead, while now it usually means (None, None, type), but "homogeneous" resolvers choose to do the same as for (type, type, type).

(My problem there is that I have to make a choice about what to convert dtype= into as a signature, and converting it to (type, type, type) always would be more disruptive than the alternative – and frankly, in my opinion, less easy to understand.)
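
As a small illustrative sketch of the two interpretations above (downcast the inputs versus downcast only the output; array values are arbitrary):
```
import numpy as np

arr = np.linspace(0, 1, 3)  # float64 inputs, just for illustration

# Fix every dtype in the signature: the float64 inputs are cast to
# float32 before the addition happens.
in_downcast = np.add(arr * 4, arr, signature=(np.float32,) * 3)

# Compute in float64 and only downcast the result.
out_downcast = np.add(arr * 4, arr).astype(np.float32)

# The two can differ by a rounding error, as in the example above.
print(in_downcast - out_downcast)
```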

seberg (Member, Author) commented May 1, 2021

I suppose we could special-case nextafter for integers to always go to doubles, if anyone thinks it helps with the transition.

I just realized that int32 cannot cast safely to float32, so np.nextafter(1, 2) will always return the double version, while np.nextafter(np.int16(1), np.int16(2)) will currently go to float32...
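
The safe-cast checks behind this observation can be verified directly (a small illustrative check):
```
import numpy as np

# int16 can be represented exactly in float32, int32 cannot, which is
# what decides whether the float32 nextafter loop is reachable.
print(np.can_cast(np.int16, np.float32))  # True
print(np.can_cast(np.int32, np.float32))  # False
```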

seberg (Member, Author) commented May 1, 2021

I checked astropy, and astropy.nddata.bitmask has two problems, but one is at the module level (trivial to fix, and the fix is even nicer code, but still).

I have to think a bit more. But one solution could be to make the meaning of signature=(None, None, dtype) identical to dtype=dtype, BUT if the first sweep using signature=(None, None, dtype) fails, we then still always also try signature=(dtype, dtype, dtype). In the future certain ufuncs could opt out, but most of the time that should be unproblematic.

That might even allow undoing the nextafter and division changes, although I think that's probably the right thing to do anyway; maybe it's easier to do the right thing in the future with new promoters...

EDIT: There are ufuncs (in SciPy) that have signatures like dl->d and dd->d. For such functions we might pick the dd->d loop with dtype="d" and certain positions of the stars, when the dl->d loop could be slightly better (but we have no way of matching that one). That is, probably for ufunc(float_arr, int_arr, dtype="d", casting="unsafe") the "less good" loop is chosen, unless SciPy adds a promoter (in the future) to customize the behaviour.
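
For reference, `np.ldexp` is a NumPy ufunc with the same kind of mixed-type loops; its registered signatures can be inspected the same way (illustrative only):
```
import numpy as np

# ldexp takes a floating-point mantissa and an integer exponent, so its
# loops are non-homogeneous (entries along the lines of 'di->d').
print(np.ldexp.types)
```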

mhvk (Contributor) commented May 1, 2021

@seberg - I read through it and I'll try to look at the code later. But to understand better, the current documentation for dtype reads:

dtype

    New in version 1.6.

    Overrides the dtype of the calculation and output arrays. Similar to signature.

This seems quite explicit, and points in the direction of being equivalent to signature=(type, type, type) (though clearly inaccurate!). How should it read with your changes?

seberg (Member, Author) commented May 1, 2021

Thanks. In the signature-normalization PR, I had actually modified it to signature=(None, None, dtype), except for certain ufuncs. Previously, it was "overrides the dtype of the calculation except for binary comparisons".

The old behaviour is roughly:

  • dtype=dtype behaves like signature=(dtype, dtype, dtype), except for binary comparisons.

Right now I made it:

  • dtype=dtype behaves like signature=(None, None, dtype), except for homogeneous-signature ufuncs (those using the type resolver), which turn signature=(None, None, dtype) back into signature=(dtype, dtype, dtype).

So my latest thought was to try signature=(None, None, dtype) first, but then also try signature=(dtype, dtype, dtype). That way, dtype forces the output(s) dtype and additionally tries forcing the input dtypes if no loop for the input dtypes is found.

Maybe I thought about it wrong though. I prefer the "output" interpretation (and, documentation or not, I think that is likely what most would guess).

If I revisit the previous change, the main issue is that binary comparisons behave slightly differently from other ufuncs. If we special-case them (for a bit), signature=(dtype, dtype, dtype) could be kept around, and dtype=object would be fully deprecated for binary comparisons.

The one thing I don't really like would be to keep the special case that dtype=dtype cannot be normalized into signature=something.

EDIT: Sorry for being so confusing; the fact that the old code paths vary quite a bit from ufunc to ufunc is confusing to me as well. One further note: a reason I liked the change to dtype meaning signature=(None, None, dtype) (outputs fixed) is that it doesn't collide with ufuncs like np.ldexp (and some SciPy ones) that include type signatures like dl->d.
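
To make the two spellings concrete, a minimal illustrative comparison (values chosen arbitrarily):
```
import numpy as np

x = np.arange(3)          # integer input
y = np.linspace(0, 1, 3)  # float64 input

# Fix only the output dtype; the inputs still take part in resolution.
r1 = np.add(x, y, dtype=np.float64)

# Fix every operand explicitly; the integer input is cast to float64.
r2 = np.add(x, y, signature='dd->d')
```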

mhvk (Contributor) commented May 1, 2021

OK, that helps at least get my thoughts a bit in order!

But let me try to think it through a bit more, and see what strictly following the docstring (which says dtype is the dtype of the output and of the calculation) would imply. For comparisons, the dtype argument then can only be bool or object, where the latter means you get an array of Python bools. I agree that the loop then logically has to be looked up on the inputs, and that the casting argument should not matter for your example of np.not_equal(None, False); if one wanted to cast the inputs to bool, one would have to pass in signature='??->?', casting='unsafe'.
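
In other words, the fully explicit spelling of "cast the inputs" would be something like the following sketch (whether the object-to-bool cast is accepted depends on the casting rules of the NumPy version at hand):
```
import numpy as np

# Explicitly request the boolean loop and allow the inputs to be cast
# to bool; this is the opt-in version of the old implicit behaviour.
np.not_equal(None, False, signature='??->?', casting='unsafe')
```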

But with this in mind, I am confused by your example in #18880 (comment):

```
np.add(arr*4, arr, dtype=np.float32) - np.add(arr*4, arr).astype(np.float32)
# array([0.000000e+00, 0.000000e+00, 9.536743e-07], dtype=float32)
```

This seems expected: in one case the user explicitly asks for float32 output (and calculation), and given that there is no 'dd->f' loop, I think that implies the inputs need to be cast. I would tend not to try to fix that, mostly because I think the opposite case is more important. For instance, if a user does:

```
np.multiply(a, b, dtype=float64)
```

I think the intent would be clear: I want my calculation done in float64 even if a and b are both float32 (maybe because I expect infinities otherwise).

p.s. I think np.nextafter is a special case: essentially, one should allow unsafe casting of the second argument to that of the first, and the output dtype has to be equal to the input dtype. Definitely not worth changing the regular rules for!
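
A small sketch of the np.multiply intent described above (assuming float32 values large enough to overflow; purely illustrative):
```
import numpy as np

a = np.float32(3e38)
b = np.float32(3e38)

print(np.multiply(a, b))                    # inf: the float32 loop overflows
print(np.multiply(a, b, dtype=np.float64))  # ~9e76: computed in float64
```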

seberg (Member, Author) commented May 1, 2021

@mhvk Yeah, I have to think about it more and maybe map out the space of possibilities a bit more clearly. One thing about the add example: I am considering a situation where we have additional loops:

```
ff->f
dd->f
dd->d
ff->d
```

If the user says dtype=f, we could in theory pick the dd->f loop or the ff->f loop depending on the input dtypes. It is trickier in practice, since you need to define a clear rule, which could be either: use the common dtype for all unspecified dtypes, OR use the specified dtype for all loop dtypes (or both, in a given order, preferring either ff->f or dd->f when the input is df->f! – with the exclamation mark meaning that the last f is truly fixed).
EDIT: This rule only needs to be applied if the inputs are not already dd or ff, of course.

In either case, the np.multiply(a, b, dtype=float64) example can never choose the float32 loop, because the output dtype must strictly be float64 (unless you also provide out=float32_arr).
It could choose the hypothetical ff->d loop, but if someone adds such a loop, they should guarantee that the operation happens in float64.

(Not sure it helps, but the main issue is about cases where it seems like it should be obvious which loop to choose, but I am not sure how to choose it without relying on the fact that loops are currently ordered.)

If only output dtypes are provided, we add the additional step
in the ufunc type resolver of converting a signature of the format::

    (None,) * nin + (DType,) * nout

into::

    (DType,) * (nin + nout)

And try searching for a matching loop again.  The advantage of this
is that IF the ufunc has mixed-type inputs, we have a fighting
chance of finding a good loop for `dtype=DType`.
Further, future `promoters` could refine such logic, while at the
same time, we do not need to use completely different code paths
for `signature=...` and `dtype=...`.

The rare downside is that we might pick a `dd->d` loop when a
`dl->d` loop would do (SciPy has such ufuncs).
The upside is that, before, using `dtype=...` was not even possible
for functions with non-homogeneous loops like `np.ldexp`.

The future will hopefully see more precise promotion logic, and some
of the more surprising paths simply deprecated...
seberg (Member, Author) commented May 3, 2021

OK, I think I have mostly come around to the solution of modifying the meaning of dtype=DType, but falling back to the old behaviour, partly because that is sometimes useful and mostly for backward compatibility.

That means that if only output dtypes are provided in the signature, we add an additional step in the ufunc type resolver that converts a signature of the form:

```
(None,) * nin + (DType,) * nout
```

into:

```
(DType,) * (nin + nout)
```

and we try searching for a matching loop again. This falls back to the old behaviour unless a loop is found, which presumably should return a correct result. (After all, all inputs can safely cast to the loop.)

The main advantage of this is that if the ufunc has mixed-type inputs, we have a fighting chance of finding a good loop for dtype=DType. And the way this is done could be customized in the future.

This is in contrast to the main alternative, which is translating dtype=DType directly into signature=(DType,) * (nin+nout). If we do that, the dtype argument is completely useless for ufuncs without homogeneous type signatures. (Such ufuncs are fairly rare; e.g. comparisons qualify but are special, division does for timedelta (although timedelta is a bit special), and np.ldexp and some other functions mix integer arguments with floats.)

There are some disadvantages, of course:

  • The loop that gets picked becomes less obvious, since we might pick the homogeneous loop when a different loop arguably "fits better".
  • In the future, this logic will have to be duplicated by the promoters of the ufunc. Those can also refine the behaviour, but will likely need to deprecate other parts.
  • It slightly stretches the trust we put into "additional loops". We trust that nobody will add an ff->d loop that does the computation in single precision. (A dd->f loop that does the computation in double and then downcasts is interesting, as it would improve precision.) I think this is a gamble I am willing to make; quite frankly, less precise loops are bound to crop up if we make it easier to replace/augment NumPy loops (e.g. fastmath flags). Hopefully we can channel such problems into new ufuncs rather than into "augmentation" of existing ones...

I can be convinced to go the signature=(DType,) * (nin+nout) route, but once we are there, it might be fairly tricky to relax things again.
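
The resolution order being discussed can be summarized with a rough Python sketch (illustrative pseudocode, not the actual C implementation; `find_loop` is a hypothetical stand-in for the real loop lookup):
```
def resolve_with_dtype(ufunc, requested_dtype, find_loop):
    """Sketch of the two-step lookup for ``dtype=DType``."""
    # Step 1: fix only the outputs and let the inputs drive the match.
    partial = (None,) * ufunc.nin + (requested_dtype,) * ufunc.nout
    loop = find_loop(partial)
    if loop is None:
        # Step 2 (fallback): fix every operand to the requested DType,
        # which recovers the old homogeneous behaviour.
        homogeneous = (requested_dtype,) * (ufunc.nin + ufunc.nout)
        loop = find_loop(homogeneous)
    return loop
```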

Also, for now enforce *all* outputs. That is more conservative, and
obviously won't work for certain functions in scipy that have multiple
outputs of different types, but it seems a bit safer to me right now.

Maybe at some point we have an idea of how to "modify" what the `dtype`
means.  I am not sure how!  By allowing the ufunc itself to interpret
how to translate it to a `signature`?  By giving it a distinct meaning?

Yes, this might narrow us down a bit... but...
seberg (Member, Author) commented May 3, 2021

I am sorry about adding a "slight" new change: I now enforce all outputs for dtype=DType (in the case of multiple output arrays).

In theory, SciPy has some ufuncs in scipy.special which have multiple outputs of different DTypes, so those functions would still be blocked from using dtype= for anything useful. But this seemed like the most reasonable trade-off to me right now, unless we had some idea how to decouple dtype from signature entirely, and I don't have such an idea.
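
For a NumPy example of the multiple-output situation, `np.frexp` returns a mantissa and an integer exponent, so its outputs necessarily have different dtypes (shown purely as an illustration of why enforcing all outputs can rule out a loop):
```
import numpy as np

# frexp's loops have the form float -> (float, int), e.g. 'd->di', so
# no single dtype can match both outputs at once.
print(np.frexp.types)
```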

seberg removed the "56 - Needs Release Note" label May 3, 2021
@rossbar rossbar self-requested a review May 3, 2021 21:53
rossbar (Contributor) left a comment
Having read through the discussion, I went through the changes to see how well I understood the intent and to what degree the documentation updates made that intent clear. I had a couple of questions/suggestions for sentences that could be polished, but generally I think the documentation describing the change looks good!

Comment on lines 14 to 15
Since NumPy special cases if only outputs (or ``dtype``) is provided,
this should affect very few users.
rossbar (Contributor):

I interpreted this to mean "The behavior change only affects partial signatures where at least one input type is provided, so the change should affect very few users". I'm not sure that my interpretation is correct though.

seberg (Member, Author):

Yeah, that is what I meant, and I think it is probably true :). It definitely is the mental model I want users to use for now (which will immediately tell them, that practically nobody will notice).

rossbar (Contributor):

👍 in that case it might be worth rewording this sentence since IMO it requires a bit of extra context to correctly interpret the "special casing" here. It could just be me though :)

doc/source/reference/ufuncs.rst (outdated; resolved)
@@ -553,6 +553,11 @@ PyUFunc_SimpleUniformOperationTypeResolver(
* This is a fast-path, since all descriptors will be identical, mainly
* when only a single descriptor was passed (which would set the out
* one in the tuple), there is no need to check all loops.
* Note that this also allows (None, None, float64) to resolve to
* (float64, float64, float64), even when the inputs do not match,
* i.e. fixing a single part of the signature can fix all of them.
rossbar (Contributor):

Should this say "i.e. fixing the output of the signature can determine the dtype(s) for the rest of the signature"? Not that this would ever happen, but the current wording makes it seem like (None, float64, None) would also fall back to (float64, float64, float64).

seberg (Member, Author):

Yeah, you are right here. For the other changes, this is not what happens, but for the simple-uniform one, that is actually what I coded... I don't mind that behaviour, but there is also no real reason for it. Let me modify the code to align with the rest and only apply the "uniform" expansion to outputs.

@mattip mattip added this to In Progress in Ufunc refactor May 5, 2021
mhvk (Contributor) left a comment

I looked again and think the code and description are clear now. Only some totally nitpicky comments about docstrings and code comments.

to no loop being found in some cases, NumPy will normally also search
for the loop::

signature("float64", "float64, "float64")
mhvk (Contributor):

Missing end double quote on second item.


This argument allows the user to specify exact DTypes to be used for the
calculation. Casting will be used as necessary. The input DType
is not considered unless ``signature`` is ``None`` for that input.
mhvk (Contributor):

Maybe "The actual DType of the input arguments is not considered ..." (otherwise, it is a bit unclear what "input DType" means)

seberg (Member, Author):

Adopted (replaced "argument" with "array" also).

operation. It does not specify the ``datetime64`` time-unit or the
``float64`` byte-order.

For backwards compatibility this argument can also be provided as *sig*,
mhvk (Contributor):

Out of scope here, but time to remove sig, I'd think!

seberg (Member, Author):

Unfortunately, we never deprecated it with a warning. But maybe that is about time...

out_dtype, types, NULL);
out_dtype, orig_types, NULL);
/*
* In principle, we only need to validate the
mhvk (Contributor):

Does this mean that in future we could simplify this? If so maybe add "TODO" in front?

seberg (Member, Author):

My TODO here is "replace practically everything in this file and then delete it". It is a bit unfortunate, but to get there, I have to fix it first...

mhvk (Contributor):

😄 Yes, better to evolve using small steps...

/* Error */
case -1:
return -1;
/* Found matching loop */
mhvk (Contributor):

Align same as /* Error */

for (i = 0; i < self->ntypes; ++i) {
char *orig_types = self->types + i*self->nargs;

/* Check specified types and copy into an int array for matching */
mhvk (Contributor):

Maybe note this code appears twice and could be put into a small function?

seberg (Member, Author):

Will add a note that it's duplicated. It's not quite trivial to refactor, because orig_types is char * in one case and int * in the other.

types[j] = orig_types[j];
}
set_ufunc_loop_data_types(self, op, out_dtype, types, NULL);
/* In principle, we only need to validate the NPY_NOTYPE ones */
mhvk (Contributor):

As above, is this meant to be a TODO?

Comment on lines 3 to 4
The behaviour for ``signature=...`` and ``dtype=...`` can differ in
some cases to the previous behaviour.
mattip (Member):
Suggested change
The behaviour for ``signature=...`` and ``dtype=...`` can differ in
some cases to the previous behaviour.
The behaviour for ``np.ufunc(1.0, 1.0, signature=...)`` and ``np.ufunc(1.0, 1.0, dtype=...)``
can yield different loops in 1.21 than in 1.20 because of changes in promotion.

seberg (Member, Author):

Adopted as:

The behaviour for np.ufunc(1.0, 1.0, signature=...) or
np.ufunc(1.0, 1.0, dtype=...) can now yield different loops in 1.21 compared
to 1.20 because of changes in promotion.

@mattip mattip moved this from In Progress to Ready for final review in Ufunc refactor May 6, 2021
In that case, it is necessary to provide the complete signature
to enforce casting the inputs.
If ``dtype="float64"`` is used or only outputs are set (e.g.
``signature=(None, None, "float64")`` the behaviour should remain
mattip (Member):

Suggested change
``signature=(None, None, "float64")`` the behaviour should remain
``signature=(None, None, "float64")`` the is

seberg (Member, Author):

adopted, thanks

mattip (Member) left a comment

A few nits, but otherwise LGTM.

Co-authored-by: Matti Picus <matti.picus@gmail.com>
charris (Member) commented May 7, 2021

close/reopen

@charris charris closed this May 7, 2021
@charris charris reopened this May 7, 2021
@seberg seberg requested a review from mattip May 9, 2021 18:18
seberg (Member, Author) commented May 13, 2021

Can we put this in so that I can do the ugly deed of squash merging gh-18905?

@mattip mattip merged commit 5eb5802 into numpy:main May 14, 2021
mattip (Member) commented May 14, 2021

Thanks @seberg

@seberg seberg deleted the make-signature-resolve-more-strict-and-fix-nextafter branch May 14, 2021 14:32
@mattip mattip removed this from Ready for final review in Ufunc refactor May 20, 2021