Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add floating point error checking to (almost) all casts #21437

Merged
merged 20 commits into from Jun 14, 2022

Conversation

seberg
Copy link
Member

@seberg seberg commented May 3, 2022

This, unfortunately larger, PR tries to add floating point error checking to all places where this is necessary. Unfortunately, there are a few places, the most complicated one being the advanced indexing (simply because there are many branches – note that only assignments are truly relevant).

Doing this also required threading it into the NpyIter, the way this is used from elsewhere is a cludge (due to being non-public API).

There are a few smaller problems (most of which we can hopefully ignore):

  • I did not fix ufunc.at, but that has its own problems anyway, so I am not too concerned about it.
  • The masked array code (due to the fill value) runs into these warnings now. It is easy enough to ignore it/silence it in the tests, but it is a bit unclear what should happen.
  • Probably something else I am missing

There is still some cleanup and maybe simplification necessary, so marking as draft. But this will close a few open issues, and it ensures the warning for np.float32(2e200).
So, I think it is both a pretty worthwhile thing, and is one of those warnings that I would like to have available when we/if we switch to "weak Python scalar" promotion (remove value based casting).

@seberg
Copy link
Member Author

seberg commented May 4, 2022

Hmm, looking at the failures:

  1. Travis failure seems like a gcc bug on power64le (I think I can reproduce it in godbolt). It generates an xvcmpgedp instruction. The "ge" in there stands for greater-equal. I could not find it, but a "ge" instruction should set the "invalid" warning flag. Replacing the code with != seems to fix it, we may want to do that platform specific. != doesn't seem good enough, this needs some other fix to get proper unordered/quiet comparisons.
  2. The PyPy issue seems OK, it looks like a variant of
    BUG: ufuncs that call into Python do not ignore FPEs e.g. for inf - inf #21416 and should probably just be ignored. (I did not look into whether there is also something slightly not-ideal going on in PyPy.)
  3. The last failure puzzles me right now. It seems like a serious issue somewhere since the tuple seems to be converted to an empty array, which is complete nonsense. valgrind did not point to anything strange close to these tests though. So for now, I am unsure what to look for. A reference counting issue, with needs_api being set incorrectly. A bit crazy that it did not crash far more things (although refcount checker would have shouted...).

EDIT: OK, So about 3, valgrind actually does point to an incorrectly free'd tuple (or a double free even), which would add up... Unfortunately, it only shows at interpreter shutdonw.

@seberg seberg force-pushed the fpe-in-cast branch 2 times, most recently from 0f48e16 to a6c07a7 Compare May 4, 2022 18:18
@seberg seberg force-pushed the fpe-in-cast branch 3 times, most recently from 63dd8c5 to c640bba Compare June 2, 2022 14:55
@seberg seberg marked this pull request as ready for review June 2, 2022 17:19
@seberg seberg added the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Jun 2, 2022
@seberg
Copy link
Member Author

seberg commented Jun 2, 2022

This should be ready for review now. I think it maybe should have a release not (so tagging). There are still one or two error paths that I will try to cover in tests, but overall I think I got it.

There are some tricky parts here. The first five commits may be good to go through on their own. But overall I will be happy to look at the changes together. The by far worst part I think is the advanced indexing...

EDIT: Hmmm, surprising the test fails only on windows 32bits, but seems that this may not be a fluke but a real refcounting (or cleanup timing/race condition?) issue somewhere. (Shouldn't matter too much for a first review though.)

@seberg seberg removed the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Jun 6, 2022
@mattip
Copy link
Member

mattip commented Jun 12, 2022

The code itself is a lot of changes to pass the resultant error state around and looks fine. I think it would be worthwhile to put this in now. Do you know if it breaks any scipy/astropy tests?

@seberg
Copy link
Member Author

seberg commented Jun 12, 2022

Thanks for the review and slashing down the release note. I would like a note somewhere that float to int casts may return undefined values, but its not really right in this note (it is unchanged anyway).

I have not tested downstream test-suites (I may look into that tomorrow or so). OTOH, I don't expect larger problems, and they would be fixable with np.errstate(invalid="ignore") in general. The one annoying point (IMO) is the combination of MaskedArray with float16, that might require fixups (but is also exceedingly rare probably).

seberg added 10 commits June 13, 2022 09:36
To achieve this, I first need to propagate the ArrayMethod information
that FPE flags _may_ be given for a specific cast.
This requires quite a bit of changes (which is arguably the more annoying
part of the PR/commit).
Note that for some "legacy" casts, we do not modify this and assume that
the only interesting information is `needs_api`.
unfortunately, I had to realize that float -> integer casts are not
well defined when values are out of range for the integer.
This is a problem with C, but means that neither the warnings seem
to be particularly well defined...
(I suspect, on my CPU it either warns OR gives "valid" integer overflow
results, but the question is if that is even universally true...)
The test meant to assert shapes, but didn't actually do it.
The remaining uncovered paths seem to me like code that should be
effectively unreachable.
(I do actually wonder if `PyArray_MapIterReset` should be removed.)
@seberg
Copy link
Member Author

seberg commented Jun 13, 2022

Just ran the SciPy 1.8.1 test-suite on this. It has 16 (related) failures, out of which 15 are all in the same test_distances file and seem to fail for the same integer_arr[something] = integer_arr[something] * np.nan assignment. So I suspect they should just deal with integer arrays better (especially since that assignment may be undefined behavior for some integer dtypes!).

EDIT: There are likely a lot more warnings elsewhere which are just not floated for a released version of SciPy!

@mattip
Copy link
Member

mattip commented Jun 13, 2022

I am a little hesitant to merge a PR that we know for sure will break SciPy tests. Could you open an issue that refers to this PR?

@seberg
Copy link
Member Author

seberg commented Jun 13, 2022

Not sure if a NumPy or SciPy issue. Happy to followup with either quickly though. I tried checking astropy. Not that the run went perfectly (too old or new pytest version), but it does not look like astropy is affected.

@mattip mattip merged commit abc26a3 into numpy:main Jun 14, 2022
@mattip
Copy link
Member

mattip commented Jun 14, 2022

Thanks @seberg

@seberg seberg deleted the fpe-in-cast branch June 14, 2022 13:23
matthew-brett added a commit to nipy/nibabel that referenced this pull request Jun 18, 2022
TEST: Suppress new numpy warning on nan-to-int cast

numpy/numpy#21437 does not require us to change our behavior, but does require us to change our expectations of what warnings are raised. Since this is quite an edge case, it seems to make more sense to suppress the warning and eventually expect it when numpy 1.24 becomes our minimum.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants