BUG: mysterious underflow/precision behavior, cannot create MRE due to bug? #23003

Closed
vsbuffalo opened this issue Jan 12, 2023 · 10 comments
Labels
33 - Question Question about NumPy usage or development

Comments

@vsbuffalo

Describe the issue:

This may not be a bug, but the behavior is mysterious and unclear to me. Part of the problem is that in trying to recreate an MRE I cannot.

The setup is I have two arrays I'm multiplying, one with very small numbers. I use pdb to drop into interactive mode at the exception. Poking around a bit, the error is happening at the multiplication step, and I can recreate it by taking the smallest non-zero value and multiplying it by the second array, e.g.

(Pdb) x[x>0].min() * W
*** FloatingPointError: underflow encountered in multiply

## or 

(Pdb) x[x>0].min() * 1e-10
*** FloatingPointError: underflow encountered in double_scalars

Ok, the behavior I would like is for this to just go to zero — I don't care about the loss of precision. But then I try to create an MRE to experiment with solutions. What is the minimum?

(Pdb) y =  x[x>0]
(Pdb) y.min()
9.117510882823816e-308

Then I try in a new Python session,

>>> import numpy as np
>>> np.array([9.117510882823816e-308], dtype='float64')
array([9.11751088e-308])
>>> f = np.array([9.117510882823816e-308], dtype='float64')
>>> f * 1e-10
array([9.11751e-318])
# No error

I then thought this was because of the loss of precision in printing the float. I wasn't sure how to handle this. I tried np.set_printoptions(precision=1000), but this seems to affect only how arrays are printed, not the np.float64 values as they're pulled out. So I next thought I could work with the raw data.

(Pdb) y.min().tobytes()
b'\xe4\x94\x10\xf0\xf6c0\x00'

Then in the second session,

$ python
Python 3.8.12 | packaged by conda-forge | (default, Jan 30 2022, 23:42:07)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.frombuffer(b'\xe4\x94\x10\xf0\xf6c0\x00', dtype='float64')
array([9.11751088e-308])
>>> f = np.frombuffer(b'\xe4\x94\x10\xf0\xf6c0\x00', dtype='float64')
>>> f * 1e-10
array([9.11751e-318])
# No error again! 

I find it odd that I can't create an MRE even with the buffer. What could be going on here?

I also tried using format strings,

(Pdb) f"{y:.10000000}"
'9.117510882823815665548977178863615628591324749590460596300089284407848522066006531932055383210337983249882939129197635693204683436837619342899485403616406753151023440802800570432302810469374755878262353843400205063415657530940618334968743282787718401379326462405248209351474798409859277445681925875157863523372638365924454046245513719836560774569949082153016343041264284375069892098945977461004851844859972461215974549935010425299501133344015649450602626874461701983236523291202172361897246010705343303184104772030271936525192961955598148050240545429201711268321687328347974353590798632145138753708909956222270239479037724670024066080647006736702136140242161352259516590652676898809691225616065391385488011421236917205938345143412249171888106502592563629150390625e-308'

Then, in the interactive python session I pasted this in

>>> ff = np.array([9.117510882823815665548977178863615628591324749590460596300089284407848522066006531932055383210337983249882939129197635693204683436837619342899485403616406753151023440802800570432302810469374755878262353843400205063415657530940618334968743282787718401379326462405248209351474798409859277445681925875157863523372638365924454046245513719836560774569949082153016343041264284375069892098945977461004851844859972461215974549935010425299501133344015649450602626874461701983236523291202172361897246010705343303184104772030271936525192961955598148050240545429201711268321687328347974353590798632145138753708909956222270239479037724670024066080647006736702136140242161352259516590652676898809691225616065391385488011421236917205938345143412249171888106502592563629150390625e-308], dtype='float64')
>>> ff
array([9.11751088e-308])
>>> ff * 1e-10
array([9.11751e-318])
# still no error!

So I'm at a loss as to how to create an MRE of this underflow (and how to get the behavior I see and want in the interactive session, that is, it just goes to zero). Apologies if I'm missing something.
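A minimal sketch of how the desired "just let it through" behavior could be requested locally, assuming (the post does not show this) that the surrounding code has enabled raising on underflow. The outer `np.errstate(under="raise")` block only simulates that assumed environment; the inner block is the part that would wrap the multiplication.

```python
import numpy as np

# Value copied from the Pdb session above.
f = np.array([9.117510882823816e-308], dtype="float64")

# Assumption: something in the full program has turned underflow into an error.
with np.errstate(under="raise"):
    # Local override: underflow is ignored inside this inner block only.
    with np.errstate(under="ignore"):
        result = f * 1e-10  # no exception; the result is a very small number

    # Back under the outer setting, the same multiplication raises again.
    try:
        f * 1e-10
        raised = False
    except FloatingPointError:
        raised = True

print(result, raised)
```

The context manager restores the previous setting on exit, so the override does not leak into the rest of the program.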

Reproduce the code example:

# Creating the MRE is part of the problem; see above

Error message:

No response

Runtime information:

Pdb session:

(Pdb) import sys, numpy; print(numpy.__version__); print(sys.version)
1.24.1
3.8.12 | packaged by conda-forge | (default, Jan 30 2022, 23:42:07)
[GCC 9.4.0]

Second session:

>>> import sys, numpy; print(numpy.__version__); print(sys.version)
1.24.1
3.8.12 | packaged by conda-forge | (default, Jan 30 2022, 23:42:07)
[GCC 9.4.0]

Context for the issue:

This is important because I cannot create a working MRE due to the bug. The behavior is different even though I used the raw bytes to try to recreate the example. It would be helpful too if there was documentation on how to create a proper MRE given precision/underflow issues.
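As a hedged sketch of what could be captured at the failure site when building an MRE for this kind of problem: the value itself can be carried over bit-exactly in several ways, and the floating-point error settings can be recorded with `np.geterr()`, since they are part of the state as well. The variable names are illustrative, and `value` is a stand-in for `y.min()`.

```python
import numpy as np

value = np.float64(9.117510882823816e-308)  # stand-in for y.min()

# Three bit-exact ways to carry the value into a fresh session.
as_repr = repr(float(value))   # shortest decimal string that round-trips
as_hex = float(value).hex()    # exact hexadecimal representation
as_bytes = value.tobytes()     # the raw 8 bytes

round_trips = (
    float(as_repr) == value
    and float.fromhex(as_hex) == value
    and np.frombuffer(as_bytes, dtype="float64")[0] == value
)

# The error-handling settings in effect at this point; record these too.
settings = np.geterr()
print(as_repr, as_hex, round_trips, settings)

# In the fresh session, the recorded settings can be replayed like this:
with np.errstate(**settings):
    pass  # run the candidate MRE here
```

If `settings` recorded at the failure site differs from what a fresh session reports, the difference is a likely reason the MRE behaves differently.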

@charris
Member

charris commented Jan 12, 2023

What platform are you on, and where did you get NumPy? Note that the underflow behavior can depend on the compiler and the compiler flags.

@vsbuffalo
Author

I'm on Mac OS 10.14.6 (18G9323), and I upgraded NumPy with conda/mamba while testing this, e.g. mamba update numpy. NumPy is from conda-forge.

@seberg
Member

seberg commented Jan 12, 2023

What is the result when you are getting the error? Is it 0? By default computers usually use subnormal numbers to fill the gap between np.finfo(np.float64).tiny and 0. Operations on those numbers are often slow, and they have less precision.

What sometimes happens is that a library other than NumPy changes this behavior globally, which then also affects NumPy (a global change is much easier, and switching the mode back and forth isn't especially fast either).
That is, they may enable FTZ ("flush to zero") behavior (there are different flavors), and the result would be exactly 0 (opencv probably does this, maybe pytorch too; I don't remember).

In that case I guess we may get an "underflow". I am surprised that you see an error, though, because by default NumPy ignores underflows. You would also have to use np.errstate(under="raise")!
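To make the subnormal range concrete, here is a small sketch (variable names are illustrative) that compares the product from the report against the smallest normal float64. It assumes the default mode with no flush-to-zero enabled, in which case the product lands in the subnormal range instead of becoming 0.

```python
import numpy as np

tiny = np.finfo(np.float64).tiny             # smallest positive normal float64
smallest_subnormal = np.nextafter(0.0, 1.0)  # smallest positive subnormal float64

x = np.float64(9.117510882823816e-308)       # value from the report
with np.errstate(under="ignore"):
    product = x * 1e-10

# Without FTZ, the product sits strictly between 0 and tiny.
is_subnormal = 0.0 < product < tiny
print(tiny, smallest_subnormal, product, is_subnormal)
```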

@rkern
Member

rkern commented Jan 12, 2023

It's possible that something in the full code is in fact doing np.seterr(under='raise'), and that would explain why he's not seeing a FloatingPointError when he tries to reduce the code down to a minimal example.

@rkern
Member

rkern commented Jan 12, 2023

>>> import numpy as np
>>> np.seterr(under='raise')
{'divide': 'warn', 'over': 'warn', 'under': 'ignore', 'invalid': 'warn'}
>>> f = np.array([9.117510882823816e-308], dtype='float64')
>>> f * 1e-10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FloatingPointError: underflow encountered in multiply

@seberg
Member

seberg commented Jan 12, 2023

Sorry, yeah... no need for FTZ mode or the like to get that warning/error. I must have made a typo when I was trying np.seterr(under="warn") to test the underflow reproducers. EDIT: Oops, use errstate of course :)!

@rkern
Member

rkern commented Jan 12, 2023

Basically, to explain the behavior, an underflow happens when the result is smaller in magnitude than 2.2250738585072014e-308, which is the smallest number that can be represented as a normal float64 number (np.finfo(np.float64).tiny). By default, we ignore underflows that are reported by the FPU since they are usually harmless in our contexts. This setting is controlled by np.errstate() (the well-behaved context manager that only changes the setting within its context) or np.seterr() (an older API for brute-force manipulation of the global settings). In the context of your debugging, you are either within an np.errstate(under='raise') context manager, or some other library has annoyingly changed the global settings. When you tried to reproduce with a minimal example, the reason you didn't get the same exception is very likely that you didn't also change the underflow error handling in the same way the full code does.
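A short sketch of the context-manager form described here, using the same value as the reproducer above. It checks `np.geterr()` before and after to show that `np.errstate()` leaves the global settings untouched; the variable names are illustrative.

```python
import numpy as np

f = np.array([9.117510882823816e-308], dtype="float64")

before = np.geterr()  # settings in effect before the block

with np.errstate(under="raise"):
    # Inside the block, an underflowing multiply raises.
    try:
        f * 1e-10
        raised_inside = False
    except FloatingPointError:
        raised_inside = True

after = np.geterr()  # the context manager has restored the earlier settings

print(raised_inside, before == after)
```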

I don't think there is a bug here.

@rkern added the "33 - Question" label (Question about NumPy usage or development) and removed the "00 - Bug" label on Jan 12, 2023
@seberg
Member

seberg commented Jan 13, 2023

It seems unlikely to be a bug, although there is a small possibility that this is a duplicate of gh-9444: np.errstate() is not asyncio safe, so in theory bad state could happen.

That said, it seems more likely that something uses seterr(all="raise") and fails to clean up correctly (it should use with np.errstate(...): instead).
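The difference between the two patterns can be sketched as follows. Both `leaky` and `tidy` are hypothetical stand-ins for third-party code, not functions from any real library.

```python
import numpy as np

def leaky():
    """Hypothetical library code: changes the global settings and never restores them."""
    np.seterr(all="raise")

def tidy():
    """Hypothetical library code: the change is confined to the with-block."""
    with np.errstate(all="raise"):
        pass  # computations that should raise on floating-point errors

original = np.geterr()

tidy()
after_tidy = np.geterr()   # same as original

leaky()
after_leak = np.geterr()   # every category is now set to 'raise'

np.seterr(**original)      # manual cleanup that leaky() should have done
restored = np.geterr()

print(after_tidy == original, after_leak, restored == original)
```

After a call like `leaky()`, any later underflow in unrelated code raises a FloatingPointError, which would match the symptom in the report.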

@seberg
Member

seberg commented Jan 16, 2023

Closing, as there is no follow-up and we have an issue open for the bug (if it is one). I would like to hear, though, if you find out that np.errstate() was being used properly but not doing the right thing, @vsbuffalo. That would help prioritize that issue.

@seberg closed this as completed on Jan 16, 2023
@vsbuffalo
Author

Thank you all for your help and replies! I agree that it seems likely that some other package has tinkered with the error settings, and it is okay to close this issue. Thank you for making numpy the amazing system that it is.
