-
-
Notifications
You must be signed in to change notification settings - Fork 5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: special.rel_entr: avoid premature overflow #20710
Comments
It is clearly desirable to avoid premature overflow. If x and y are greater than one, taking the log and then subtracting might reduce the accuracy, as the domain of logarithm is “larger” than its image. On the other hand, accuracy is not compromised if the arguments are less than one, which may be typical if rel_entr is applied to two probability values. |
I checked this idea experimentally, and in general, the division approach has better accuracy than the logarithm approach. With three exceptions, the The three exceptions:
Other than these three exceptions, the division approach is more accurate. Here's a comparison of the error of the two methods: (Note: Errors have been clipped at 95% to make the graphs readable. Some cells have thousands of ULP of error. See full notebook for details.) With that in mind, here is the logic I would propose:
The If you look only at the area where x < 1 and y < 1, which is the bottom-left quadrant of the graph, then the |
Very interesting analysis @nickodell ! I suspect the "clipped" values with large error could be related to To illustrate this, I tried a contrived example: import numpy as np
from mpmath import mp
mp.dps = 1000
x = 1234
y = np.nextafter(x, 100)
print(f'{np.log(y/x)=}')
print(f'{np.log(y) - np.log(x)=}')
print(f'{float(mp.log(y)-mp.log(x))=}')
print(f'{float(mp.log(mp.mpf(y)/x))=}')
print(f'{np.log1p((y-x)/x)=}') Output:
So for those two numbers that are really close, The contrived example shows that if we insist on the accuracy from a numerical perspective, there's a fair bit of work to do. |
That's a really smart idea. Here's another graph of function error for the log1p method. (I used So I could piece together a more accurate |
Describe your issue.
Looks like there's opportunity for improvement in
scipy.special.rel_entr
. Presumably, it takes the ratiox/y
before taking the logarithm because it's faster than taking the difference of two logs, but this can overflow when the correct result is finite. Perhaps a compromise is to check for the overflow and, in that case, take the time to compute the two logs separately.This came up gh-20673 as a Hypothesis counterexample rather than a real use case, which is why I consider it an enhancement request.
Reproducing Code Example
Error message
SciPy/NumPy/Python version and system information
The text was updated successfully, but these errors were encountered: