
ENH: special.rel_entr: avoid premature overflow #20710

Closed
mdhaber opened this issue May 14, 2024 · 4 comments · Fixed by #20816
Labels: enhancement (A new feature or improvement), scipy.special

mdhaber commented May 14, 2024

Describe your issue.

Looks like there's an opportunity for improvement in scipy.special.rel_entr. Presumably it takes the ratio x/y before taking the logarithm because that is faster than taking the difference of two logs, but the ratio can overflow even when the correct result is finite. Perhaps a compromise is to check for overflow and, only in that case, take the time to compute the two logs separately.

This came up in gh-20673 as a Hypothesis counterexample rather than a real use case, which is why I consider it an enhancement request.

Reproducing Code Example

import numpy as np
from scipy import special

x = np.asarray([2., 3., 4.])
y = np.asarray(np.finfo(np.float64).tiny)  # smallest normal float64
special.rel_entr(x, y)       # array([1418.17913143, 2128.48509246,           inf])
x * (np.log(x) - np.log(y))  # array([1418.17913143, 2128.48509246, 2839.13085157])

Error message

-

SciPy/NumPy/Python version and system information

main
mdhaber added the enhancement and scipy.special labels on May 14, 2024

fancidev commented May 25, 2024

It is clearly desirable to avoid premature overflow.

If x and y are greater than one, taking the log and then subtracting might reduce the accuracy, as the domain of logarithm is “larger” than its image. On the other hand, accuracy is not compromised if the arguments are less than one, which may be typical if rel_entr is applied to two probability values.

@nickodell

@fancidev

If x and y are greater than one, taking the log and then subtracting might reduce the accuracy, as the domain of logarithm is “larger” than its image. On the other hand, accuracy is not compromised if the arguments are less than one, which may be typical if rel_entr is applied to two probability values.

I checked this idea experimentally, and in general the division approach has better accuracy than the logarithm approach: with three exceptions, the x * log(x / y) approach has equal or better accuracy than the x * (log(x) - log(y)) approach.

The three exceptions:

  1. Overflow. If x / y == inf, then log subtraction is more accurate. (This is the lower-right white area of the graph below - here error is infinite for the ratio approach.)
  2. Underflow. If x / y == 0, then log subtraction is more accurate. (This is the upper-left white area of the graph below - here error is infinite for the ratio approach.)
  3. Subnormal numbers. If x / y is a subnormal number, then log subtraction is more accurate. (This is the diagonal band of yellow squares above the main diagonal. This has finite error, but is still quite inaccurate.)

Other than these three exceptions, the division approach is more accurate.
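The overflow and underflow cases are easy to reproduce directly (a small standalone demo of my own, not taken from the notebook; float64 throughout):

```python
import numpy as np

tiny = np.finfo(np.float64).tiny  # smallest normal float64, ~2.225e-308

# Case 1: overflow -- x / y rounds to inf, so x * log(x / y) is inf
x, y = 4.0, tiny
assert np.isinf(x / y)
print(x * (np.log(x) - np.log(y)))  # finite, ~2839.13

# Case 2: underflow -- x / y rounds to exactly 0, so x * log(x / y) is -inf
x, y = tiny, 1e20
assert x / y == 0.0
print(x * (np.log(x) - np.log(y)))  # finite (a small negative number)
```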

Here's a comparison of the error of the two methods:

[Graph: per-cell ULP error of the ratio method, x * log(x / y)]

[Graph: per-cell ULP error of the log-difference method, x * (log(x) - log(y))]
(Note: Errors have been clipped at 95% to make the graphs readable. Some cells have thousands of ULP of error. See full notebook for details.)

With that in mind, here is the logic I would propose:

# assume all of the corner cases have already been handled
ratio = x / y
if ratio == np.inf or ratio < np.finfo(np.float64).tiny:
    return x * (np.log(x) - np.log(y))
else:
    return x * np.log(ratio)

The ratio == inf check handles case 1. The ratio < np.finfo(np.float64).tiny check handles cases 2 and 3.

If you look only at the area where x < 1 and y < 1, which is the bottom-left quadrant of the graph, then the x * log(x / y) approach still has equal or better accuracy than the x * (log(x) - log(y)) approach.
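For reference, a vectorized version of this branch logic might look like the following (my own sketch, not code from the notebook; `rel_entr_np` is a hypothetical name, and x > 0, y > 0 is assumed so the usual corner cases are handled elsewhere):

```python
import numpy as np

def rel_entr_np(x, y):
    # Hypothetical vectorized helper (my naming); assumes x > 0 and y > 0,
    # i.e. the usual corner cases of rel_entr are handled elsewhere.
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    tiny = np.finfo(np.float64).tiny
    with np.errstate(over='ignore', divide='ignore'):
        ratio = x / y
        direct = x * np.log(ratio)               # fast path
        fallback = x * (np.log(x) - np.log(y))   # safe path
    bad = np.isinf(ratio) | (ratio < tiny)       # over/underflow or subnormal ratio
    return np.where(bad, fallback, direct)

x = np.array([2., 3., 4.])
y = np.finfo(np.float64).tiny
print(rel_entr_np(x, y))  # all finite, matching x * (np.log(x) - np.log(y))
```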

Notebook source

@fancidev

Very interesting analysis @nickodell! I suspect the "clipped" values with large error could be related to x and y being close.

To illustrate this, I tried a contrived example:

import numpy as np
from mpmath import mp
mp.dps = 1000
x = 1234
y = np.nextafter(x, 100)
print(f'{np.log(y/x)=}')
print(f'{np.log(y) - np.log(x)=}')
print(f'{float(mp.log(y)-mp.log(x))=}')
print(f'{float(mp.log(mp.mpf(y)/x))=}')
print(f'{np.log1p((y-x)/x)=}')

Output:

np.log(y/x)=-2.2204460492503136e-16
np.log(y) - np.log(x)=-8.881784197001252e-16
float(mp.log(y)-mp.log(x))=-1.842574355293615e-16
float(mp.log(mp.mpf(y)/x))=-1.842574355293615e-16
np.log1p((y-x)/x)=-1.842574355293615e-16

So for those two numbers that are really close, log(y/x) is off by 20%, log(y)-log(x) is off by 380%, and log1p gives the correct answer.

The contrived example shows that if we insist on the accuracy from a numerical perspective, there's a fair bit of work to do.

@nickodell

That's a really smart idea. Here's another graph of function error for the log1p method. (I used np.log1p((x - y) / y) rather than np.log1p((y-x)/x), though.)

[Graph: per-cell ULP error of the log1p method, x * log1p((x - y) / y)]

So I could piece together a more accurate rel_entr() from three sources: log1p for the diagonal, where x and y are very close; the log difference where there is under/overflow; and the ratio everywhere else.
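A rough scalar sketch of that piecing-together (my own illustration, not the eventual PR's code; `rel_entr_sketch` is a hypothetical name, the 0.5–2 window for the log1p branch is an arbitrary illustrative threshold, and x > 0, y > 0 is assumed):

```python
import numpy as np

def rel_entr_sketch(x, y):
    # Hypothetical helper: assumes x > 0 and y > 0 (corner cases handled elsewhere).
    ratio = x / y
    if np.isinf(ratio) or ratio < np.finfo(np.float64).tiny:
        # over/underflow, or a subnormal ratio: fall back to the log difference
        return x * (np.log(x) - np.log(y))
    if 0.5 < ratio < 2.0:
        # x and y are close: log1p avoids the cancellation seen in log(ratio)
        return x * np.log1p((x - y) / y)
    return x * np.log(ratio)
```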
