Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: int by numpy.int64 division incorrect for 2**63 < dividend < 2**64 #22457

Closed
ceu1e opened this issue Oct 20, 2022 · 7 comments
Closed

BUG: int by numpy.int64 division incorrect for 2**63 < dividend < 2**64 #22457

ceu1e opened this issue Oct 20, 2022 · 7 comments
Labels
15 - Discussion 57 - Close? Issues which may be closable unless discussion continued

Comments

@ceu1e
Copy link

ceu1e commented Oct 20, 2022

Describe the issue:

Dividing (e.g.) (2**64-1) // 3 works as expected if 3 is an int, but produces a float result if 3 is a numpy.int64.
Consequently, the corresponding modulus calculation produces an incorrect result (also float).

This seems to occur just for dividends in the range of 2**63 to 2**64.

Division of int by numpy.int64 should return the correct integer result for the full range of dividend values.

Reproduce the code example:

import numpy

py3 = 3
npy3 = numpy.int64(3)
n1 = 2**64-1
n2 = 2**63+1
n3 = 2**64+2
n4 = 2**63-2
n5 = 2**128-1
n6 = 2**128+2
print(f"n1 // py3 = {n1 // py3}, n1 % py3 = {n1 % py3}, n1 // npy3 = {n1 // npy3}, n1 % npy3 = {n1 % npy3}")
print(f"n2 // py3 = {n2 // py3}, n2 % py3 = {n2 % py3}, n2 // npy3 = {n2 // npy3}, n2 % npy3 = {n2 % npy3}")
print(f"n3 // py3 = {n3 // py3}, n3 % py3 = {n3 % py3}, n3 // npy3 = {n3 // npy3}, n3 % npy3 = {n3 % npy3}")
print(f"n4 // py3 = {n4 // py3}, n4 % py3 = {n4 % py3}, n4 // npy3 = {n4 // npy3}, n4 % npy3 = {n4 % npy3}")
print(f"n5 // py3 = {n5 // py3}, n5 % py3 = {n5 % py3}, n5 // npy3 = {n5 // npy3}, n5 % npy3 = {n5 % npy3}")
print(f"n6 // py3 = {n6 // py3}, n6 % py3 = {n6 % py3}, n6 // npy3 = {n6 // npy3}, n6 % npy3 = {n6 % npy3}")

Error message:

No response

NumPy/Python version information:

1.23.4 3.10.8 (tags/v3.10.8:aaaf517, Oct 11 2022, 16:50:30) [MSC v.1933 64 bit (AMD64)]

Context for the issue:

No response

@ceu1e ceu1e added the 00 - Bug label Oct 20, 2022
@LeonaTaric
Copy link
Contributor

well, how do you find the bug? (Such as "In unit test")

@mattip
Copy link
Member

mattip commented Oct 20, 2022

When using numpy your values are automatically cast to numpy scalars. The casting cannot follow the python rules about ints (no value is too big, silently convert to bignum form where the int is represented as a tuple of int values) and instead must use numpy rules. 2**64 -1 is too big for a np.int64, so the expression (2 **64 - 1) // np.int64(3) becomes np.double(2 ** 64 - 1) // np.int64(3)`.

@mattip mattip added 15 - Discussion 57 - Close? Issues which may be closable unless discussion continued and removed 00 - Bug labels Oct 20, 2022
@seberg
Copy link
Member

seberg commented Oct 20, 2022

Closing as duplicate of gh-12525 and many others. There are two "problems", one is our weird promotion, the other is that we use uint64 to begin with...

@seberg seberg closed this as completed Oct 20, 2022
@ceu1e
Copy link
Author

ceu1e commented Oct 20, 2022

Hmmmmm...... strange. Dividing bigger ints (> 2**64) by np.int64 operates nicely and smoothly and absolutely as expected: it returns an int with the exact result, mod operates fine as well and it is MUCH faster than the corresponding int/int divisions when your number fits in (which is why I love numpy ints for!).

If the int is smaller than 2**63, it is converted to np.int64 and returns exact results as well (and again much faster than the int by int64 division.

It is just the range from 263 to 264 where numpy decides to convert to float and completely looses the exactness. Since I'm interested in number theory, I absolutely need exact integer results... and my tests with numpy looked VERY promising...... until I factorized all numbers 2n-1, n=1...100 and got the correct results, EXCEPT for 264-1 where I got garbage. It took me a while to figure out what happened, and so I logged it as bug (which it still is in my eyes).

My best guess how to fix it was converting numbers > 263 (instead of > 264) to int and return the exact result (as for numbers > 2**64). This would create a consistent user experience... you always get exact results for int by int64 divisions and mods and really fast applications for number theory based on numpy would become possible.

Would something like this be possible?

@mattip
Copy link
Member

mattip commented Oct 20, 2022

Dividing bigger ints (> 2**64) by np.int64 operates nicely and smoothly and absolutely as expected

@seberg mentioned the real reason above

>> np.array(2**100).dtype
dtype('O')
>> np.array(2**50).dtype
dtype('int64')
>> np.array(2**64 - 1).dtype
dtype('uint64')

When you have object dtype (because the value overflows 64 bits), the division is done via python semantics, python successfully converts the np.int64(3) divisor to 3 and returns a python int.

When you have int64 divided by int64, you get back an int64.

When you have uint64 divided by uint64 there is no good common dtype, so you get back a float.

Since I'm interested in number theory

NumPy is an array processing library with C semantics, and is built for speed. Number theory is not a good fit for something that exposes many of the warts of C. Pure python on the other hand does have good integer arithmetic semantics. Why use NumPy in this case?

@mattip
Copy link
Member

mattip commented Oct 20, 2022

Also see NEP 50

@mattip
Copy link
Member

mattip commented Oct 20, 2022

If it is speed you want, and pure python code is what you are running, PyPy may work for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
15 - Discussion 57 - Close? Issues which may be closable unless discussion continued
Projects
None yet
Development

No branches or pull requests

4 participants