Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Differences from paper #119

Open
simonbyrne opened this issue Oct 2, 2019 · 6 comments
Open

Differences from paper #119

simonbyrne opened this issue Oct 2, 2019 · 6 comments

Comments

@simonbyrne
Copy link

I noticed there are some differences in the constants from what the paper would suggest. For example,

ryu/ryu/d2s.h

Lines 40 to 41 in 5c36146

#define DOUBLE_POW5_INV_BITCOUNT 122
#define DOUBLE_POW5_BITCOUNT 121

whereas the paper suggests these should both be 124 (Figure 4). Why the difference?

@ulfjack
Copy link
Owner

ulfjack commented Oct 3, 2019

Hi Simon!

That is an excellent question. The algorithm in the paper returns a conservative value for the required number of bits, not a tight bound. I have been trying to come up with tighter bounds, and I believe these are safe, though I don't have a published proof that they are. You might think that tighter bounds aren't necessary, as long as we're below 128 bits. However, tighter bounds allows us to use smaller lookup tables because we can trade bits for table size. Unfortunately, I didn't have time yet to implement that.

Makes sense?

-- Ulf

@simonbyrne
Copy link
Author

Okay, thanks!

@mrmixer
Copy link

mrmixer commented Dec 18, 2020

Hi,
I'm trying to implement a simple version of the algorithm. My goal is to have a single file header (something like the stb libs that I can include or copy paste in any small project I'm working on (one of which I would like to publish in the public domain). So I care more about small amount of code, with no dependency (I would like it to compile without the CRT) and no license than performances. That was just for context.

I'm trying to implement the "step 0" for chapter 3.4 from the paper, more precisely the part where we build a lookup table for exponents less than 0. I'll make it clear that I'm not good at math and although I spent a lot of time to understand the math up to chapter 3.2, I'm not at ease with a lot of it.

From Ryū: fast float-to-string conversion:

For each possible value of e2 that is less than zero, we determine q as max( 0, ⌊−e2 log5 2⌋ − 1) (Lemma 3.2), and then use B1 to determine a legal value for k as (⌈log2 5q⌉ −B1) (Lemma 3.4). We compute ⌊5−e2−q/2k⌋ using arbitrary precision arithmetic - this result has no more than B1 bits - and store it in a lookup table TABLE_LT indexed by −e2 − q.

I tried that (for 64bit floating point numbers) but the values to store in the look up table seem to be more than 124 bits. So I guess I'm doing something wrong (the result for positive exponent seem to fit in 124 bit which confuses me even more).
For example:
If e2 is -1023 (the exponent in the number was 0, minus the bias 1023) and B1 = 124, than we get
q = floor( 1023 * log5(2) ) - 1 = 439
k = ceil( log2( 5^q ) ) - 124 = 1020 - 124 = 896
result = floor( ( 5^(1023-q) ) / ( 2^k ) ) = 1,5793650827938261354e+408 / 5,28294531135665246352e+269 = 2,98955410232752824319e+138

And result doesn't fit into 124 bits.

What am I doing wrong ?

@rick-masters
Copy link

@mrmixer
I know it has been a long time since your question, but I'm going to answer it anyway in case you still care or perhaps someone else will find it useful. I also think Mr. Adams should be aware of the issues raised below.

The problem with your calculation is that you've made a couple of mistakes and you are using formulas which, although taken directly from the paper, are inconsistent with other parts of the paper and the code. I believe I can demonstrate the "right" formulas.

I should first note that I'm just an amateur lurker and do not speak for Mr. Adams.
I base the following formulas and calculations based on what is in the Java and C code, which I consider to be the actual "source of truth".

I will use python3 to perform the calculation.
To follow along you can enter the commands I have prefixed by the '>>>' python3 prompt.
Start by importing the math library.

$ python3
Python 3.8.10 (default, May 26 2023, 14:05:08) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import math

The 64-bit floating point exponent bias is 1023.

>>> BIAS = 1023

First, I'm going to assume that you want to perform the calculation for a very small number -
with the exponent encoded as zero, which is the smallest possible encoded exponent.

>>> e = 0

Note that if e == 0 then the decoded number will have an exponent of (1 - bias) not (e - bias) from your example.

>>> exp = 1 - BIAS if e == 0 else e - BIAS

Following the paper, we calculate ef by subtracting len(m) from the adjusted exponent. (You forgot this.)
len(m) is the length of the mantissa in bits.

>>> MANTISSA_BITS = 52
>>> ef = exp - MANTISSA_BITS

Then we subtract 2 from ef to get e2, which is -1076 in this example.

>>> e2 = ef - 2
>>> print(e2)
-1076

The rest of this analysis assumes e2 < 0.

Now we compute q. Here is where the paper is inconsistent and we need to consult the code (see below) to resolve this.
The "wrong" formula is in both Lemma 3.4 and section 3.4: q = ⌊-e2 * log₅2⌋
The "correct" (but incomplete) formula appears to be found in Lemma 3.2: q = ⌊-e2 * log₁₀5⌋

This inconsistency is also reported in Issue 202 (no response):
#202

We incorporate aspects of the formula from the code and from section 3.4: q = max(0, ⌊-e2 * log₁₀5⌋ - 1)
And we match the optimization from the following C code to subtract 1 only if -e2 > 1:

ryu/ryu/d2s.c

Lines 158 to 159 in cc41df9

// This expression is slightly faster than max(0, log10Pow5(-e2) - 1).
const uint32_t q = log10Pow5(-e2) - (-e2 > 1);

Here is the final formula for q:

>>> q = math.floor( -e2 * math.log(5, 10) ) - (1 if -e2 > 1 else 0)
>>> print(q)
751

Now we compute k. Again, the paper is inconsistent. The code calculates it here:

ryu/ryu/d2s.c

Lines 161 to 162 in cc41df9

const int32_t i = -e2 - (int32_t) q;
const int32_t k = pow5bits(i) - DOUBLE_POW5_BITCOUNT;

So we see it is ⌈log₂ (5 ^ (-e2 - q))⌉ - B1.

An inconsistent formula is given in Section 3.4 in both Step 0' and Step '3: ⌈log₂ (5 ^ q)⌉ - B1
The 5^q term can also be found in Lemma 3.1 for the multiplier where it is inconsistent with the multiplier 5^(-e2 -q) found in Section 3.2, which I believe to be correct. So, perhaps Lemma 3.1 was the origin of the inconsistent term which was then carried over to the formula for k in section 3.4.

Note that B1 is 125 in the code but it is 124 in the paper. I'm not sure why there is a difference by I doubt it is important.

>>> B1 = 125
>>> i = -e2 - q
>>> print(i)
325
>>> k = math.ceil( math.log( 5 ** i , 2) ) - B1
>>> print(k)
630

Now we can calculate the multiplier: 5^(-e2 - q) / 2^k

To stick with integer math, we avoid negative exponents:

>>> if k > 0:
...     multiplier = (5 ** i) // ( 2 ** k )
... else:
...     multiplier = (5 ** i) * ( 2 ** -k )
...
>>> print(multiplier)
32836294410387009994688234313321054992

Let's print the upper and lower bits and the length of the multiplier in bits:

>>> print("multiplier >> 64 is %d" % (multiplier // (2 ** 64)))
multiplier >> 64 is 1780059086805761106
>>> print("multiplier %% 2**64 is %d" % (multiplier % (2 ** 64)))
multiplier % 2**64 is 8710297504448807696
>>> if multiplier > 0:
...     print("# of multiplier bits is %d" % math.ceil(math.log(multiplier, 2)))
...
# of multiplier bits is 125

Note that i (which is 325) is the index into the lookup table.
Note that the numbers above match the last entry (number 325) in the table:

{ 3278889188817135834u, 1424047269444608885u }, { 8710297504448807696u, 1780059086805761106u }

Also note that the multiplier will always have 125 bits, by design:

// By construction, this will have exactly POW5_BITCOUNT bits. Note that this can shift left if j is negative!
BigInteger pow5DivPow2 = pow.shiftRight(pow5len - POW5_BITCOUNT);

So, these calculations are consistent with the code and resulted in the exact same numbers that can be found in the lookup table. So, I'm fairly confident this is the correct method, at least for e2 < 0.

Here all the commands together so you can just cut and paste them all into python3:

import math
BIAS = 1023
e = 0
exp = 1 - BIAS if e == 0 else e - BIAS
MANTISSA_BITS = 52
ef = exp - MANTISSA_BITS
e2 = ef - 2
assert(e2 < 0)
print(e2)
q = math.floor( -e2 * math.log(5, 10) ) - (1 if -e2 > 1 else 0)
print(q)
B1 = 125
i = -e2 - q
print(i)
k = math.ceil( math.log( 5 ** i , 2) ) - B1
print(k)
if k > 0:
    multiplier = (5 ** i) // ( 2 ** k )
else:
    multiplier = (5 ** i) * ( 2 ** -k )

print(multiplier)
print("multiplier >> 64 is %d" % (multiplier // (2 ** 64)))
print("multiplier %% 2**64 is %d" % (multiplier % (2 ** 64)))
if multiplier > 0:
    print("# of multiplier bits is %d" % math.ceil(math.log(multiplier, 2)))


@mrmixer
Copy link

mrmixer commented Jun 27, 2023

@rick-masters Thanks for the reply. I can't say I remember anything from the paper, but I appreciate you taking the time to reply in case I ever want to look back at it. I ended up porting dragonbox into a single C file.

@Ariel-Burton
Copy link

See also issue 233.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants