Comparing numbers can take a surprising amount of time #125

BenWiederhake · 2024-03-29T00:46:53Z

I understand that comparing numbers that are very similar might take a bit of time, as it might need to read all stored digits in the worst case. But certainly there is no expensive computation involved, right?

However, calling cmp always has a large overhead in CPU and memory. Example:

#[test]
fn test_compare_extreme() {
    let teenytiny = BigDecimal::from_str("1e-9223372036854775807").unwrap();
    let one = BigDecimal::from_str("1").unwrap();
    assert!(one > teenytiny);
    assert!(teenytiny < one);
}

Note that creating teenytiny and one is very quick: The internal BigInt and wrapping BigDecimal require barely any memory at all. However, the comparisons are excruciatingly slow, because:

impl Ord for BigDecimal {
    fn cmp(&self, other: &BigDecimal) -> Ordering {
        // [… other code …]
        let tmp = self - other;
        // [… other code …]
    }
}

This subtraction effectively runs an infinite loop, even though cmp could be determined (for this input) very easily.
Implementing some kind of early-return path would be really nice.

Related: uutils/coreutils#3318

The text was updated successfully, but these errors were encountered:

akubera · 2024-03-29T02:33:26Z

Yeah that can be done.

akubera · 2024-03-29T02:50:28Z

PartialEqual has a "scale difference" issue, too:

  let scaled_int_val = &self.int_val * ten_to_the((rhs.scale - self.scale) as u64);

Where that subtraction could cause an out-of-memory error allocating 10^(terabyte-sized-number)

(and by terabyte I mean exabyte.....)

akubera · 2024-03-29T03:04:58Z

Oh no, that was the old version. It was fixed in September. I probably meant to fix impl Ord too but slipped by me.

akubera · 2024-03-29T06:42:52Z

Implementation is ready, and I included your example test. I'd like to write a few more tests to improve the coverage.

trunk...f/ImprovedOrd

It now tries to cast the big ints to u64, then u128, then creates Vec of the digits, so the amount of memory required is only dependent on number of digits, not the scale. And no allocations if the value is under 40 digits long.

akubera · 2024-03-29T07:12:58Z

I'm calling it for the night, but leaving myself a message:

example : cmp(1e-9223372036854775807, 1e+9223372036854775807) panics as the difference of exponents is out of i64 domain.

I'm not sure what to do in that case. I would guess that it's safe to assume they're not equal, as that would mean exabytes of numbers to reach equality, and go with less Less/Greater base on which scale is larger.

akubera · 2024-03-29T18:24:56Z

That was fixed with a new checked_diff(i64,i64) -> (Ordering, Option<u64>) function. If the difference between the two numbers is greater than a u64, we return None in the tuple. Since we're comparing scales (i.e. number of digits) then we're safe to assume if that number doesn't fit in a u64, we can know the ordering/equality of the numbers without needing to inspect them digit-by-digit.

akubera · 2024-03-29T18:37:01Z

Testing at the boundaries here lead to finding a similar panic in implementation of eq. That too has been fixed and I think should be faster with a non-allocating algorithm if the scale is <20 digits (max number of decimal-digits which fits in a u64).

akubera · 2024-03-29T19:16:24Z

Now the hard part is making contrived test cases which avoid the optimizations.

akubera · 2024-03-29T20:25:18Z

I'm happy with result. It's now merged into trunk in commit b34622d.

Let me know if that works for you.

BenWiederhake · 2024-03-29T20:41:50Z

Fantastic! It works great :D

maxer137 mentioned this issue Apr 11, 2024

seq 4e4000003 4e4000003 is causing an infinite loop uutils/coreutils#6182

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comparing numbers can take a surprising amount of time #125

Comparing numbers can take a surprising amount of time #125

BenWiederhake commented Mar 29, 2024

akubera commented Mar 29, 2024

akubera commented Mar 29, 2024 •

edited

akubera commented Mar 29, 2024

akubera commented Mar 29, 2024

akubera commented Mar 29, 2024

akubera commented Mar 29, 2024

akubera commented Mar 29, 2024 •

edited

akubera commented Mar 29, 2024

akubera commented Mar 29, 2024

BenWiederhake commented Mar 29, 2024

Comparing numbers can take a surprising amount of time #125

Comparing numbers can take a surprising amount of time #125

Comments

BenWiederhake commented Mar 29, 2024

akubera commented Mar 29, 2024

akubera commented Mar 29, 2024 • edited

akubera commented Mar 29, 2024

akubera commented Mar 29, 2024

akubera commented Mar 29, 2024

akubera commented Mar 29, 2024

akubera commented Mar 29, 2024 • edited

akubera commented Mar 29, 2024

akubera commented Mar 29, 2024

BenWiederhake commented Mar 29, 2024

akubera commented Mar 29, 2024 •

edited

akubera commented Mar 29, 2024 •

edited