seq 4e4000003 4e4000003 is causing an infinite loop #6182

Open
sylvestre opened this issue Apr 2, 2024 · 9 comments
@sylvestre
sylvestre commented Apr 2, 2024

Looking at the OSS-Fuzz coverage for seq, I noticed that it wasn't producing much.
While writing a fuzzer for seq's number parsing, I noticed that:
cargo run seq 4e4000003 4e4000003
runs forever,

while GNU does:

$ LANG=C /usr/bin/seq 4e4000003 4e4000003
/usr/bin/seq: invalid floating point argument: '4e4000003'
@sylvestre

Identified with #6183

@maxer137

maxer137 commented Apr 3, 2024

It appears that this might be an issue with the BigDecimal crate.
When parsing the argument, the string 4e4000003 is converted into a string consisting of a four followed by 4000003 zeroes.
Execution never makes it past numbers[0].parse

@sylvestre

yes, it is ;)
you should write a test case and report a bug in the crate

@maxer137

maxer137 commented Apr 3, 2024

But it seems like uutils/coreutils does the string conversion itself. For some reason, we're turning 4e4000003 into a string with a four followed by 4000003 zeroes. The issue still ends up being the BigDecimal crate not being able to handle that string, but the string we generate takes about 4 MB of memory just to parse one number. That seems excessive.
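To make the size concern concrete, here is a minimal sketch (not the actual uutils code) that contrasts a naive zero-padded expansion with keeping the digits and the exponent apart. The names `expand_naively`, `SciNumber`, and `parse_compact` are hypothetical and exist only for this illustration; only unsigned literals of the form `<digits>e<exp>` are handled.

```rust
/// Hypothetical helper: expand "<digits>e<exp>" into a plain digit string
/// by appending `exponent` zeroes, similar in spirit to the zero padding
/// described above.
fn expand_naively(mantissa: &str, exponent: usize) -> String {
    let mut s = String::with_capacity(mantissa.len() + exponent);
    s.push_str(mantissa);
    s.extend(std::iter::repeat('0').take(exponent));
    s
}

/// Hypothetical compact form: the digits and the power of ten are stored
/// separately, so the size does not depend on the exponent's magnitude.
#[derive(Debug, PartialEq)]
struct SciNumber {
    digits: String,
    exponent: i64,
}

/// Parse an unsigned "<digits>e<exp>" literal without expanding it.
fn parse_compact(input: &str) -> Option<SciNumber> {
    let (mantissa, exp) = input.split_once(|c: char| c == 'e' || c == 'E')?;
    if mantissa.is_empty() || !mantissa.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let exponent = exp.parse::<i64>().ok()?;
    Some(SciNumber {
        digits: mantissa.to_string(),
        exponent,
    })
}

fn main() {
    // One digit plus 4000003 zeroes: prints 4000004.
    let expanded = expand_naively("4", 4_000_003);
    println!("expanded length: {} bytes", expanded.len());

    let compact = parse_compact("4e4000003").unwrap();
    println!("compact form: {:?}", compact);
}
```

The expanded string is 4000004 bytes long, while the compact form needs a single digit and one integer.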

@sylvestre

yeah, maybe we are doing something wrong but bigdecimal might want to reject it directly

@sylvestre

@maxer137 did you have a chance to look into this a bit more?

@Carbrex

Carbrex commented Apr 11, 2024

I tried running this command

$ time cargo run seq 4e4000003 4e4000003 > out.txt
   Compiling coreutils v0.0.26 (/home/carbrex/uutils/coreutils)
    Finished dev [unoptimized + debuginfo] target(s) in 6.31s
     Running `target/debug/coreutils seq 4e4000003 4e4000003`

real    15.59s
user    14.28s
sys     1.25s
cpu     99%

This is the output. So it isn't strictly an infinite loop, but it takes a long time to run. It is still a deviation from GNU, though.

@Carbrex

Carbrex commented Apr 11, 2024

Upon further investigation, I found that seq supports floating-point values only up to the range of f128 (a 128-bit floating-point number). For example, seq 11e4931 11e4931 works but seq 12e4931 12e4931 throws an error.
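The two examples sit on either side of roughly 1.1897e4932, which is the largest finite value of IEEE 754 binary128 (the x87 80-bit extended format shares the same exponent range). The sketch below checks an unsigned `<digits>e<exp>` literal against that limit without building a big number. `fits_in_extended_range` and both constants are hypothetical names, the limit is truncated to 18 digits so the answer is only approximate right at the boundary, and underflow is ignored.

```rust
use std::cmp::Ordering;

// Assumption: leading digits of the largest finite value,
// about 1.18973149535723176e4932, truncated to 18 digits.
const MAX_LEADING_DIGITS: &str = "118973149535723176";
const MAX_DECIMAL_EXPONENT: i64 = 4932;

/// Hypothetical range check for an unsigned "<digits>e<exp>" literal.
/// Returns None if the literal cannot be parsed.
fn fits_in_extended_range(input: &str) -> Option<bool> {
    let (mantissa, exp) = input.split_once(|c: char| c == 'e' || c == 'E')?;
    if mantissa.is_empty() || !mantissa.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let exponent: i64 = exp.parse().ok()?;

    let digits = mantissa.trim_start_matches('0');
    if digits.is_empty() {
        return Some(true); // zero always fits
    }

    // Exponent of the leading digit when written as d.ddd x 10^normalized.
    let normalized = exponent.checked_add(digits.len() as i64 - 1)?;

    Some(match normalized.cmp(&MAX_DECIMAL_EXPONENT) {
        Ordering::Less => true,
        Ordering::Greater => false,
        Ordering::Equal => {
            // Same magnitude: compare the leading digits, padding with
            // zeroes so both strings have the same length.
            let n = MAX_LEADING_DIGITS.len();
            let mut padded: String = digits.chars().take(n).collect();
            while padded.len() < n {
                padded.push('0');
            }
            padded.as_str() <= MAX_LEADING_DIGITS
        }
    })
}

fn main() {
    for arg in ["11e4931", "12e4931", "4e4000003"] {
        println!("{arg}: {:?}", fits_in_extended_range(arg));
    }
}
```

With these assumptions, 11e4931 is reported as in range, while 12e4931 and 4e4000003 are reported as out of range.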

@maxer137

Indeed. With BigDecimal we can handle much larger values than GNU seq does.

I have removed the zero padding from parse_decimal_and_exponent and parse_decimal_no_exponent in #6185

Very large numbers such as 4e4000003 still seem to work after this change, but they are very slow.
This seems to be because we compare two very large BigDecimal numbers.
Looking at the bigdecimal crate, they are aware of this, as shown in this issue

We have two options: parse the value into an f128 and accept or reject it depending on whether it is a valid value, or accept that the uutils implementation can go beyond the range of GNU's implementation.
The latter would deviate from GNU, but I feel there is a case to be made for extending the range that seq supports.
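If the extended range is kept, one way to keep comparisons cheap is to order values by magnitude first and only then by digits, so nothing is rescaled or padded. This is a minimal sketch of that idea for non-negative values, not how the bigdecimal crate compares numbers; `Sci` and `compare` are hypothetical names, and the exponent is assumed to be small enough that adding the digit count does not overflow an i64.

```rust
use std::cmp::Ordering;

/// Hypothetical representation of a non-negative number: digits x 10^exponent.
struct Sci<'a> {
    digits: &'a str,
    exponent: i64,
}

/// Compare two values without expanding them into full digit strings.
fn compare(a: &Sci, b: &Sci) -> Ordering {
    let da = a.digits.trim_start_matches('0');
    let db = b.digits.trim_start_matches('0');

    // Zero has no leading digit, so handle it before normalizing.
    match (da.is_empty(), db.is_empty()) {
        (true, true) => return Ordering::Equal,
        (true, false) => return Ordering::Less,
        (false, true) => return Ordering::Greater,
        (false, false) => {}
    }

    // Exponent of the leading digit (d.ddd x 10^n) for each value.
    let na = a.exponent + da.len() as i64 - 1;
    let nb = b.exponent + db.len() as i64 - 1;

    na.cmp(&nb).then_with(|| {
        // Same magnitude: compare digit by digit, treating missing
        // trailing digits as zeroes.
        let len = da.len().max(db.len());
        let pa = da.bytes().chain(std::iter::repeat(b'0')).take(len);
        let pb = db.bytes().chain(std::iter::repeat(b'0')).take(len);
        pa.cmp(pb)
    })
}

fn main() {
    let first = Sci { digits: "4", exponent: 4_000_003 };
    let last = Sci { digits: "4", exponent: 4_000_003 };
    println!("{:?}", compare(&first, &last));
}
```

The cost here depends on the number of digits that were actually written, not on the size of the exponent.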

sylvestre added a commit to sylvestre/coreutils that referenced this issue Apr 14, 2024
sylvestre added a commit that referenced this issue Apr 20, 2024