Array out-of-bounds error #993

Closed
xizheyin opened this issue May 18, 2023 · 2 comments

Comments

@xizheyin

What version of regex are you using?

The version is 1.8.1, the latest version.

Describe the bug at a high level.

Calling the function "regex::bytes::Regex::shortest_match_at" panics with an out-of-bounds error if the parameter "start" is greater than the length of the parameter "text".

What are the steps to reproduce the behavior?

fn main() {
    let pattern = "xxx";
    let text = "This is a bug".as_bytes(); // 13 bytes long
    let start = 20; // past the end of `text`

    // Build the regex with the bytes API.
    let re = regex::bytes::RegexBuilder::new(pattern).build().unwrap();
    // Panics because `start` is greater than `text.len()`.
    let _ = re.shortest_match_at(text, start);
}

What is the actual behavior?

Running the code above produces the following panic:

thread 'main' panicked at 'range start index 20 out of range for slice of length 13', /home/yxz/.cargo/registry/src/mirrors.ustc.edu.cn-61ef6e0cd06fb9b8/regex-1.8.1/src/exec.rs:753:28

The specific error is on line 753 of "exec.rs".

lits.find(&text[start..]).map(|(s, e)| (start + s, start + e))

Evaluating "&text[start..]" panics because "text" has length 13 while "start" is 20, so the start of the range is out of bounds.
Here are the results of the backtrace analysis:
[image: backtrace]
To be robust, a program should not terminate unexpectedly. Following the error message, we found that nothing in the call chain restricts the parameter "start", which leads to the out-of-bounds panic in the function "find_literal".
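To illustrate the slicing behavior behind the panic, here is a minimal standard-library sketch. The function `find_from` is a hypothetical stand-in for the literal search, not the regex crate's code; it only mirrors the `(start + s, start + e)` offset arithmetic from the quoted line and uses `slice::get` so that an out-of-range `start` yields `None` instead of a panic.

```rust
/// Hypothetical stand-in for a literal search: looks for `needle` in
/// `text[start..]` and returns offsets relative to the whole of `text`.
fn find_from(text: &[u8], needle: &[u8], start: usize) -> Option<(usize, usize)> {
    // `get` returns None when `start > text.len()`, where `&text[start..]`
    // would panic. `start == text.len()` is valid and gives an empty tail.
    let tail = text.get(start..)?;
    if needle.is_empty() {
        // `windows(0)` panics, so treat the empty needle separately.
        return Some((start, start));
    }
    tail.windows(needle.len())
        .position(|w| w == needle)
        .map(|s| (start + s, start + s + needle.len()))
}

fn main() {
    let text = "This is a bug".as_bytes();
    assert_eq!(find_from(text, b"bug", 0), Some((10, 13)));
    // An offset equal to the length is in range: the tail is simply empty.
    assert_eq!(find_from(text, b"xxx", 13), None);
    // An offset past the end is reported as None rather than panicking.
    assert_eq!(find_from(text, b"xxx", 20), None);
}
```

Whether an out-of-range offset should return `None` or panic is a design choice; this sketch only shows where the boundary lies.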

@xizheyin
Author

Similarly, the function takes different branches depending on its arguments, so an unrestricted "start" parameter can also trigger a panic in other places.

For example, with this code:

fn main() {
    let pattern = "\\d+";
    let text = "This is a bug".as_bytes(); // 13 bytes long
    let start = 15; // past the end of `text`

    // Build the regex with the bytes API.
    let re = regex::bytes::RegexBuilder::new(pattern).build().unwrap();
    // Panics because `start` is greater than `text.len()`.
    let _ = re.shortest_match_at(text, start);
}

The error message:

thread 'main' panicked at 'index out of bounds: the len is 13 but the index is 14', /home/yxz/.cargo/registry/src/mirrors.ustc.edu.cn-61ef6e0cd06fb9b8/regex-1.8.1/src/dfa.rs:1415:45

@BurntSushi
Member

Duplicate of #738 and #972.

Yes, the docs need to be improved. But passing an invalid offset is indeed supposed to panic. Just like &slice[i..] panics when i > slice.len().
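Since an invalid offset is expected to panic, a caller who cannot guarantee `start <= text.len()` can check it before searching. The following is a minimal sketch of such a caller-side guard; `search_at` and `OffsetOutOfRange` are hypothetical names, not part of the regex crate, and the search itself is passed in as a closure so the example needs only the standard library.

```rust
/// Hypothetical error describing an offset past the end of the haystack.
#[derive(Debug, PartialEq)]
struct OffsetOutOfRange {
    start: usize,
    len: usize,
}

/// Hypothetical wrapper: validates `start` against `text.len()` and only
/// then runs `search`, so the search never sees an invalid offset.
fn search_at<T>(
    text: &[u8],
    start: usize,
    search: impl FnOnce(&[u8], usize) -> T,
) -> Result<T, OffsetOutOfRange> {
    if start > text.len() {
        return Err(OffsetOutOfRange { start, len: text.len() });
    }
    Ok(search(text, start))
}

fn main() {
    let text = "This is a bug".as_bytes();
    // Stand-in search: end offset of the first b'b' at or after `s`.
    // With the regex crate, the closure could instead be
    // `|t, s| re.shortest_match_at(t, s)`.
    let search = |t: &[u8], s: usize| t[s..].iter().position(|&b| b == b'b').map(|i| s + i + 1);

    assert_eq!(search_at(text, 0, search), Ok(Some(11)));
    assert_eq!(
        search_at(text, 20, search),
        Err(OffsetOutOfRange { start: 20, len: 13 })
    );
}
```

Returning a `Result` keeps the "invalid offset" case distinct from "no match", which a plain `Option` would conflate.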

@BurntSushi closed this as not planned on May 18, 2023