New "pattern can match invalid UTF-8" error in 1.8.0 #989

Closed
jplatte opened this issue Apr 24, 2023 · 1 comment

jplatte commented Apr 24, 2023

What version of regex are you using?

1.8.1

Describe the bug at a high level.

In regex <1.8.0, it was possible to create a regex containing \W inside of a (?-u:) group. This no longer works and makes regex construction fail instead. Looks like we should have been using regex::bytes::Regex the whole time, but it hasn't been a problem so far and we were only searching within strs anyways.

What are the steps to reproduce the behavior?

# Cargo.toml
[package]
name = "regex-bug"
version = "0.0.1"
edition = "2021"

[dependencies]
regex = "1.8"

// src/main.rs
use regex::Regex;

fn main() {
    Regex::new(r"(?-u:\W)").unwrap();
}

What is the actual behavior?

Crashes: Regex::new returns an error, so the unwrap() panics.

What is the expected behavior?

Doesn't crash. (not because that's more sensible, but because it has been possible for a very long time and people might be relying on it accidentally, like we did in Ruma)

BurntSushi (Member) commented

Thanks for the report. This change was an explicit bug fix. See #895. It's mentioned in the CHANGELOG.

The previous behavior was incorrect, and it could result in runtime panics due to match spans that split a codepoint boundary. So it never really worked.
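
To illustrate why a span that splits a codepoint is a problem for `str`, here is a small standard-library-only sketch. The span `0..1` is a hypothetical match result chosen for illustration, not output captured from the regex crate, and `span_is_valid` is a made-up helper name.

```rust
/// Hypothetical helper: true if both ends of `start..end` fall on UTF-8
/// character boundaries of `s`.
fn span_is_valid(s: &str, start: usize, end: usize) -> bool {
    start <= end && s.is_char_boundary(start) && s.is_char_boundary(end)
}

fn main() {
    // U+00E9 is encoded in UTF-8 as the two bytes 0xC3 0xA9.
    let s = "\u{e9}";
    assert_eq!(s.len(), 2);

    // A one-byte span at offset 0 ends in the middle of the codepoint.
    println!("0..1 valid: {}", span_is_valid(s, 0, 1));
    println!("0..2 valid: {}", span_is_valid(s, 0, 2));

    // The checked accessor returns None for the split span...
    assert!(s.get(0..1).is_none());
    // ...whereas indexing with `&s[0..1]` would panic at runtime.
}
```

Any API that hands back byte offsets for a `&str` haystack has to guarantee that those offsets pass a check like this one, otherwise slicing with them panics.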

BurntSushi closed this as not planned on Apr 24, 2023