Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: remove performance cliff for regex #342

Merged
merged 1 commit into from
Jan 4, 2022
Merged

fix: remove performance cliff for regex #342

merged 1 commit into from
Jan 4, 2022

Conversation

Trott
Copy link
Contributor

@Trott Trott commented Jan 3, 2022

What is the purpose of this pull request?

Fixes a regex-backtracking performance cliff for some pathological cases.

What changes did you make? (Give an overview)

Replaced .* in two places with matching lists [^(] and [^|]. Wrote tests to confirm performance cliffs in current code and their elimination with this change.

@Trott Trott marked this pull request as draft January 3, 2022 17:24
@Trott
Copy link
Contributor Author

Trott commented Jan 3, 2022

Converted to draft because there are some weird edge cases that will have their results changed with what is currently here. For example, '(!(|))' matches the previous regex but not the new one. Should be pretty easy to fix I hope/think.

@Trott
Copy link
Contributor Author

Trott commented Jan 3, 2022

Converted to draft because there are some weird edge cases that will have their results changed with what is currently here. For example, '(!(|))' matches the previous regex but not the new one. Should be pretty easy to fix I hope/think.

Hmmm...actually, it looks like it's fine because all the ones that no longer match will still match GLOB_EXTENSION_SYMBOLS_RE so there's no change in behavior as far as the end user is concerned.

There's probably an opportunity to simplify/consolidate those regexes. If it is done carefully, it might be possible to do it in a way that positively impacts performance.

@Trott Trott marked this pull request as ready for review January 3, 2022 17:45
@mrmlnc mrmlnc self-requested a review January 4, 2022 17:19
@mrmlnc mrmlnc self-assigned this Jan 4, 2022
@mrmlnc mrmlnc added this to the 3.2.8 milestone Jan 4, 2022
Copy link
Owner

@mrmlnc mrmlnc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Right now we can skip optimizations of these regular expressions, since they are used at the earliest stage of analysis patterns: managers/tasks to separate patterns by type (static/dynamic) and providers/matchers/matcher to determinate the dynamic segment of the patterns.

@mrmlnc
Copy link
Owner

mrmlnc commented Jan 4, 2022

Thanks again, @Trott 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants