fix: adjust inefficient regular expression #336

Merged
merged 1 commit into from
Jan 2, 2022

Conversation

Trott
Contributor

@Trott Trott commented Dec 17, 2021

What is the purpose of this pull request?

Fix a potential performance cliff in pathological case.

What changes did you make? (Give an overview)

Matching large numbers of repetitions of [ can take a minute or more in the
current code. This change gets it back down into the milliseconds as
expected.

Instead of matching "open bracket followed by whatever and then a close bracket", this change has the regexp ignore repeated open brackets to avoid polynomial backtracking.
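A rough way to see the difference, assuming a runtime with a backtracking regex engine such as Node.js: the two patterns below are copied from this pull request's diff, while `timeTest`, `REPEAT`, and `pathological` are hypothetical names used only for this sketch. With the old pattern, every `[` is a possible start and `.*` scans to the end of the string before failing, so the work grows roughly quadratically with input length. With the new pattern, `[^[]*` stops at the next `[`, so each start position fails almost immediately.

```typescript
// Old and new patterns, copied from the diff in this pull request.
const OLD_RE = /\[.*]/;
const NEW_RE = /\[[^[]*]/;

// Hypothetical helper: run `re.test(input)` once and report result and elapsed time.
function timeTest(re: RegExp, input: string): { matched: boolean; ms: number } {
  const start = Date.now();
  const matched = re.test(input);
  return { matched, ms: Date.now() - start };
}

// Pathological input: many open brackets and no closing bracket.
// REPEAT is an arbitrary size chosen for illustration; raise it to widen the gap.
const REPEAT = 10000;
const pathological = '['.repeat(REPEAT);

const oldResult = timeTest(OLD_RE, pathological);
const newResult = timeTest(NEW_RE, pathological);

// Neither pattern matches; only the time spent reaching that answer differs.
console.log(`old: matched=${oldResult.matched}, ${oldResult.ms} ms`);
console.log(`new: matched=${newResult.matched}, ${newResult.ms} ms`);
```

Actual timings depend on the engine and machine, so none are quoted here.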

@mrmlnc mrmlnc self-assigned this Dec 17, 2021
@XhmikosR
Contributor

For what it's worth, I suggest adding https://lgtm.com/projects/g/mrmlnc/fast-glob?mode=list or CodeQL (the successor of LGTM).

@Trott
Contributor Author

Trott commented Dec 29, 2021

> For what it's worth, I suggest adding https://lgtm.com/projects/g/mrmlnc/fast-glob?mode=list or CodeQL (the successor of LGTM).

I use LGTM and my plan was to fix those other two after this one lands. I figured changing one regular expression at a time is easier for people to review and feel comfortable landing. Changing a whole bunch at once can be challenging to review/test sometimes.

Aside from the tools you mention, @meekdenzo is working (when he has time) on creating a GitHub Action that uses https://github.com/davisjam/vuln-regex-detector to detect these things in pull requests. Adding that as part of a GitHub Action test suite might be more effective than CodeQL. I have CodeQL on many repositories and it has never flagged anything (or if it does flag stuff, it does it in a way I don't notice).

@XhmikosR
Contributor

XhmikosR commented Dec 29, 2021 via email

@XhmikosR
Contributor

XhmikosR commented Dec 30, 2021

All right, I just tested this quickly and it correctly flags the same stuff LGTM does (which was expected):

[image]

I'd say it's definitely worth adding CodeQL later to the repo.

EDIT: I opened #338

@XhmikosR XhmikosR mentioned this pull request Dec 30, 2021
@@ -9,7 +9,7 @@ const GLOBSTAR = '**';
const ESCAPE_SYMBOL = '\\';

const COMMON_GLOB_SYMBOLS_RE = /[*?]|^!/;
-const REGEX_CHARACTER_CLASS_SYMBOLS_RE = /\[.*]/;
+const REGEX_CHARACTER_CLASS_SYMBOLS_RE = /\[[^[]*]/;
Contributor Author

One thing to test is to make sure this handles [ inside of brackets as expected. If provided '[[]', the current regex matches '[[]' (the whole string), but the regex proposed here only matches the last two characters ('[]'). I don't know if one or the other would be considered a bug, or if '[[]' is malformed and the behavior there is undefined.
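The difference described above can be checked directly. This is a small sketch using the two patterns from the diff; the variable names are made up for illustration.

```typescript
// Patterns copied from the diff: before and after this change.
const oldBracketRe = /\[.*]/;
const newBracketRe = /\[[^[]*]/;

const nested = '[[]';

const oldMatch = oldBracketRe.exec(nested);
const newMatch = newBracketRe.exec(nested);

// The old pattern matches the whole string, starting at index 0.
console.log(oldMatch ? oldMatch[0] : null, oldMatch ? oldMatch.index : -1); // [[] 0

// The new pattern cannot cross the second '[', so it matches '[]' at index 1.
console.log(newMatch ? newMatch[0] : null, newMatch ? newMatch.index : -1); // [] 1
```

Both patterns still report a match for `'[[]'`; only the matched span differs.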

Owner

The main goal of this regex is to say that the pattern contains character classes ([1-5], [:alnum:], …). Given that, I think it is enough here to check that something in the pattern is enclosed in square brackets. We don't need to check that the pattern is correct (syntax and nesting). Pattern validation is a separate layer delegated to the micromatch package. So your proposed solution is correct.
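Since the regex is only used as a yes/no check, what matters is whether the two patterns agree on that answer. The sketch below compares them on a few sample patterns. `hasBracketPair` and the sample list are hypothetical and are not part of fast-glob's API. For single-line input the two patterns should always agree: if some `[` is followed later by a `]`, then the last `[` before that `]` has no other `[` in between, so the new pattern matches as well.

```typescript
// Patterns copied from the diff: before and after this change.
const BEFORE_RE = /\[.*]/;
const AFTER_RE = /\[[^[]*]/;

// Hypothetical helper (not fast-glob's API): does `pattern` contain
// something enclosed in square brackets, according to `re`?
function hasBracketPair(pattern: string, re: RegExp): boolean {
  return re.test(pattern);
}

// Sample patterns chosen for illustration only.
const samples = ['[1-5]', '[[:alnum:]]', '[[]', 'abc', '[abc', ']abc['];

const rows = samples.map((pattern) => ({
  pattern,
  before: hasBracketPair(pattern, BEFORE_RE),
  after: hasBracketPair(pattern, AFTER_RE),
}));

for (const row of rows) {
  console.log(`${row.pattern}: before=${row.before}, after=${row.after}`);
}
```

The first three samples are reported as containing a bracket pair by both patterns, and the last three are rejected by both.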

@mrmlnc mrmlnc added this to the 3.2.8 milestone Jan 2, 2022
@mrmlnc
Owner

mrmlnc commented Jan 2, 2022

Looks good to me. Thanks for contributing 🎉

@mrmlnc mrmlnc merged commit 92b68d9 into mrmlnc:master Jan 2, 2022
@Trott Trott deleted the dere branch January 2, 2022 19:28
@XhmikosR
Contributor

XhmikosR commented Jan 3, 2022

It probably makes sense to try to fix the other issues too and

  1. issue advisories
  2. maybe backport them to 2.x

@mrmlnc
Owner

mrmlnc commented Jan 3, 2022

I don't think this is a real security issue, or that it needs to be backported to the previous major version.
