feat: Fixer for missing unicode flag in no-misleading-character-class #15279

mathiasvr · 2021-11-09T09:02:27Z

Prerequisites checklist

I have read the contributing guidelines.

What is the purpose of this pull request? (put an "X" next to an item)

[ ] Documentation update
[ ] Bug fix (template)
[ ] New rule (template)
[ ] Changes an existing rule (template)
[x] Add autofixing to a rule
[ ] Add a CLI option
[ ] Add something to the core
[ ] Other, please explain:

What changes did you make? (Give an overview)

Added autofixing for a missing Unicode "u" flag when a regex contains surrogate pair characters.

Is there anything you'd like reviewers to focus on?

eslint-github-bot · 2021-11-09T09:02:30Z

Hi @mathiasvr!, thanks for the Pull Request

The pull request title isn't properly formatted. We ask that you update the message to match this format, as we use it to generate changelogs and automate releases.

The length of the commit message must be less than or equal to 72

Read more about contributing to ESLint here

nzakas · 2021-11-20T01:41:55Z

Thanks for the PR. I’m not sure this fix is safe. In general, we don’t do fixes that change how code works because it can be difficult for end users to identify if it creates a problem. In this case, I fear that the change may fall into that category.

@eslint/eslint-tsc what do others think?

btmills · 2021-11-20T23:42:53Z

I agree. People often run autofix on save or as a commit hook, so we want to be extra careful about fixers. This seems like a good fit for a suggested fix.

mathiasvr · 2021-11-21T09:15:34Z

Auto fixing code as a commit hook sounds like a terrible idea since you generally want your commits to be in the state you committed them.

nzakas · 2021-11-25T01:04:02Z

Good point @btmills. @mathiasvr can you change this to a suggestion?

mathiasvr · 2021-11-25T01:14:22Z

@nzakas Do I just change the type to suggestion?

eslint/lib/rules/no-misleading-character-class.js

Line 103 in 36396ba

type: "problem",

Edit: Ah, I think I understand, change the fix to suggest.

nzakas · 2021-11-25T01:23:50Z

Exactly, this: https://eslint.org/docs/developer-guide/working-with-rules#providing-suggestions
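For readers unfamiliar with the suggestions API, here is a minimal sketch of what moving a fix into `suggest` might look like. It is a hypothetical, heavily simplified rule, not the actual no-misleading-character-class implementation: the `hasSurrogatePairInClass` helper and the message texts are assumptions made for illustration, and only the message IDs are taken from this thread.

```javascript
"use strict";

// Hypothetical helper (assumption): detects a surrogate pair inside a
// character class. The real rule performs a far more thorough analysis.
function hasSurrogatePairInClass(pattern) {
    return /\[[^\]]*[\uD800-\uDBFF][\uDC00-\uDFFF][^\]]*\]/.test(pattern);
}

const rule = {
    meta: {
        type: "problem",
        hasSuggestions: true, // rules that provide suggestions must declare this
        schema: [],
        messages: {
            // Message texts are illustrative assumptions.
            surrogatePairWithoutUFlag: "Unexpected surrogate pair in character class. Use the 'u' flag.",
            suggestUnicodeFlag: "Add the Unicode 'u' flag to the regex."
        }
    },
    create(context) {
        return {
            // Only regex literals are handled in this sketch.
            "Literal[regex]"(node) {
                const { pattern, flags } = node.regex;

                if (flags.includes("u") || !hasSurrogatePairInClass(pattern)) {
                    return;
                }
                context.report({
                    node,
                    messageId: "surrogatePairWithoutUFlag",
                    // No top-level `fix`: the change is offered, never applied automatically.
                    suggest: [{
                        messageId: "suggestUnicodeFlag",
                        fix: fixer => fixer.insertTextAfter(node, "u")
                    }]
                });
            }
        };
    }
};
```

Because the fixer lives under `suggest`, running ESLint with `--fix` leaves the regex untouched and the change is only applied when a user picks it in an editor.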

mathiasvr · 2021-11-25T01:51:04Z

Okay I changed it to a suggestion. Just wondering, why would you enable this rule and not want an auto fix but only a suggestion?

btmills

Implementation looks good! Can you add tests that verify the suggestion output is correct? For example:

{
    code: "var r = /[👍]/",
    errors: [{
        messageId: "surrogatePairWithoutUFlag",
        suggestions: [{ messageId: "suggestUnicodeFlag", output: "var r = /[👍]/u" }]
    }]
},

why would you enable this rule and not want an auto fix but only a suggestion?

For cases like this that would alter behavior, we want to bring them to the developer's attention as a suggestion instead of invisibly making the change in autofix, even if we're pretty sure the change is the one they want.

btmills

Looks good! I like how you also added tests where there was already a non-u flag to make sure adding the Unicode flag didn’t clobber the existing flag(s).

nzakas

LGTM. Thanks!

mdjermanovic

I have two concerns about this change.

First, adding the u flag can produce syntactically invalid regular expressions. For example, this is valid without the flag:

var r = /[👍]\a/;

When the user applies the suggestion, it will cause a parsing error because \a isn't allowed with the u flag:

var r = /[👍]\a/u; // Parsing error: Invalid regular expression: /[👍]\a/: Invalid escape

Parsing errors will be noticeable right away, but in the case of RegExp constructors, users might become aware of the errors caused by suggestions from this rule only when they run the code.

var r = new RegExp("[👍]\\a", "u"); // no parsing errors, but this will throw at runtime

Second, suggestions are allowed to change the behavior of the code, but that should generally be limited to the reported problem. Consider the following example:

// K is U+212A

var regex = /^(?:[👍]|\W!)$/i;

console.log(
    regex.test("👍") // false
);

console.log(
    regex.test("K!") // true
);

Adding the u flag fixes the problem with [👍], but since the flag applies to the whole regular expression, it also changes the behavior of \W, which isn't related to the reported problem:

// K is U+212A

var regex = /^(?:[👍]|\W!)$/iu;

console.log(
    regex.test("👍") // true
);

console.log(
    regex.test("K!") // false!
);

For that reason, I think that suggestions that add the u flag would be more suitable for the require-unicode-regexp rule, as proposed in #15089.

mathiasvr · 2021-12-02T17:14:00Z

@mdjermanovic I don't see a problem with allowing a suggestion to add the flag, as changes might be necessary in either case. If you want to allow this behaviour, the rule should not be enabled. If this rule doesn't get a suggestion, I think the require-unicode-regexp rule should also not have it since it will require manual fixes much more often.

mdjermanovic · 2021-12-02T22:30:36Z

I don't see a problem with allowing a suggestion to add the flag, as changes might be necessary in either case. If you want to allow this behaviour, the rule should not be enabled.

My concern about this suggestion is its scope, because it adds a flag that applies to the whole regular expression, not just to the reported character class. The flag correctly fixes a specific problem reported by this rule, but it can also have side effects on the rest of the pattern, so it could introduce new problems that the user may not be aware of.

I think the require-unicode-regexp rule should also not have it since it will require manual fixes much more often.

That's a good point, maybe we shouldn't provide suggestions in that rule either, except when we're sure that the suggested fix doesn't require any further manual fixes (i.e., the behavior with and without the flag is same).

nzakas · 2021-12-03T01:27:55Z

My concern about this suggestion is its scope, because it adds a flag that applies to the whole regular expression, not just to the reported character class. The flag correctly fixes a specific problem reported by this rule, but it can also have side effects on the rest of the pattern, so it could introduce new problems that the user may not be aware of.

I’m not sure this is significant enough to change the path forward here. Suggestions were created specifically for situations where the fixes will potentially cause a change in behavior. In this case, we can’t limit the scope of the change, but it seems like the suggestion is still valid.

mdjermanovic · 2021-12-03T14:59:14Z

Suggestions were created specifically for situations where the fixes will potentially cause a change in behavior. In this case, we can’t limit the scope of the change, but it seems like the suggestion is still valid.

This fix is way beyond the scope of the reported problem. The suggestion message should at least indicate that the fix may have side effects on behavior that is unrelated to the reported problem, and also that it can affect the validity of the regular expression (assuming that we're not checking if the regex will be valid with the u flag, as currently implemented).

nzakas · 2021-12-15T01:46:11Z

It seems like we aren’t quite connecting on this. All suggestions have possible side effects, so it doesn’t make sense to mention that in a suggestion.

It sounds like you might be saying that the fix could create an invalid regular expression? Can you give an example of that so we can discuss?

@btmills you also approved this PR. What do you think?

mdjermanovic · 2021-12-15T22:39:04Z

It seems like we aren’t quite connecting on this. All suggestions have possible side effects, so it doesn’t make sense to mention that in a suggestion.

By side effects, I mean changes in behavior not related to the reported problem. I wouldn't expect that from a suggestion.

Consider this example: the user wants a regex that matches "👍" or any string of length 3.

const regex = /^(?:[👍]|.{3})$/;

regex.test("👍"); // false. This is a bug, the problem is [👍]

regex.test("abc"); // true

regex.test("👶👶👶"); // false

After the suggestion is applied:

const regex = /^(?:[👍]|.{3})$/u;

regex.test("👍"); // true, the bug is fixed

regex.test("abc"); // true

regex.test("👶👶👶"); // true! This is a side effect, .{3} now works differently

It sounds like you might be saying that the fix could create an invalid regular expression? Can you give an example of that so we can discuss?

I left one example in #15279 (review), here's a similar one:

/[👍]\R/; // syntactically valid

/[👍]\R/u; // syntax error

Additionally, the second version could become valid in the future (proposal-regexp-r-escape), but with completely different semantics than the original one.

mathiasvr · 2021-12-15T22:54:08Z

@mdjermanovic I think this is why we now do it as a suggestion and not an automatic fix, so the side effects don't happen without explicitly applying it.

If a user enables this rule they should also want all the side effects, since they will ultimately have to disable it if not, no matter whether it is auto-fixable, a suggestion, or just a warning.

mdjermanovic · 2021-12-16T00:31:51Z

If a user enables this rule they should also want all the side effects, since they will ultimately have to disable it if not, no matter whether it is auto-fixable, a suggestion, or just a warning.

Not necessarily. There are other ways to fix the problem, for example [👍abc] -> (?:👍|[abc]).
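A short sketch of that alternative, assuming plain JavaScript without the u flag; the variable names are illustrative only.

```javascript
// Without the u flag, "👍" inside a class is treated as two separate
// surrogate code units, so the class can never match the whole emoji.
const misleading = /^[👍abc]$/;

// Moving the emoji out of the class turns it into a two-code-unit sequence
// inside an alternation, which matches as intended even without the u flag.
const rewritten = /^(?:👍|[abc])$/;

console.log(misleading.test("👍")); // false
console.log(rewritten.test("👍"));  // true
console.log(rewritten.test("a"));   // true
console.log(rewritten.test("d"));   // false
```

The rest of the pattern keeps its non-Unicode semantics, which is the property being argued for here.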

I think this is why we now do it as a suggestion and not an automatic fix, so the side effects don't happen without explicitly applying it

Yes, but there's no notice about side effects. In the /^(?:[👍]|.{3})$/ example, the reported problem is that [👍] doesn't match with "👍". After the suggestion is applied, matching "👍" is the expected change in the behavior (not a side effect). All other changes, like matching 6 string characters with .{3}, may not be expected and the user may not be aware of them (those are side effects).

mathiasvr · 2021-12-16T01:16:14Z

The problem of not matching the emoji cannot be fixed without the u flag right? We agree that the suggestion of adding that flag may not be enough to fix the whole regex, but how will you avoid a lint error without disabling the rule or adding the flag?
I mean if the user wants the rule enabled they have to fix the regex somehow right?

nzakas · 2021-12-16T01:26:36Z

I don’t see the introduction of side effects as a problem here. As already discussed, suggestions are allowed to have side effects and it’s up to the end user to verify that their code still works. Autofixes cannot have side effects.

If there is a more narrow fix to suggest, we should also suggest that. Part of the benefit of suggestions is that we can specify multiple options.

We definitely shouldn’t suggest something that is syntactically invalid. I was under the impression that the earlier comment had already been addressed. If that’s not the case, we should fix that.

nzakas · 2021-12-16T01:31:41Z

TSC Summary: This PR seeks to add suggestions to no-misleading-character-class, specifically adding the “u” flag if it would result in a valid regular expression. Both nzakas and btmills have approved the PR, mdjermanovic has concerns about the side effects of the suggestions. We do allow side effects in suggestions in core rules.

TSC Question: Do we want to change our expectations around suggestions? (https://eslint.org/docs/developer-guide/working-with-rules#providing-suggestions) Can we merge this?

mdjermanovic · 2021-12-16T17:12:35Z

The problem of not matching the emoji cannot be fixed without the u flag right?

It can be fixed by moving the character out of the class.

const regex = /^(?:👍|.{3})$/;

regex.test("👍"); // true, the bug is fixed

regex.test("abc"); // true

regex.test("👶👶👶"); // still false

This fix changes behavior related to the reported problem, but without side effects on things that are not considered problems by this rule.

There is a similar plugin rule regexp/no-misleading-unicode-character and its opt-in autofix works this way, here are test cases.

mdjermanovic · 2021-12-20T14:12:36Z

Per 16-December-2021 TSC Meeting, this PR is accepted. The conclusion is that suggestions are allowed to change the behavior of the code even when the said behavior isn't directly related to the reported problem, and that there is no need to add a note about possible side effects in the suggestion message. On the other hand, suggestions that produce syntactically invalid code should be avoided.

mdjermanovic

Per the previous comment, we should check if the regex would be syntactically valid with the u flag.

For example, the rule shouldn't provide the u flag suggestion for the following regex literal:

/^[👍]\a$/;

For that purpose, we can use regexpp.RegExpValidator. Here's an example in the prefer-regex-literals rule.
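As a rough illustration of the check being asked for, the sketch below uses the built-in RegExp constructor as a stand-in for regexpp.RegExpValidator. The helper name `isValidWithUnicodeFlag` is hypothetical. The runtime's own regex syntax support may also differ from the ecmaVersion a linted project targets, which is one reason a dedicated validator is likely the better choice inside the rule itself.

```javascript
// Hypothetical helper (assumption): returns true when the pattern still
// compiles after the u flag is added to the existing flags.
function isValidWithUnicodeFlag(pattern, flags = "") {
    const newFlags = flags.includes("u") ? flags : `${flags}u`;

    try {
        new RegExp(pattern, newFlags); // throws SyntaxError on invalid patterns
        return true;
    } catch {
        return false;
    }
}

console.log(isValidWithUnicodeFlag("^[👍]$"));    // true
console.log(isValidWithUnicodeFlag("^[👍]\\a$")); // false: \a is not a valid escape with u
console.log(isValidWithUnicodeFlag("[👍]", "i")); // true
```

A rule could skip the suggestion whenever such a check fails, so that the reported problem is still shown but no invalid rewrite is offered.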

mdjermanovic · 2021-12-20T16:01:57Z

lib/rules/no-misleading-character-class.js

+                            if (node.arguments.length === 1) {
+                                return fixer.insertTextAfterRange(patternNode.range, ', "u"');
+                            }


First argument may be parenthesised:

new RegExp(("[👍]")); // regex is /[👍]/

In that case, the suggested fix wouldn't work as intended:

new RegExp(("[👍]", "u")); // regex is /u/
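The difference can be checked directly: the inner parentheses turn the two strings into a comma expression whose value is its last operand. A small sketch in plain JavaScript, with illustrative variable names:

```javascript
// Inside the extra parentheses, `"[👍]", "u"` is a comma expression,
// so only the string "u" is passed to RegExp as the pattern.
const broken = new RegExp(("[👍]", "u"));

console.log(broken.source); // u
console.log(broken.flags === ""); // true: no flags were passed at all

// The flag has to go after the closing parenthesis of the first
// argument so that it becomes a real second argument.
const intended = new RegExp(("[👍]"), "u");

console.log(intended.source); // [👍]
console.log(intended.flags);  // u
console.log(intended.test("👍")); // true
```

This suggests the inserted text should be placed relative to the call's argument list rather than relative to the range of the first argument node.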

mdjermanovic · 2021-12-20T16:21:55Z

If there is a more narrow fix to suggest, we should also suggest that. Part of the benefit of suggestions is that we can specify multiple options.

This was also discussed in the TSC meeting, and we agreed that it isn't necessary for this PR. It's fine to implement only the u flag suggestion for the start, and (maybe) consider adding other suggestions later.

nzakas · 2022-02-04T01:55:07Z

@mathiasvr are you still working on this?

mathiasvr · 2022-02-04T16:19:25Z

@nzakas Sorry, I don't have time to look into finishing the requested changes at the moment.

mdjermanovic · 2022-05-11T14:15:15Z

Continued in #15867

mathiasvr added 3 commits November 9, 2021 08:57

refactor: Simplify kinds tracking with a Set

9ff21fb

feat: Add fixer for missing regex unicode flag

81f6c21

test: Update tests

36396ba

eslint-github-bot bot added the triage label Nov 9, 2021

mathiasvr changed the title from "feat: Add fixer for missing unicode flag in no-misleading-character-class rule" to "feat: Fixer for missing unicode flag in no-misleading-character-class" Nov 12, 2021

mathiasvr added 2 commits November 25, 2021 02:31

refactor: Change fix to suggestion

2f8f039

Revert "test: Update tests"

de31e68

This reverts commit 36396ba.

btmills requested changes Nov 28, 2021

Address review comments

a69a73b

Add tests

btmills approved these changes Nov 30, 2021

nzakas approved these changes Dec 2, 2021

mdjermanovic added the evaluating, feature, and rule labels and removed the triage label Dec 2, 2021

mdjermanovic reviewed Dec 2, 2021

nzakas added the tsc agenda label Dec 16, 2021

mdjermanovic requested changes Dec 20, 2021

btmills mentioned this pull request Dec 28, 2021

Add warning for suggested changes to rule docs eslint/archive-website#899

Open

Update lib/rules/no-misleading-character-class.js

6a64d01

Co-authored-by: Milos Djermanovic <milos.djermanovic@gmail.com>

mdjermanovic mentioned this pull request Feb 6, 2022

feat: Fix suggestion for "no-template-curly-in-string" #15574

Closed

1 task

mdjermanovic mentioned this pull request Feb 15, 2022

Rule Change: Suggestions for require-unicode-regexp #15089

Closed

1 task

mdjermanovic mentioned this pull request May 11, 2022

feat: add Unicode flag suggestion in no-misleading-character-class #15867

Merged

1 task

mdjermanovic closed this May 11, 2022

eslint-github-bot bot locked and limited conversation to collaborators Nov 8, 2022

eslint-github-bot bot added the archived due to age label Nov 8, 2022
