Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(v14): Make character class sorting in regex-shorthand optional or at least documented #453

Closed
brettz9 opened this issue Nov 30, 2019 · 6 comments · Fixed by #468
Closed

Comments

@brettz9
Copy link
Contributor

brettz9 commented Nov 30, 2019

Hi,

With the latest 14.0.0, I'm getting the following false positive for regex-shorthand (besides those reported).

The rule is expecting sorting within character classes, though this is not advertised as being part of the rule in the docs. E.g., /#([^)'"]+)/ can be optimized to /#([^"')]+)/ . I think this should be either advertised as such, or, better yet, made into an option, since it is sometimes clearer to make one's own groupings without sorting, e.g., [AaQqTt].

Thanks!

@brettz9 brettz9 changed the title (v14): (v14): Make character class sorting in regex-shorthand optional or at least reported Nov 30, 2019
@brettz9 brettz9 changed the title (v14): Make character class sorting in regex-shorthand optional or at least reported (v14): Make character class sorting in regex-shorthand optional or at least documented Nov 30, 2019
@fisker
Copy link
Collaborator

fisker commented Nov 30, 2019

We are using regexp-tree to optimize them since 14.0.0, unfortunately seems there is no option to avoid sorting character class https://github.com/DmitrySoshnikov/regexp-tree/blob/master/src/optimizer/README.md#transforms

I guess we can document this.

@sindresorhus
Copy link
Owner

Yes, let's document it.

@brettz9
Copy link
Contributor Author

brettz9 commented Dec 1, 2019

@fisker : Thank you for the background info.

I've filed DmitrySoshnikov/regexp-tree#199 in the hopes it may be possible to configure in the future.

@voxpelli
Copy link
Contributor

voxpelli commented Dec 3, 2019

I agree that this is problematic.

One part of it is that different languages sort characters in different ways. Eg. in Swedish åäö is last in the alphabet and in that order and this rule eg. wants to resort åä to äå. To makes matter even more complicated, German sorts those letters after their dot-less equivalents, so ä after a and ö after o – so one universal true rule is not really possible when sorting purely on letters, unless one wants to enforce a "true" sorting upon everyone else.

I've now disabled this rule on our projects, which is a pity, because it has lots of other nice things to it than the sorting.

@voxpelli
Copy link
Contributor

voxpelli commented Dec 3, 2019

I did find a way to deactivate character sorting @fisker, you can find it in #468

@fungiboletus
Copy link

fungiboletus commented Dec 3, 2019

May I suggest to disable character class sorting by default ? The optimisation seems unnecessary and it makes regexps less readable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants