Problems with regexp literals in JS #2154

Closed
astynk opened this issue Jan 4, 2020 · 2 comments


@astynk
astynk commented Jan 4, 2020

In JavaScript, regexp literals followed by an operator or a comment are not highlighted properly. Try these:

let a = /regex/ // comment
let b = condition ? /regex/ : /another one/

@astynk
Author

astynk commented Jan 5, 2020

Yet another bug I just found:

if (a) /regex/

When a regexp literal is preceded by a closing parenthesis, the script parses it as a division operator.
I'm currently writing a syntax highlighter, too, and these regexp literals are just a pain in the neck. It seems you'd have to do a complete analysis of the source code (like browsers do) to handle all possible cases.

@RunDevelopment
Member

Thanks for reporting!

I fixed the first two examples in #2158, but the if one will be impossible to get right because Prism would have to detect that the parentheses belong to an if statement, which isn't possible for us. Unfortunately, we can't allow the ) unconditionally because of expressions like let a = 5*(1+1)/3/m;.
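To make the ambiguity concrete, here is a minimal sketch of a "look at the previous character" heuristic. The helper `slashStartsRegex` is hypothetical and written only for illustration; it is not Prism's actual pattern. It gets the assignment and the arithmetic case right, but it has to classify the slash after `if (a)` as division too, because both cases look identical at the `)`.

```javascript
// Hypothetical helper (an assumption for illustration, not Prism's code):
// guess whether the `/` at `slashIndex` starts a regexp literal by looking
// only at the previous non-whitespace character.
function slashStartsRegex(source, slashIndex) {
  let i = slashIndex - 1;
  while (i >= 0 && /\s/.test(source[i])) {
    i--; // skip whitespace before the slash
  }
  if (i < 0) {
    return true; // nothing before the slash, so it cannot be a division
  }
  // After an identifier character, a digit, `)` or `]`, assume division.
  return !/[\w$)\]]/.test(source[i]);
}

const assignment = "let a = /regex/ // comment";
const arithmetic = "let a = 5*(1+1)/3/m;";
const ifStatement = "if (a) /regex/";

console.log(slashStartsRegex(assignment, assignment.indexOf("/")));   // true  (correct)
console.log(slashStartsRegex(arithmetic, arithmetic.indexOf("/")));   // false (correct, division)
console.log(slashStartsRegex(ifStatement, ifStatement.indexOf("/"))); // false (wrong, this is a regexp)
```

Flipping the rule for `)` would fix the last line but break the arithmetic one, which is why the closing parenthesis cannot be allowed unconditionally.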

We could try to parse the expression, but this will be hard because nested expressions are context-free (CF), so a regular expression can't match them exactly. We can approximate it with a little effort, though, and check whether the token before the parenthesized expression is a statement keyword like if, for, etc., and not something like await. The resulting pattern would then look something like this:

// the lookbehind is to ensure that we don't have a function name (e.g. foo_if() )
/(?<![$\w ...other chars])(?:if|for|while|...)\s*\((?:<comments>|<strings>|<other>|\(<recursive part>\))*\)\s*<regex>/

So it's approximately possible (we can only match a finite number of nested parentheses), but it will also cause the regex pattern to be a lot slower than it has to be, just for one rare edge case.
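A rough sketch of what such a bounded approximation could look like is below. Everything here is an assumption made for illustration: the function names `nestedParens` and `regexAfterStatement` are made up, the keyword list is only the three keywords named above, and comments, strings, and character classes inside the parentheses or the regexp are ignored. It also relies on lookbehind support (ES2018).

```javascript
// Pattern source for one parenthesised group containing at most `depth`
// further levels of nested parentheses. Real nesting is unbounded, so this
// is only an approximation.
function nestedParens(depth) {
  let inner = String.raw`[^()]*`;
  for (let i = 0; i < depth; i++) {
    inner = String.raw`(?:[^()]|\(${inner}\))*`;
  }
  return String.raw`\(${inner}\)`;
}

// Only the keywords named in this thread; the full list is left open.
const STATEMENT_KEYWORDS = "if|for|while";

// Simplified regexp literal: no support for `/` inside character classes.
const REGEX_LITERAL = String.raw`\/(?:[^/\\\r\n]|\\.)+\/[a-z]*`;

// Matches `<keyword> ( ... ) /regex/flags` and captures the regexp literal.
function regexAfterStatement(depth) {
  return new RegExp(
    // the lookbehind rejects identifiers such as foo_if
    String.raw`(?<![$\w])(?:${STATEMENT_KEYWORDS})\s*` +
      nestedParens(depth) +
      String.raw`\s*(${REGEX_LITERAL})`
  );
}

const pattern = regexAfterStatement(2);
console.log(pattern.exec("if (a) /regex/")[1]);        // "/regex/"
console.log(pattern.exec("while (f(g(x))) /re/g")[1]); // "/re/g"
console.log(pattern.exec("let a = 5*(1+1)/3/m;"));     // null
console.log(pattern.exec("foo_if (a) /regex/"));       // null

// One level of nesting is not enough for f(g(x)) inside the condition.
console.log(regexAfterStatement(1).exec("while (f(g(x))) /re/g")); // null
```

The pattern source grows with every extra nesting level, and any input nested deeper than the chosen depth is still missed.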


I'll close the issue because, as explained, the first two issues will be resolved soon and the last one isn't worth it for just syntax highlighting.
