Backreference language sampling #24

Open
RunDevelopment opened this issue Feb 25, 2021 · 0 comments
Labels
enhancement New feature or request

Comments

@RunDevelopment
Owner

The JS parser can resolve a backreference when (among other conditions) the associated capturing group accepts a small finite language.

E.g. /(a|b)\1/ will be parsed as /aa|bb/
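To make the equivalence concrete, here is a minimal sketch that compares the two patterns with plain JavaScript `RegExp` objects. It does not use refa; the `allStrings` helper and the `^...$` anchors are assumptions added so that whole strings are compared.

```typescript
// Enumerate every string over `alphabet` with length <= maxLen (hypothetical helper).
function allStrings(alphabet: string[], maxLen: number): string[] {
  let current: string[] = [""];
  const result: string[] = [""];
  for (let len = 1; len <= maxLen; len++) {
    const next: string[] = [];
    for (const prefix of current) {
      for (const ch of alphabet) {
        next.push(prefix + ch);
      }
    }
    result.push(...next);
    current = next;
  }
  return result;
}

// Anchored so that the whole string has to match.
const withBackreference = /^(a|b)\1$/;
const resolved = /^(?:aa|bb)$/;

// Strings on which the two patterns disagree; this list should stay empty.
const disagreements = allStrings(["a", "b"], 4).filter(
  (s) => withBackreference.test(s) !== resolved.test(s)
);
console.log(disagreements.length);
```

This is only a bounded check over short strings, not a proof, but it illustrates why the rewrite is safe when the group's language is finite.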

This is quite useful, but it cannot handle capturing groups that accept infinitely many strings.

E.g. /(=+)a\1/ cannot be parsed.

However, in some use cases (such as static analysis), it might be enough to replace the backreference with a sample of the language of the capturing group. The parse result will then only approximate the input RegExp, accepting some but not all of the strings the original accepts, but this may be good enough.

E.g. /(=+)a\1/ might get parsed as /=a=|==a==|===a===|====a====/ depending on the sampling. This isn't equivalent to the original (it misses every string with more than four = on each side) but it might be useful.
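The example above can be sketched as follows. This is not refa code: `escapeRegExp` and `substituteSamples` are hypothetical helpers, and the sample of four words is just the one from the example. The sketch builds the alternation from the sampled words and then shows that the result accepts only part of what the original accepts.

```typescript
// Escape a literal word so it can be embedded in a RegExp source (hypothetical helper).
function escapeRegExp(word: string): string {
  return word.replace(/[.*+?^${}()|[\]\\\/]/g, "\\$&");
}

// Replace the capturing group and its backreference with each sampled word.
// `between` is the pattern source that sits between the group and the backreference.
function substituteSamples(samples: string[], between: string): string {
  return samples
    .map((word) => escapeRegExp(word) + between + escapeRegExp(word))
    .join("|");
}

// Sample of the language of (=+): the words "=", "==", "===", "====".
const samples = [1, 2, 3, 4].map((n) => "=".repeat(n));
const source = substituteSamples(samples, "a");

const original = /^(=+)a\1$/;
const approximation = new RegExp(`^(?:${source})$`);

// Every sampled string is accepted by the original as well.
const sampledStrings = samples.map((word) => word + "a" + word);
const allAccepted = sampledStrings.every(
  (s) => original.test(s) && approximation.test(s)
);

// A string outside the sample is accepted by the original only.
const missed = "=====a=====";
console.log(source, allAccepted, original.test(missed), approximation.test(missed));
```

Under these assumptions the approximation is a subset of the original language, so an analysis built on it can miss matches but will not report strings the original rejects.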

The sampling algorithm has to be provided by the user. Any solution will be imperfect, and refa can't know which tradeoffs are acceptable.
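One possible shape for such a user-provided sampler is a callback, sketched below. Everything here is hypothetical: `GroupSampler` and `repeatSampler` are illustrative names, not part of refa's API, and passing the group as a source string is an assumption made to keep the sketch self-contained.

```typescript
// Hypothetical callback type: given the source of a capturing group, return the
// words to substitute for it, or undefined to decline (the parser would then
// fall back to its existing behavior).
type GroupSampler = (groupSource: string) => string[] | undefined;

// Example sampler: only handles groups of the form `c+` for one literal,
// non-special character c, and samples 1..maxRepeats repetitions.
function repeatSampler(maxRepeats: number): GroupSampler {
  return (groupSource) => {
    const match = /^([^\\.*+?^${}()|[\]])\+$/.exec(groupSource);
    if (!match) {
      return undefined; // not a pattern this sampler understands
    }
    const words: string[] = [];
    for (let n = 1; n <= maxRepeats; n++) {
      words.push(match[1].repeat(n));
    }
    return words;
  };
}

const sampler = repeatSampler(3);
console.log(sampler("=+"), sampler("a*"));
```

Letting the callback decline keeps the tradeoff in the user's hands: a static analysis tool might sample a handful of repetitions, while another user might prefer that such patterns are not approximated at all.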

@RunDevelopment RunDevelopment added the enhancement New feature or request label Feb 25, 2021