✨ Prepare tokenizers for stringMatching #3920

Merged
merged 12 commits into from
May 29, 2023

Conversation

dubzzz
Owner

@dubzzz dubzzz commented May 29, 2023

In order to be able to implement a stringMatching arbitrary as requested in #2980, we first need to be able to understand a regex. Understanding a regex can be achieved by tokenizing it.

This PR first adds a basic regex tokenizer that reads a regex and translates it into an AST. This AST will be the entry point of our stringMatching. So far, the tokenizer performs poorly on square-bracket and parenthesized expressions, as well as in unicode mode, but work is ongoing to fully support them.
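
To illustrate the idea of turning a regex source into an AST, here is a minimal sketch. It is not the tokenizer added by this PR: the `RegexNode` type and the `tokenizeSimpleRegex` function are hypothetical names made up for this example, and it only assumes support for literal characters, `.`, the quantifiers `*`, `+`, `?`, and top-level `|`. Anything else (brackets, parentheses, escapes) is rejected rather than guessed at.

```typescript
// Hypothetical AST shape for illustration only, not fast-check's internal one.
type RegexNode =
  | { type: 'Char'; value: string }
  | { type: 'AnyChar' }
  | { type: 'Repeat'; min: number; max: number; body: RegexNode }
  | { type: 'Sequence'; items: RegexNode[] }
  | { type: 'Alternative'; options: RegexNode[] };

// Reads a very small subset of regex syntax and returns its AST.
// Throws on syntax this sketch does not handle.
function tokenizeSimpleRegex(source: string): RegexNode {
  const options: RegexNode[] = []; // completed branches of a top-level `|`
  let items: RegexNode[] = []; // nodes of the branch currently being read
  for (let i = 0; i < source.length; i++) {
    const c = source[i];
    if (c === '|') {
      // Close the current branch and start a new one
      options.push({ type: 'Sequence', items });
      items = [];
    } else if (c === '*' || c === '+' || c === '?') {
      // A quantifier wraps the node that immediately precedes it
      const body = items.pop();
      if (body === undefined) throw new Error(`Nothing to repeat at index ${i}`);
      const min = c === '+' ? 1 : 0;
      const max = c === '?' ? 1 : Infinity;
      items.push({ type: 'Repeat', min, max, body });
    } else if (c === '.') {
      items.push({ type: 'AnyChar' });
    } else if ('()[]{}\\^$'.includes(c)) {
      // Deliberately unsupported in this sketch
      throw new Error(`Unsupported syntax '${c}' at index ${i}`);
    } else {
      items.push({ type: 'Char', value: c });
    }
  }
  options.push({ type: 'Sequence', items });
  return options.length === 1 ? options[0] : { type: 'Alternative', options };
}
```

With this sketch, `tokenizeSimpleRegex('ab*|c')` returns an `Alternative` node with two `Sequence` options; the first one holds the `Char` node for `a` followed by a `Repeat` node wrapping `b`. A generator of matching strings could then walk such a tree node by node.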

Category:

  • ✨ Introduce new features
  • 📝 Add or update documentation
  • ✅ Add or update tests
  • 🐛 Fix a bug
  • 🏷️ Add or update types
  • ⚡️ Improve performance
  • Other(s): ...

Potential impacts:

  • Generated values
  • Shrink values
  • Performance
  • Typings
  • Other(s): ...

We initially did not want to go for regexes, as they were too rich and would thus have required lots of things to be implemented and carefully checked. But as globs were not really designed for string matching, we went back to regexes.


dubzzz added 7 commits May 29, 2023 11:15
@dubzzz dubzzz changed the title ✨ Prepare tokenizers for stringMatching ✨ Prepare tokenizers for stringMatching May 29, 2023

codesandbox-ci bot commented May 29, 2023

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

Latest deployment of this branch, based on commit 6f2a9c8:

Sandbox Source
Vanilla Configuration

dubzzz added 5 commits May 29, 2023 13:37


codecov bot commented May 29, 2023

Codecov Report

Merging #3920 (6f2a9c8) into main (1287515) will decrease coverage by 0.24%.
The diff coverage is 89.83%.

@@            Coverage Diff             @@
##             main    #3920      +/-   ##
==========================================
- Coverage   95.16%   94.92%   -0.24%     
==========================================
  Files         205      207       +2     
  Lines        5314     5560     +246     
  Branches     1123     1230     +107     
==========================================
+ Hits         5057     5278     +221     
- Misses        241      266      +25     
  Partials       16       16              
| Flag | Coverage Δ |
| --- | --- |
| unit-tests | 94.92% <89.83%> (-0.24%) ⬇️ |
| unit-tests-14.x-Linux | ? |
| unit-tests-16.x-Linux | 94.92% <89.83%> (-0.24%) ⬇️ |
| unit-tests-18.x-Linux | 94.92% <89.83%> (-0.24%) ⬇️ |
| unit-tests-latest-Linux | 94.92% <89.83%> (-0.24%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
| --- | --- |
| ...heck/src/arbitrary/_internals/helpers/ReadRegex.ts | 88.13% <88.13%> (ø) |
| .../src/arbitrary/_internals/helpers/TokenizeRegex.ts | 91.40% <91.40%> (ø) |

@dubzzz dubzzz merged commit 768d96d into main May 29, 2023
@dubzzz dubzzz deleted the string-matching branch May 29, 2023 14:17