Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add function for which characters an FSM's input must contain to match #462

Open
silentbicycle opened this issue Apr 22, 2024 · 0 comments

Comments

@silentbicycle
Copy link
Collaborator

silentbicycle commented Apr 22, 2024

It would be useful to have a function that populates a u64bitset with which characters must appear in the input to possibly match an FSM. This wouldn't track the relative order of those characters occurring, just that if bitset[N] is set and character/octet N does not appear in the input at all, it can never match.

This could be based on either the regex AST or walking the FSM. With the FSM it would benefit from fsm_trims's shortest_end_distance analysis -- I think the intersection of the bitsets from all single-label edges on the shortest reverse path(s) from every endstate to the start would be one way to do it (for paths that cannot be skipped via a longer route), though not necessarily the simplest/cheapest.

@silentbicycle silentbicycle changed the title Add function for which characters FSMs's input must contain to match Add function for which characters an FSM's input must contain to match Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant