Unhandled TOK_ERROR for incompletely parsed unsupported backtracking control flags #386

Open
silentbicycle opened this issue Oct 3, 2022 · 1 comment

silentbicycle commented Oct 3, 2022

The PCRE dialect's lexer and parser hit an assertion with the input '(*:':

$ build/bin/re -rpcre '(*:'
re: src/libre/parser.act:1108: struct ast *parse_re_pcre(re_getchar_fun *, void *, const struct fsm_options *, enum re_flags, int, struct re_err *): Assertion `!"unreached"' failed.

This appears to be because the lexer expects a ')' to close the backtracking control flags: if the '(*' .. ')' {} block in lx never gets the closing ')', it just returns TOK_ERROR. The parser does not have a catch-all error handler at the right layer -- the very first thing the generated parser's entry point does is return if the current terminal is the error token. That early return bypasses the other error handlers, so no error code is set, and the assertion that parsing either produced an expression or set an error code fails shortly after.
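A minimal sketch of the failure mode described above. This is a toy model, not libfsm's actual lexer or parser: lex_first, parse_model, invariant_holds and the struct are hypothetical names made up for illustration, and only TOK_ERROR is borrowed from the issue.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-ins; not libfsm's real token definitions. */
enum tok { TOK_CHAR, TOK_CONTROL, TOK_ERROR };

struct result {
	int have_expr; /* nonzero if an expression was produced */
	int err;       /* error code reported to the caller, 0 = none */
};

/* Lex the first token: "(*" must be closed by ")" or the lexer gives up. */
static enum tok
lex_first(const char *s)
{
	if (s[0] == '(' && s[1] == '*') {
		return strchr(s + 2, ')') != NULL ? TOK_CONTROL : TOK_ERROR;
	}
	return TOK_CHAR;
}

/* Models the generated entry point: it returns on the error token
 * before any error handler has had a chance to set a code. */
static struct result
parse_model(const char *s)
{
	struct result r = { 0, 0 };

	if (lex_first(s) == TOK_ERROR) {
		return r; /* no expression and no error code */
	}
	r.have_expr = 1;
	return r;
}

/* The invariant behind the failing assertion: either an expression
 * was produced, or an error code was set. */
static int
invariant_holds(struct result r)
{
	return r.have_expr || r.err != 0;
}

/* Usage: only the unterminated input breaks the invariant. */
void
demo(void)
{
	assert(!invariant_holds(parse_model("(*:")));
	assert(invariant_holds(parse_model("(*:x)")));
	assert(invariant_holds(parse_model("abc")));
}
```

In this model the unterminated '(*:' is the only input that comes back with neither an expression nor an error code, which is the state the real assertion rejects.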

I spent a bit of time trying to flag the error properly in the parser, but I'm not that familiar with sid. Adding a TOK_ERROR check in parser.act between ADVANCE_LEXER; and DIALECT_ENTRY catches it, but seems hacky. There also doesn't seem to be a clearly appropriate error code -- RE_EXSUB ("expected sub-expression") may be the closest.
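The shape of that workaround can be sketched in a toy model. Everything here is an assumption for illustration: advance_lexer and dialect_entry only mimic the roles of ADVANCE_LEXER and DIALECT_ENTRY, MODEL_EXSUB stands in for RE_EXSUB, and none of this is the real parser.act code.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-ins named after identifiers in this issue; the real
 * definitions in libfsm are not reproduced here. */
enum tok { TOK_CHAR, TOK_CONTROL, TOK_ERROR };
enum model_err { MODEL_OK, MODEL_EXSUB }; /* MODEL_EXSUB plays the role of RE_EXSUB */

struct lexer {
	const char *s;
	enum tok cur;
};

/* Models ADVANCE_LEXER: "(*" with no closing ")" lexes as TOK_ERROR. */
static void
advance_lexer(struct lexer *lx)
{
	if (lx->s[0] == '(' && lx->s[1] == '*') {
		lx->cur = strchr(lx->s + 2, ')') != NULL ? TOK_CONTROL : TOK_ERROR;
	} else {
		lx->cur = TOK_CHAR;
	}
}

/* Models the generated entry point: returns 0 (no expression) on the
 * error token without touching the error code. */
static int
dialect_entry(const struct lexer *lx)
{
	return lx->cur != TOK_ERROR;
}

int
parse_with_check(const char *s, enum model_err *e)
{
	struct lexer lx;
	int ok;

	lx.s = s;
	lx.cur = TOK_CHAR;
	*e = MODEL_OK;

	advance_lexer(&lx);

	/* The workaround: flag the error token before entering the grammar. */
	if (lx.cur == TOK_ERROR) {
		*e = MODEL_EXSUB;
		return 0;
	}

	ok = dialect_entry(&lx);
	assert(ok || *e != MODEL_OK); /* same invariant as the failing assertion */
	return ok;
}
```

With the check in place, the entry point never sees the error token, so the invariant holds for every input in this model. It does not settle whether RE_EXSUB is the right code to report.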

@silentbicycle

Similarly, the intermediate steps for other multi-character tokens defined with .. will trigger the same assert -- '(*C' (for '(*CR)'), '(*P', '(*pos', '(?<', and so on. A catch-all error handler for TOK_ERROR should address all of these.
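Since all of these inputs should end up on the same error path, a table-driven check is one way to confirm a catch-all handler covers them. The sketch below is hypothetical: parse_fn is an assumed interface (nonzero return means an expression was produced, *err is 0 when no error code was set), and the two stubs only stand in for a parser without and with the handler. The inputs are the ones quoted in this issue.

```c
#include <stddef.h>

/* Inputs quoted in this issue; each stops partway through a
 * multi-character token. */
static const char *const partial_inputs[] = {
	"(*:", "(*C", "(*P", "(*pos", "(?<"
};

/* Assumed parser interface, for illustration only. */
typedef int parse_fn(const char *re, int *err);

/* Count inputs that leave the parser with neither an expression
 * nor an error code, i.e. the state that trips the assertion. */
size_t
count_unflagged(parse_fn *parse)
{
	size_t i, bad = 0;

	for (i = 0; i < sizeof partial_inputs / sizeof *partial_inputs; i++) {
		int err = 0;
		if (!parse(partial_inputs[i], &err) && err == 0) {
			bad++;
		}
	}
	return bad;
}

/* Stub: behaves like the parser without a catch-all handler. */
int
stub_silent(const char *re, int *err)
{
	(void) re;
	(void) err;
	return 0;
}

/* Stub: behaves like a parser that flags every failure. */
int
stub_flagging(const char *re, int *err)
{
	(void) re;
	*err = 1;
	return 0;
}
```

Wired up to the real parser, a nonzero count would mean at least one partial token still bypasses the error handlers.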

silentbicycle added a commit to fastly/libfsm that referenced this issue Feb 16, 2023
See katef#386 on katef/libfsm.

This is a workaround for a bug in the parser -- once the fuzzer
finds it, it tends to get in the way of finding deeper issues.
silentbicycle added a commit that referenced this issue Apr 21, 2023
See #386 on katef/libfsm.

This is a workaround for a bug in the parser -- once the fuzzer
finds it, it tends to get in the way of finding deeper issues.
silentbicycle added a commit that referenced this issue Apr 24, 2023
katef pushed a commit that referenced this issue Apr 24, 2023