Add support for comments in lrlex files #403

Open
FranklinChen opened this issue May 13, 2023 · 3 comments

Comments

@FranklinChen

I started using lrlex and didn't find a formal definition of its file format but it seems there is no way to write comments in .l files. I think that would be a useful feature to support.

@ratmice
Collaborator

ratmice commented May 13, 2023

Generally lrlex has focused on following the POSIX lex specification (but hasn't implemented comments yet, if I recall).
In my reading I didn't see any specific documentation of the comment format in lex. It is a little awkward: it generally mimics C-style comments, but a comment must be preceded by whitespace to avoid ambiguity with regexes, except where it sits inside a block of C code that gets copied verbatim.

Here are some examples, the latter showing some of the cases where initial whitespace is not required:
https://cs.gmu.edu/~henryh/330/Lex/comments.html
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/lex.html#tag_20_65_17
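
To make the whitespace requirement concrete, here is a minimal sketch of how a line could be recognised as a comment. The helper `is_indented_comment` is hypothetical and not part of lrlex, and it only handles comments that open and close on a single line:

```rust
/// Hypothetical helper (not part of lrlex): returns true when `line` is an
/// indented C-style comment that opens and closes on the same line.
fn is_indented_comment(line: &str) -> bool {
    // The line must begin with a blank (space or tab); otherwise a leading
    // `/` could just as well be the start of a rule's regex.
    let starts_with_blank = line.starts_with(' ') || line.starts_with('\t');
    let body = line.trim();
    // Require at least `/**/` so that the opening and closing delimiters
    // cannot overlap (e.g. `/*/` is rejected).
    starts_with_blank && body.len() >= 4 && body.starts_with("/*") && body.ends_with("*/")
}
```

Under this sketch an unindented `/* ... */` line is not treated as a comment, which matches the ambiguity described above; multi-line comments and `%{ ... %}` blocks would need extra state.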

@ltratt
Member

ltratt commented May 13, 2023

Agreed, it would be good if lrlex supported comments, and I'd happily take a PR to that effect! If flex supports them, I might also be inclined to support // comments, but I don't feel strongly about it.

@ratmice
Collaborator

ratmice commented Aug 22, 2023

So, in connection with #325, I noticed the following text in the POSIX lex spec:

Any such input (beginning with a <blank> or within "%{" and "%}" delimiter lines) appearing at the beginning of the Rules section before any rules are specified shall be written to lex.yy.c

This is relevant to this issue: it would currently be easy enough to just ignore any line which starts with a space.
If we instead tried to emit these lines verbatim into generated sources, they would end up in the middle of a vec![ Rule::new(), ..].
To actually emit them we'd need to change uses of struct Rule to something like enum RuleOrVerbatim<StorageT>{ Verbatim(String), Rule(Rule<StorageT>)}, as there currently isn't anywhere for them in our AST of the lex source format.

I believe Rule is public but #[doc(hidden)] and otherwise documented as unstable, so perhaps changing it to an enum is acceptable. But let me know if there are preferences here between emitting these verbatim or ignoring them entirely.
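
A rough sketch of what the enum approach might look like. The `Rule` struct here is a simplified stand-in rather than lrlex's actual definition, and `classify_line` is a hypothetical helper that assumes each rule line is a regex followed by a single space and then either a quoted name or `;`:

```rust
/// Simplified stand-in for lrlex's `Rule`; the real struct has more fields.
#[derive(Debug, PartialEq)]
struct Rule<StorageT> {
    tok_id: Option<StorageT>,
    name: Option<String>,
    re_str: String,
}

/// Either text to carry through verbatim or an ordinary lexing rule.
#[derive(Debug, PartialEq)]
enum RuleOrVerbatim<StorageT> {
    Verbatim(String),
    Rule(Rule<StorageT>),
}

/// Hypothetical helper: classify one line of the rules section.
/// Lines that start with a blank become `Verbatim`; anything else is
/// treated as `<regex> <name>`, where a name of `;` means "no token".
fn classify_line<StorageT>(line: &str) -> Option<RuleOrVerbatim<StorageT>> {
    if line.trim().is_empty() {
        return None; // nothing to record for empty lines
    }
    if line.starts_with(' ') || line.starts_with('\t') {
        return Some(RuleOrVerbatim::Verbatim(line.trim().to_string()));
    }
    let line = line.trim_end();
    // Split at the last space so that spaces inside the regex survive.
    let (re_str, name) = match line.rsplit_once(' ') {
        Some((re, name)) => (re.trim_end(), name),
        None => (line, ";"),
    };
    let name = if name == ";" {
        None
    } else {
        Some(name.trim_matches('"').to_string())
    };
    Some(RuleOrVerbatim::Rule(Rule {
        tok_id: None, // token ids are assigned elsewhere; left unset here
        name,
        re_str: re_str.to_string(),
    }))
}
```

Ignoring such lines entirely would amount to dropping the `Verbatim` variant and returning `None` in that branch, which leaves `Rule` unchanged.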
