
Discussion: how to implement Source for lazy readers, like T: Read or T: BufRead #324

Open
jeertmans opened this issue Jun 28, 2023 · 2 comments
Labels
nice to have question Further information is requested

Comments

@jeertmans
Collaborator

Hello all,

I'd like to open this discussion because I think it would be great for Logos to support Source types other than str and [u8], especially lazy readers such as types that implement Read or BufRead.

impl<T: Read> Source for T, or impl<T: Read + Seek> Source for T, would be a game changer for me, as it would allow lexing input without having to load it into memory completely.

I have tried a few different implementations, and I already see some shortcomings that need to be addressed or discussed:

  • Source::len() -> usize should maybe become Source::len_hint() -> Option<usize>.
  • Source::read_* methods would have to take a mutable reference to the reader, but that may reduce performance for types like str or [u8] that do not need mutability.
  • The unsafe methods do not really make sense here, so I don't know how to deal with them (other than copying and pasting the safe equivalents).
  • Reading at an offset position may be a poor fit, especially since it may require Seek::seek. If backtracking is never allowed, then using only read methods should be fine, no?
  • Tokens take a reference to the original source, so I don't know whether implementing Source for Read is enough, because we may lose every reference to the original source. Implementing Source for Bytes may be a solution.
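
To make the first, second and fourth points concrete, here is a rough sketch of what such an interface could look like. Everything in it is hypothetical: LazySource, ReaderSource and read_chunk are names made up for illustration and are not part of Logos or of its Source trait. It assumes forward-only reading, so Seek is not needed, and it returns owned bytes, which sidesteps the reference problem from the last point at the cost of a copy.

```rust
use std::io::{self, Read};

/// Hypothetical trait for illustration only; this is NOT `logos::Source`.
pub trait LazySource {
    /// Total length in bytes if it is known up front, `None` for streams.
    fn len_hint(&self) -> Option<usize>;

    /// Return the next `n` bytes, or fewer if the input ends first.
    /// Takes `&mut self` because reading advances the underlying reader.
    fn read_chunk(&mut self, n: usize) -> io::Result<Vec<u8>>;
}

/// Hypothetical adapter over any `Read`; forward-only, so no `Seek` bound.
pub struct ReaderSource<R> {
    inner: R,
}

impl<R: Read> ReaderSource<R> {
    pub fn new(inner: R) -> Self {
        Self { inner }
    }
}

impl<R: Read> LazySource for ReaderSource<R> {
    fn len_hint(&self) -> Option<usize> {
        // A generic reader cannot tell how much data is left.
        None
    }

    fn read_chunk(&mut self, n: usize) -> io::Result<Vec<u8>> {
        let mut buf = Vec::with_capacity(n);
        // `take` caps the read at `n` bytes; `read_to_end` keeps reading
        // until that cap or end of input is reached.
        self.inner.by_ref().take(n as u64).read_to_end(&mut buf)?;
        Ok(buf)
    }
}

/// In-memory slices know their length, so the hint is exact.
impl LazySource for &[u8] {
    fn len_hint(&self) -> Option<usize> {
        Some(self.len())
    }

    fn read_chunk(&mut self, n: usize) -> io::Result<Vec<u8>> {
        let k = n.min(self.len());
        let (head, tail) = self.split_at(k);
        // Advance the slice past the bytes that were just handed out.
        *self = tail;
        Ok(head.to_vec())
    }
}
```

The slice impl shows where the performance concern comes from: even a source that needs no mutation has to go through &mut self and, in this sketch, a copy.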

My question is then: has anyone already thought about this problem? Does anyone have ideas or suggestions?

@jeertmans jeertmans added question Further information is requested nice to have labels Jun 28, 2023
@Mek101

Mek101 commented Aug 4, 2023

  1. Shouldn't be too much of a blocker. Even when reading from disk, a simple fstat or equivalent shouldn't be too expensive, especially if the result is cached.
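
As a minimal sketch of that idea using only the standard library: File::metadata asks the operating system for the file's metadata (on Unix-like systems this is typically an fstat-style call), and Metadata::len gives the size in bytes. The helper name file_len_hint is hypothetical, and caching the result is left to the caller.

```rust
use std::fs::File;

/// Hypothetical helper: best-effort length of a file-backed source.
/// Returns `None` when the metadata query fails or the handle is not a
/// regular file (pipes and devices have no meaningful size).
pub fn file_len_hint(file: &File) -> Option<u64> {
    let meta = file.metadata().ok()?;
    if meta.is_file() {
        Some(meta.len())
    } else {
        None
    }
}
```

Note that this only covers files; a generic T: Read such as a socket or stdin still has no length, which is why an Option is the natural return type.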

@cliffeh

cliffeh commented Jan 14, 2024

The changes required to make logos itself accept a mutable source are non-trivial, but this might be of interest to you if you're just looking for a way to leverage a logos lexer with a Read/BufRead:

https://github.com/cliffeh/logos-genawaiter/blob/main/src/main.rs

It's not exactly efficient and in its current form only works with line-wise input, but it should give some idea of the possibilities.
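
For readers who just want the general shape of the line-wise approach, here is a simplified sketch using only the standard library. It is not the code from the linked repository: lex_line is a hypothetical stand-in for running a real lexer over one line, and Token is a made-up owned token type. Tokens own their data, so nothing borrows from the line buffer once the line is dropped.

```rust
use std::io::{self, BufRead};

/// Hypothetical owned token; it never borrows from the line buffer.
#[derive(Debug, PartialEq)]
pub enum Token {
    Number(u64),
    Word(String),
}

/// Stand-in for running a real lexer over a single line (not Logos).
fn lex_line(line: &str) -> Vec<Token> {
    line.split_whitespace()
        .map(|s| match s.parse::<u64>() {
            Ok(n) => Token::Number(n),
            Err(_) => Token::Word(s.to_owned()),
        })
        .collect()
}

/// Lex a reader one line at a time. Lines are pulled on demand, so only
/// the current line and its tokens are held in memory.
pub fn lex_reader<R: BufRead>(reader: R) -> impl Iterator<Item = io::Result<Token>> {
    reader.lines().flat_map(|line| -> Vec<io::Result<Token>> {
        match line {
            Ok(text) => lex_line(&text).into_iter().map(Ok).collect(),
            // Surface I/O errors as an item instead of panicking.
            Err(e) => vec![Err(e)],
        }
    })
}
```

The same limitation applies as above: a token that spans a line break cannot be recognised this way, and every token costs an allocation.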

@makapuf makapuf mentioned this issue May 17, 2024