Is there a way to decode from byte stream? #517

Open
Evian-Zhang opened this issue Feb 9, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@Evian-Zhang

When implementing an emulator like QEMU (not in KVM mode), there is a decoding stage where each instruction is decoded to determine what to do with it. iced-x86 is the fastest decoder I have come across, which makes it the best choice for this work.

However, in a typical emulator framework, where the memory itself is also emulated, it is hard to extract a &[u8] slice containing the instruction bytes. We CAN access any address with any length by issuing emulated page faults and copying the memory values to a temporary buffer, and I do know that x86 instructions have a maximum length (15 bytes). Still, always fetching a slice of that maximum length may not be best practice, since some of the bytes will almost always be unnecessary.

In QEMU, the self-implemented instruction decoder (you can see it here) uses APIs like x86_ldub_code, which means "load unsigned byte in code section", to get the bytes to decode. This pattern may be more appropriate in this situation. I wonder whether iced-x86 could have a stream decoder that does the same thing, i.e., instead of directly accessing a slice, calls .next() on an iterator of u8 to get the next byte, or peeks at it.
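To make the request concrete, here is a minimal sketch of such a pull-style byte source. Everything in it (`GuestMemory`, `PageFault`, `CodeStream`, `load_code_byte`) is a hypothetical name invented for illustration; none of it is part of iced-x86 or QEMU, and the page-table model is deliberately simplistic.

```rust
use std::collections::HashMap;

const PAGE_SIZE: u64 = 4096;

/// Hypothetical fault raised when the guest touches an unmapped page.
#[derive(Debug, PartialEq, Eq)]
pub struct PageFault {
    pub addr: u64,
}

/// Toy emulated memory: sparse pages keyed by page number (an assumption,
/// real emulators have far more elaborate backends).
#[derive(Default)]
pub struct GuestMemory {
    pages: HashMap<u64, Vec<u8>>,
}

impl GuestMemory {
    /// Store `bytes` starting at `addr`, allocating zeroed pages on demand.
    pub fn write(&mut self, addr: u64, bytes: &[u8]) {
        for (i, &b) in bytes.iter().enumerate() {
            let a = addr + i as u64;
            let page = self
                .pages
                .entry(a / PAGE_SIZE)
                .or_insert_with(|| vec![0; PAGE_SIZE as usize]);
            page[(a % PAGE_SIZE) as usize] = b;
        }
    }

    /// Load one code byte, in the spirit of QEMU's `x86_ldub_code`.
    pub fn load_code_byte(&self, addr: u64) -> Result<u8, PageFault> {
        self.pages
            .get(&(addr / PAGE_SIZE))
            .map(|page| page[(addr % PAGE_SIZE) as usize])
            .ok_or(PageFault { addr })
    }
}

/// Pull-style code stream: each `next()` fetches exactly one byte.
pub struct CodeStream<'m> {
    mem: &'m GuestMemory,
    ip: u64,
}

impl<'m> CodeStream<'m> {
    pub fn new(mem: &'m GuestMemory, ip: u64) -> Self {
        Self { mem, ip }
    }

    /// Address of the next byte that would be fetched.
    pub fn ip(&self) -> u64 {
        self.ip
    }
}

impl Iterator for CodeStream<'_> {
    type Item = Result<u8, PageFault>;

    fn next(&mut self) -> Option<Self::Item> {
        let byte = self.mem.load_code_byte(self.ip);
        // Advance only on success, so a fault can be handled and retried.
        if byte.is_ok() {
            self.ip = self.ip.wrapping_add(1);
        }
        Some(byte)
    }
}
```

Wrapping the stream in `.peekable()` from the standard library would give the "peek the next byte" operation without any extra code. A decoder driven by such an iterator would touch only the bytes it actually consumes.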

@weltkante

weltkante commented Feb 9, 2024

What's preventing you from fetching a bunch of bytes into a buffer and processing that? It's usually the most efficient way to handle things (and actually what most stream implementations do anyway). A stream is just a wrapper around the buffered-read concept: something you can easily implement yourself, and it provides no performance benefit on top of that.

PS: I'm not related to this repo, just subscribed to it.

@Evian-Zhang
Author

@weltkante Thank you for your quick response:)

From my experience dealing with emulated memory models, I cannot come up with an "easily" implemented buffered byte reader for emulated memory. A &[u8] must provide its total length and access to elements at any index, and these operations must perform well. To achieve that, we must copy 16 bytes out of the emulated memory (copying rather than referencing, because the backend may not be real "memory" at all), decode them, get the updated ip, discard the remaining bytes, then copy the next 16 bytes from the emulated memory, handling page faults every time. That is not as easy as the QEMU-style APIs, which fetch only the needed bytes one at a time, with no performance lost on discarded bytes.
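The copy, decode, discard loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `GuestMemory`, `fetch_window`, and `run` are hypothetical names, and `decode_len` is a stand-in closure for a slice-based decoder. With iced-x86 the slice would presumably be handed to `Decoder::with_ip` and the length read from the decoded instruction, but that call is not made here so the sketch stays dependency-free. The window is capped at 15 bytes, the architectural maximum, rather than 16.

```rust
use std::collections::HashMap;

const PAGE_SIZE: u64 = 4096;
/// Architectural upper bound on the length of a single x86 instruction.
const MAX_INSTRUCTION_LEN: usize = 15;

/// Toy emulated memory with sparse pages (hypothetical, for illustration).
#[derive(Default)]
pub struct GuestMemory {
    pages: HashMap<u64, Vec<u8>>,
}

impl GuestMemory {
    /// Store `bytes` starting at `addr`, allocating zeroed pages on demand.
    pub fn write(&mut self, addr: u64, bytes: &[u8]) {
        for (i, &b) in bytes.iter().enumerate() {
            let a = addr + i as u64;
            let page = self
                .pages
                .entry(a / PAGE_SIZE)
                .or_insert_with(|| vec![0; PAGE_SIZE as usize]);
            page[(a % PAGE_SIZE) as usize] = b;
        }
    }

    /// Read one byte; `None` models a page fault on an unmapped page.
    pub fn read_u8(&self, addr: u64) -> Option<u8> {
        self.pages
            .get(&(addr / PAGE_SIZE))
            .map(|page| page[(addr % PAGE_SIZE) as usize])
    }
}

/// Copy up to 15 bytes starting at `ip`, stopping at the first unmapped byte.
pub fn fetch_window(mem: &GuestMemory, ip: u64) -> ([u8; MAX_INSTRUCTION_LEN], usize) {
    let mut buf = [0u8; MAX_INSTRUCTION_LEN];
    let mut len = 0;
    while len < MAX_INSTRUCTION_LEN {
        match mem.read_u8(ip.wrapping_add(len as u64)) {
            Some(b) => {
                buf[len] = b;
                len += 1;
            }
            None => break,
        }
    }
    (buf, len)
}

/// Fetch/decode loop. `decode_len` returns the length of the instruction at
/// the start of the slice, or `None` if it cannot decode one.
/// Returns the final ip and the number of bytes copied but never consumed.
pub fn run<F>(mem: &GuestMemory, mut ip: u64, max_steps: usize, mut decode_len: F) -> (u64, usize)
where
    F: FnMut(&[u8], u64) -> Option<usize>,
{
    let mut wasted = 0;
    for _ in 0..max_steps {
        let (buf, len) = fetch_window(mem, ip);
        match decode_len(&buf[..len], ip) {
            Some(n) if n > 0 && n <= len => {
                wasted += len - n; // bytes fetched for this step and then discarded
                ip = ip.wrapping_add(n as u64);
            }
            _ => break,
        }
    }
    (ip, wasted)
}
```

The `wasted` counter makes the cost under discussion visible: for short instructions most of each window is copied and thrown away, and every step has to deal with a window that may be cut short by an unmapped page.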

I admit that the above explanation is hard to understand; I'm happy to explain any details I did not explain clearly.

@weltkante

weltkante commented Feb 9, 2024

I don't think it'll solve your performance issues if you can't do bulk data transfers, but ignoring that, it sounds like you could just implement a CodeReader subclass and override its ReadByte method? Or use StreamCodeReader? Then use a Decoder.Create overload that takes a CodeReader?

[edit] Thinking about it, you didn't mention which language you're using. For C# the solution seems so obvious that you're probably using a different version of the API?

@Evian-Zhang
Author

Thank you for pointing me to this API. Yes, StreamCodeReader is a perfect fit for my problem. However, I'm using Rust, and the Rust version doesn't have such an API. Strange...
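For reference, the C# shape mentioned above might translate to Rust roughly like this. The names mirror the C# `CodeReader` and `StreamCodeReader` classes, but the trait and adapter below are purely hypothetical; they are not part of the iced-x86 Rust crate, and a real design could differ (for example in how I/O errors are surfaced).

```rust
use std::io::{self, Read};

/// Hypothetical Rust counterpart of the C# `CodeReader` class.
pub trait CodeReader {
    /// Returns the next code byte, or `None` when no more bytes are available
    /// (the C# `ReadByte` signals this with -1).
    fn read_byte(&mut self) -> Option<u8>;
}

/// Hypothetical counterpart of `StreamCodeReader`: adapts any `std::io::Read`.
pub struct StreamCodeReader<R: Read> {
    inner: R,
    last_error: Option<io::Error>,
}

impl<R: Read> StreamCodeReader<R> {
    pub fn new(inner: R) -> Self {
        Self { inner, last_error: None }
    }

    /// Take the I/O error, if any, that ended the stream early.
    pub fn take_error(&mut self) -> Option<io::Error> {
        self.last_error.take()
    }
}

impl<R: Read> CodeReader for StreamCodeReader<R> {
    fn read_byte(&mut self) -> Option<u8> {
        let mut byte = [0u8; 1];
        loop {
            match self.inner.read(&mut byte) {
                Ok(0) => return None, // end of stream
                Ok(_) => return Some(byte[0]),
                // Retry reads interrupted by a signal.
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => {
                    // Remember the error and report "no more bytes".
                    self.last_error = Some(e);
                    return None;
                }
            }
        }
    }
}
```

An emulator could implement `CodeReader` directly on its memory model instead of going through `Read`, which is essentially the byte-at-a-time pattern asked for in the original post.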

@wtfsck
Member

wtfsck commented Feb 9, 2024

Yes, it's not possible at the moment with the Rust version, but it's something useful, so I will add it to my to-do list.

@wtfsck wtfsck added the enhancement New feature or request label Feb 9, 2024

3 participants