Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zero-copy encoding and decoding #18

Open
vincentdephily opened this issue Oct 25, 2019 · 4 comments
Open

Zero-copy encoding and decoding #18

vincentdephily opened this issue Oct 25, 2019 · 4 comments

Comments

@vincentdephily
Copy link
Collaborator

It might be possible to do some zero-copy encoding or decoding, especially of the publish payload. Bytes can do some refcounting behind the scene, so if we switch publish.payload from a Vec<u8> to a Bytes we should only have to create the slice, not its content.

We should revise whether encode() and decode() should take an impl IntoBuf and impl IntoBufMut rather than a straight BytesMut.

Last but not least, this all needs to be benchmarked.

@vincentdephily
Copy link
Collaborator Author

@00imvj00 Please take a look at vincentdephily@71627f9

I've spent some time trying to convert to impl Trait but have hit a few walls. We've got a few options:

  • Stop trying to be flexible, and embrace BytesMut everywhere.
  • Wait for (or help) the Bytes crate to switch the specified bugs.
  • Use the .bytes() workaround and document that only continuous-memory backends may be used (or if upstream creates a Trait for that, use it).
  • Use a different crate altogether, or reinvent the wheel.

I'm currently leaning towards option 2, but would like some opinions. I'd also be ok with throwing that commit away and go with option 1.

Note that one thing we need to do to get toward zero-copy is convert some Packet members (Connect.password, Publish.payload...) from Vec<u8> to Bytes (or use &[u8] and deal with the shared lifetime between Packet and its source/destination buffer). If that's where we're going, maybe it's silly to try to support other Buf backends.

@00imvj00
Copy link
Owner

For Zero-copy, instead of this, how about we keep references of offset?

For example: If the overall packet length is 64 bytes, we store these bytes as byte-array internally and then for the password, instead of a string, we just store something like, from index to index, and build method get_password() -> &str, where we will return the reference.

@vincentdephily
Copy link
Collaborator Author

A Bytes struct obtained from Bytes::slice*() or Bytes::split*() is pretty-much just a reference inside a &[u8], with some refcounting to figure out when the actual data can be dropped. I'm just not sure how it handles references to old data. Is it viable to keep using the same BytesMut to receive data from the network and then decode(), yielding a stream of Packet ? Or would holding a Byte from the connect packet force the buffer to grow forever ? Should we change the API to decode(Bytes) -> Result<Packet,Error> (forcing a .freeze()) to avoid that footgun ?

I'm going to dogfood that in different contexts and see, but it's going to take a while.

@00imvj00
Copy link
Owner

Yup, at this point we can experiment with couple of approaches and see which feels ergonomic.

Again, the goal is to keep is simple and make it super performant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants