Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vectored I/O with datagrams #1724

Open
kixelated opened this issue Dec 11, 2023 · 6 comments
Open

Vectored I/O with datagrams #1724

kixelated opened this issue Dec 11, 2023 · 6 comments

Comments

@kixelated
Copy link

kixelated commented Dec 11, 2023

I'm adding datagram support to webtransport-quinn. WebTransport uses a session_id VarInt in front of each datagram to disambiguate sessions.

My plan goal is to mirror the Quinn API but have the session_id prefix transparent to the application, something like:

pub async fn send_datagram(&self, data: Bytes) -> Result<(), SessionError> {
    // Unfortunately, we need to allocate/copy each datagram because of the Quinn API.
    let mut buf = BytesMut::with_capacity(self.header_datagram.len() + data.len());

    buf.copy_from_slice(&self.header_datagram); // pre-encoded VarInt
    buf[self.header_datagram.len()..].copy_from_slice(&data);

    self.conn.send_datagram(buf.into())?;
    Ok(())
}

Unfortunately, with the current Quinn API it's impossible to avoid allocating a copy of every datagram.

  • Bytes must be contiguous in memory.
  • Bytes must be on the heap, so I can't make a copy on the stack.
  • Bytes references are held, so I can't reuse the same BytesMut to avoid allocating each time.

The core issue is that there's no vectored IO for datagrams. This wouldn't be a problem with the stream API because you can call write multiple times, but it's a problem for the datagram API because you need the entire payload up front.

@Ralith
Copy link
Collaborator

Ralith commented Dec 11, 2023

I'm not sure allocating a separate Bytes for a VarInt tag will actually improve performance vs. copying, since datagrams have to be fairly small to begin with. Do you have any data?

@kixelated kixelated changed the title send_datagram_chunks Vectored I/O with datagrams Dec 12, 2023
@kixelated
Copy link
Author

I'm not sure allocating a separate Bytes for a VarInt tag will actually improve performance vs. copying, since datagrams have to be fairly small to begin with. Do you have any data?

Sorry, I updated the description with some more information and less of a proposal.

The issue is that I need to allocate a copy of each datagram since there's no way to prepend/append otherwise. I don't have any performance numbers but this would undoubtedly be an issue at high throughput.

@djc
Copy link
Collaborator

djc commented Dec 12, 2023

I think this is already supported with the current API? A Bytes is:

A cheaply cloneable and sliceable chunk of contiguous memory.

That is, it should be possible to allocate a single Bytes for holding all your datagrams, then split it up and pass each part (which is really a view into the same underlying allocation) into Datagrams::send().

@kixelated
Copy link
Author

kixelated commented Dec 12, 2023

I think this is already supported with the current API? A Bytes is:

A cheaply cloneable and sliceable chunk of contiguous memory.

That is, it should be possible to allocate a single Bytes for holding all your datagrams, then split it up and pass each part (which is really a view into the same underlying allocation) into Datagrams::send().

Not quite the same issue. The key thing is that I'm writing a library, so I don't actually create the datagram payload. The application creates the datagram payload and my library prefixes it with the WebTransport header before sending.

You're right that the application could encode the session_id header into each datagram, but I really don't want the application to know anything about WebTransport internals. It's meant to be transparent to the application so the WebTransport API is identical to QUIC.

A similar issue is fragmentation. Let's say I have a 1MB video Bytes blob that I want to fragment into datagrams. You're right that slicing is free, however that doesn't work since each datagram needs some sort of header (ex. sequence number), otherwise reassembly is impossible. You run into the same problem and need to allocate a copy of each datagram on the heap just to encode a small header.

Meanwhile, the streams API lets you avoid this. You can do two separate writes: one for the header and one for the Bytes payload. Or you can use a single write_all_chunks call.

@Ralith
Copy link
Collaborator

Ralith commented Dec 12, 2023

Yeah, that generally makes sense. I'm curious how much the extra API complexity would actually buy us, though. Streams can have much larger individual writes, though I guess in either case you might be sending up to at most the link capacity, so maybe it's not so different...

I'm open to having this.

@kixelated
Copy link
Author

Yeah I don't think there's any urgency for this API. I'm not using datagrams and actively advocate against them.

But this would be a blocker for using datagrams at scale. Allocating on the heap for every packet gets expensive. I'm just going to leave a comment in my library and link to this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants