
POC: outbound protobuf zero-copy take 2 #7369

Open · wants to merge 1 commit into base: master
Conversation

@njhill (Contributor) commented Aug 27, 2020

Variation on #7350. Slightly bigger, but I think the interface changes are cleaner; it avoids "borrowing" a ByteBuffer and separately informing of bytes written. Not sure which approach is preferable.

@njhill njhill force-pushed the zero-out-2 branch 2 times, most recently from 16a62b2 to 5eb0dc6 Compare August 28, 2020 03:43
@ericgribkoff (Contributor) commented:

Hi @njhill. I see this and the "Random acts of garbage reduction" PR open, plus v1 of this PR still marked as a draft. Would you like me to find someone to review these, or is this still work-in-progress? Thanks!

@njhill (Contributor, Author) commented Aug 28, 2020

@ericgribkoff yes they are all ready for review... I un-drafted the other one. Thanks!

@ericgribkoff added the kokoro:run label Aug 28, 2020
@grpc-kokoro removed the kokoro:run label Aug 28, 2020
@ejona86 (Member) commented Aug 28, 2020

CC @voidzcy

@ejona86 (Member) left a comment

So there are two different types of data, and you can only trivially have zero-copy output for one type at a time: protobuf primitives and ByteStrings. This provides zero-copy for protobuf primitives, as they can be written directly to the output buffer when encoding.

But for ByteString zero-copy see CodedOutputStream.ByteOutputEncoder. That "allows" (not actually exposed) using the ByteString's buffers directly. It works in the opposite direction (similar to how Drainable is the opposite of InputStream). It would be possible to get zero-copy primitives as well, but it really needs some sort of buffer allocator and then you need buffer lifetimes. I think the allocator thing is why writeLazy exists in ByteOutput, as the "lazy" buffers are having their ownership passed to the ByteOutput. But the API is not quite enough to use Netty's buffer pool, I believe.

This seems pretty fair, although it will only benefit small messages. Zero-copy typically becomes more important for larger messages and then may need to consider segmentation. We would want to see some positive impact in TransportBenchmark. I wouldn't be surprised if most of the benefit is reduced GC.

I'd sort of be interested what would happen if we just added the WritableBuffer.write(InputStream) method and used heap buffers with Netty instead of direct buffers. (Direct buffers for large things makes a lot of sense. For small things it is less clear.)
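That WritableBuffer.write(InputStream) idea over heap buffers could be sketched with plain JDK types. This is only an illustration, not code from the PR; the class and method names below are hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical sketch of a write(InputStream) helper that copies a stream
// straight into a heap-backed ByteBuffer's backing array, avoiding an
// intermediate byte[] copy.
public final class HeapBufferWrite {
  // Reads from the stream into the buffer's backing array, up to the
  // buffer's remaining capacity; returns the number of bytes written.
  static int write(InputStream in, ByteBuffer dest) throws IOException {
    int n = in.read(dest.array(), dest.arrayOffset() + dest.position(), dest.remaining());
    if (n > 0) {
      dest.position(dest.position() + n);
    }
    return Math.max(n, 0);
  }

  public static void main(String[] args) throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(16); // heap buffer, not direct
    int n = write(new ByteArrayInputStream("hello".getBytes()), buf);
    System.out.println(n + " " + new String(buf.array(), 0, buf.position()));
  }
}
```

Note this only works for heap buffers (`hasArray()` true); a direct buffer would still need an intermediate copy, which is part of the trade-off being discussed.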

```java
 * @param dest the destination buffer to receive the bytes.
 * @throws BufferOverflowException if destination buffer has insufficient remaining bytes
 */
int drainTo(ByteBuffer dest) throws IOException;
```
Member commented:
Instead of calling the method "drain" would it be more apt to call it "read"? That would make the logic very similar to read(byte[] b, int off, int len).

I'm not wild about the BufferOverflowException and the transferring of all bytes. I see little reason to limit the API to reading all the bytes, and it means data is invalidated if an exception is thrown. While having to call read() multiple times is annoying, it is that way for a reason.
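The conventional contract being referred to, mirroring read(byte[] b, int off, int len), can be illustrated with a JDK-only sketch (names hypothetical, not from the PR): each call fills at most the buffer's remaining space, and the caller loops:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Sketch of a partial-read contract: fills at most dest.remaining() bytes
// and returns the count, or -1 at end of stream, like
// InputStream.read(byte[], off, len).
public final class PartialRead {
  static int read(InputStream in, ByteBuffer dest) throws IOException {
    byte[] tmp = new byte[dest.remaining()];
    int n = in.read(tmp);
    if (n > 0) {
      dest.put(tmp, 0, n);
    }
    return n;
  }

  public static void main(String[] args) throws IOException {
    InputStream in = new ByteArrayInputStream(new byte[10]);
    ByteBuffer small = ByteBuffer.allocate(4); // deliberately too small
    int total = 0;
    int n;
    while ((n = read(in, small)) != -1) { // caller loops until EOF
      total += n;
      small.clear(); // reuse the small buffer for the next chunk
    }
    System.out.println(total); // prints 10
  }
}
```

The point is that a too-small destination buffer is handled by looping rather than by throwing, at the cost of an extra call per chunk.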

Contributor (Author) commented:

I don't disagree, and this is actually what I started with, for exactly the reasons you described. But I was thinking that the target impl doesn't actually work that way, so this has the side effect that it will only be used when beneficial. And at least it's kind of similar to both ReadableBuffer#readBytes(ByteBuffer), which requires the target buffer to be large enough to hold the entire source, and the Drainable interface, which has to drain the entire InputStream.

Having ProtoInputStream fall back to use ByteArrayInputStream in the case that the provided ByteBuffer isn't big enough would be worse than the current path, and this would not be obvious from the interface. WDYT?

It's also why I changed the name to drainTo; I agree read would be more appropriate if it just filled the ByteBuffer.

Member commented:

I was thinking that the target impl doesn't actually work that way so this has the side effect that it will only be used when beneficial

"Beneficial" is in the eye of the beholder. Yes, with the current code that is true. If this was in an API we could readily change, I'd agree. But since this is a public API that 1) will last a long time and 2) have multiple users I'd much rather favor the more generic method definition. I'm optimizing the API for longevity while making sure it allows optimizing the implementation.

As things change, my preference order would be:

  1. Any old code path is fast
  2. Any old code path works
  3. Any old code path fails (and this one is not actually acceptable)

Since drainTo(ByteBuffer) would require KnownLength, I'm not as fond of it. It could be useful to drain a compressed message to a ByteBuffer or some such.

Interfaces generally don't define guaranteed performance levels. List.get() may be fast or slow. We will consider performance when making changes, but at times we may need to hurt the performance of one particular method for some other goal. So I'm okay with the varying level of performance. If you care about performance then you are benchmarking, and such a regression would show up on a benchmark.

Contributor (Author) commented:

If this was in an API we could readily change, I'd agree.

How about just moving it to internal package then (alongside WritableBuffer)?

Probably this is clear, but I meant that performance-wise the only implementation of this would actually involve more copying rather than less in the case that the stream didn't fit in the buffer, i.e. worse than it is now, having the opposite of this PR's intended effect.

@njhill (Contributor, Author) commented Aug 29, 2020

Thanks @ejona86, that's interesting... I did look fleetingly at ByteOutputEncoder but disregarded it when I saw it was buffering. I see what you mean about the primitive vs. byte chunk cases though. Not based on much, but I would guess large strings are more common than large ByteStrings, and this approach should be better for those. In any case it's still at least as "zero-copy" as the inbound case... just that we could do even better for outbound with ByteStrings.

I don't think writeLazy is really related to ownership transfer as such; it just means that the buffer is immutable and so safe to hang on to for reading later.

It would be possible to get zero-copy primitives as well, but it really needs some sort of buffer allocator and then you need buffer lifetimes.

I think this could be done without buffer lifetimes with a new CodedOutputStream impl; unfortunately it can't be subclassed externally :( Otherwise, maybe a ByteOutput-based version could be used selectively based on the proportion of the message made up of large ByteStrings, or if the allocated ByteBuffer isn't large enough. Maybe I'll try to add that.
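As a rough illustration of what "used selectively based on the proportion of the message" might mean (entirely hypothetical; the threshold and names are not from the PR), one could imagine a heuristic comparing the large-ByteString portion against the total message size:

```java
// Hypothetical heuristic for choosing between encoding paths. Assumes the
// caller can cheaply compute the total serialized size and the bytes
// attributable to large ByteString fields; the 50% cutoff is arbitrary.
public final class PathChooser {
  // Returns true if large blobs dominate the message, in which case a
  // ByteOutput-style (buffer-sharing) encoder would be preferred over
  // encoding primitives directly into the output buffer.
  static boolean preferByteOutput(long totalSize, long largeBlobBytes) {
    return largeBlobBytes * 2 > totalSize; // blobs are the majority of bytes
  }

  public static void main(String[] args) {
    System.out.println(preferByteOutput(1000, 600)); // true
    System.out.println(preferByteOutput(1000, 100)); // false
  }
}
```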

it will only benefit small messages

Up to 1MiB, right? Which doesn't seem too bad. And for larger messages it should be no worse than the status quo.

```diff
@@ -54,5 +64,5 @@
    * Releases the buffer, indicating to the {@link WritableBufferAllocator} that
    * this buffer is no longer used and its resources can be reused.
    */
-  void release();
+  void close();
```
Contributor commented:

Changes in this interface and MessageFramer aren't directly related to the goal of this PR. Also, I'd vote for close() as the mirror of ReadableBuffer.

Contributor (Author) commented:

@voidzcy could you clarify which changes specifically you're talking about in each? Obviously write(InputStream) is part of it. The other two methods in this interface are there to make it compatible with OutputStream so that impls can implement both if useful, which NettyWritableBuffer now does (used here).
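The shape being described, a buffer type that doubles as an OutputStream so serializers can write to it directly, can be sketched with plain JDK types. All names here are hypothetical; this is not NettyWritableBuffer's actual code, but it shows why the int-to-byte cast and close() come along with OutputStream compatibility:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;

// Hypothetical WritableBuffer-style class that also extends OutputStream.
public final class OutputStreamBuffer extends OutputStream {
  private final ByteBuffer buf;

  OutputStreamBuffer(int capacity) {
    this.buf = ByteBuffer.allocate(capacity);
  }

  // OutputStream.write(int) takes an int, so the cast to byte is required.
  @Override public void write(int b) {
    buf.put((byte) b);
  }

  @Override public void write(byte[] b, int off, int len) {
    buf.put(b, off, len);
  }

  // close() rather than release(), mirroring OutputStream/ReadableBuffer.
  @Override public void close() {
    buf.clear();
  }

  int readableBytes() {
    return buf.position();
  }

  public static void main(String[] args) throws IOException {
    OutputStreamBuffer b = new OutputStreamBuffer(8);
    b.write('h');
    b.write(new byte[] {'i'}, 0, 1);
    System.out.println(b.readableBytes()); // prints 2
  }
}
```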

Contributor commented:

I mean changes like s/release/close/g, casting int to byte, as well as adding the len > 0 check to the existing write-to-array method in MessageFramer, are sort of excessive with regard to the goal of this PR. They are irrelevant to the things intended to be added. Also, close() would be preferable to release() as it's more abstract, and ReadableBuffer also uses close().

Contributor (Author) commented:

close() would be preferable compared to release() as it's more abstract and ReadableBuffer also uses close()

You seem to be arguing for both sides of this in the same paragraph? :)

changes like s/release/close/g, casting int to byte

I just explained these in the preceding comment; they are in fact linked to this PR.

adding the len > 0 check

Sure, I can remove this if you prefer.

```java
int writable = bytebuf.writableBytes();
int written = bytebuf.writeBytes(stream, writable);
if (written == writable && stream.available() != 0) {
  throw new BufferOverflowException();
}
```
Contributor commented:

Do we really need to check writable capacity at this level? Since the WritableBuffer is a wrapper backed by some real buffer and all the operations are delegated to the underlying buffer, we should just let the underlying buffer handle this case, and such an exception should be thrown from underneath.
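The delegation being suggested can be sketched with plain JDK types (names hypothetical, not the PR's code): the wrapper does no explicit capacity check and lets the backing buffer's own bounds checking surface the exception:

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Sketch of a thin wrapper that delegates bounds checking to the backing
// buffer: ByteBuffer.put already throws BufferOverflowException when the
// source doesn't fit, so the wrapper adds no check of its own.
public final class DelegatingBuffer {
  private final ByteBuffer buf;

  DelegatingBuffer(int capacity) {
    this.buf = ByteBuffer.allocate(capacity);
  }

  void write(byte[] src) {
    buf.put(src); // no explicit capacity check; the ByteBuffer enforces it
  }

  public static void main(String[] args) {
    DelegatingBuffer b = new DelegatingBuffer(2);
    try {
      b.write(new byte[4]); // too big for a 2-byte buffer
    } catch (BufferOverflowException e) {
      System.out.println("overflow surfaced from the backing buffer");
    }
  }
}
```

One caveat with the actual Netty snippet above: ByteBuf can grow on write, so whether the underlying buffer throws at all depends on how it was allocated, which is presumably why the explicit check exists.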
