POC: outbound protobuf zero-copy take 2 #7369
base: master
Conversation
Variation on grpc#7350. Slightly bigger, but I think the interface changes are cleaner: it avoids "borrowing" a ByteBuffer and separately informing it of the bytes written. Not sure which approach is preferable.
Hi @njhill. I see this and the "Random acts of garbage reduction" PR open, plus v1 of this PR still marked as a draft - would you like me to find someone to review these, or is this still work-in-progress? Thanks!
@ericgribkoff yes they are all ready for review... I un-drafted the other one. Thanks!
CC @voidzcy
So there are two different types of data, and you can only trivially have zero-copy output for one type at a time: protobuf primitives and ByteStrings. This provides zero-copy for protobuf primitives, as they can be written directly to the output buffer when encoding.
But for ByteString zero-copy see CodedOutputStream.ByteOutputEncoder. That "allows" (not actually exposed) using the ByteString's buffers directly. It works in the opposite direction (similar to how Drainable is the opposite of InputStream). It would be possible to get zero-copy primitives as well, but it really needs some sort of buffer allocator and then you need buffer lifetimes. I think the allocator thing is why writeLazy exists in ByteOutput, as the "lazy" buffers are having their ownership passed to the ByteOutput. But the API is not quite enough to use Netty's buffer pool, I believe.
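As a rough stdlib-only sketch of the write-vs-writeLazy distinction described above (this models the idea only; `GatheringSink` is a made-up class, not protobuf's actual `ByteOutput`): `write` must defensively copy because the caller may still mutate the buffer, while `writeLazy` may simply retain a reference because the caller promises the buffer is immutable from then on; that retained reference is the zero-copy path.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sink illustrating the write/writeLazy ownership distinction.
final class GatheringSink {
  private final List<ByteBuffer> segments = new ArrayList<>();

  // Caller retains ownership of src, so we must copy its bytes.
  void write(ByteBuffer src) {
    ByteBuffer copy = ByteBuffer.allocate(src.remaining());
    copy.put(src.duplicate()).flip();
    segments.add(copy);
  }

  // Caller promises src is effectively immutable from now on, so we can
  // keep a read-only view instead of copying; this is the zero-copy path.
  void writeLazy(ByteBuffer src) {
    segments.add(src.asReadOnlyBuffer());
  }

  int totalBytes() {
    int n = 0;
    for (ByteBuffer b : segments) {
      n += b.remaining();
    }
    return n;
  }
}
```

A real pooled-allocator variant would additionally need buffer lifetimes, which is the gap with Netty's buffer pool mentioned above.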
This seems pretty fair, although it will only benefit small messages. Zero-copy typically becomes more important for larger messages, and then you may need to consider segmentation. We would want to see some positive impact in TransportBenchmark. I wouldn't be surprised if most of the benefit is reduced GC.
I'd sort of be interested in what would happen if we just added the `WritableBuffer.write(InputStream)` method and used heap buffers with Netty instead of direct buffers. (Direct buffers make a lot of sense for large things. For small things it is less clear.)
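The suggested `WritableBuffer.write(InputStream)` over a heap buffer could look roughly like the following (a hypothetical `HeapWritableBuffer` stand-in, not the actual grpc-java class): `InputStream.read` can fill the backing array directly, with no intermediate scratch copy.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical heap-backed writable buffer with the suggested
// write(InputStream) method: the stream reads straight into the
// backing byte[] with no intermediate scratch buffer.
final class HeapWritableBuffer {
  private final byte[] bytes;
  private int writeIndex;

  HeapWritableBuffer(int capacity) {
    this.bytes = new byte[capacity];
  }

  // Reads until the buffer is full or the stream is exhausted;
  // returns the number of bytes written.
  int write(InputStream in) throws IOException {
    int total = 0;
    while (writeIndex < bytes.length) {
      int n = in.read(bytes, writeIndex, bytes.length - writeIndex);
      if (n == -1) {
        break;
      }
      writeIndex += n;
      total += n;
    }
    return total;
  }

  int readableBytes() {
    return writeIndex;
  }
}
```

This also shows why heap buffers are attractive for small messages: the stream-to-array copy is the only copy.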
 * @param dest the destination buffer to receive the bytes.
 * @throws BufferOverflowException if destination buffer has insufficient remaining bytes
 */
int drainTo(ByteBuffer dest) throws IOException;
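For illustration, an implementation of this all-or-nothing contract over a simple `byte[]` source might look like the following (`ByteArraySource` is a made-up stand-in, not the actual `ProtoInputStream`): it throws the unchecked `BufferOverflowException` up front rather than transferring a partial message.

```java
import java.io.IOException;
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Hypothetical source implementing the proposed drainTo contract:
// either the whole remaining payload fits, or nothing is transferred.
final class ByteArraySource {
  private final byte[] bytes;
  private int position;

  ByteArraySource(byte[] bytes) {
    this.bytes = bytes;
  }

  // Transfers all remaining bytes into dest, or throws before
  // touching dest at all; returns the number of bytes drained.
  int drainTo(ByteBuffer dest) throws IOException {
    int remaining = bytes.length - position;
    if (dest.remaining() < remaining) {
      throw new BufferOverflowException();
    }
    dest.put(bytes, position, remaining);
    position = bytes.length;
    return remaining;
  }
}
```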
Instead of calling the method "drain", would it be more apt to call it "read"? That would make the logic very similar to `read(byte[] b, int off, int len)`.

I'm not wild about the BufferOverflowException and the transferring of all bytes. I see little reason to limit the API to reading all the bytes, and it means data is invalidated if an exception is thrown. While having to call read() multiple times is annoying, it is that way for a reason.
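The read-style alternative composes into a simple loop. As a sketch, using the stdlib's `ReadableByteChannel.read(ByteBuffer)` as a stand-in for the hypothetical partial-read method: a full destination buffer invalidates nothing, since the caller can consume it and keep reading.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// Sketch of the partial-read pattern: reads until the buffer is full
// or the stream is exhausted, and leaves already-read bytes valid
// even when the destination is too small for the whole stream.
final class PartialReads {
  static int readAsMuchAsFits(InputStream in, ByteBuffer dest) throws IOException {
    ReadableByteChannel ch = Channels.newChannel(in);
    int total = 0;
    while (dest.hasRemaining()) {
      int n = ch.read(dest);
      if (n == -1) {
        break; // end of stream
      }
      total += n;
    }
    return total;
  }
}
```

Calling it again with a fresh buffer picks up exactly where the previous call stopped, which is the composability argument for read-style semantics.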
I don't disagree, and this is actually what I started with for exactly the reasons you described, but I was thinking that the target impl doesn't actually work that way, so this has the side effect that it will only be used when beneficial. And at least it's kind of similar to both `ReadableBuffer#readBytes(ByteBuffer)`, which requires the target buffer to be large enough to hold the entire source, and the `Drainable` interface, which has to drain the entire InputStream.

Having `ProtoInputStream` fall back to `ByteArrayInputStream` in the case that the provided `ByteBuffer` isn't big enough would be worse than the current path, and this would not be obvious from the interface. WDYT?

It's also why I changed the name to `drainTo`; agree `read` would be more appropriate if it just filled the `ByteBuffer`.
> I was thinking that the target impl doesn't actually work that way so this has the side effect that it will only be used when beneficial
"Beneficial" is in the eye of the beholder. Yes, with the current code that is true. If this was in an API we could readily change, I'd agree. But since this is a public API that 1) will last a long time and 2) will have multiple users, I'd much rather favor the more generic method definition. I'm optimizing the API for longevity while making sure it allows optimizing the implementation.
As things change, my preference order would be:
- Any old code path is fast
- Any old code path works
- Any old code path fails (and this one is not actually acceptable)
Since `drainTo(ByteBuffer)` would require `KnownLength`, I'm not as fond of it. It could be useful to drain a compressed message to a `ByteBuffer` or some such.
Interfaces generally don't define guaranteed performance levels. List.get() may be fast or slow. We will consider performance when making changes, but at times we may need to hurt the performance of one particular method for some other goal. So I'm okay with the varying levels of performance. If you are benchmarking then you do care about performance, and such a regression would show up on a benchmark.
> If this was in an API we could readily change, I'd agree.

How about just moving it to the `internal` package then (alongside `WritableBuffer`)?
Probably this is clear, but I meant that, performance-wise, the only implementation of this would actually involve more copying rather than less in the case that the stream didn't fit in the buffer, i.e. worse than it is now, having the opposite of the intended effect of this PR.
Thanks @ejona86, that's interesting... I did look fleetingly at that. I don't think writeLazy is really related to ownership transfer as such; it just means that the buffer is immutable and so safe to hang on to for reading later.
I think this could be done without buffer lifetimes with a new CodedOutputStream impl; unfortunately it can't be subclassed externally :( Otherwise, maybe a ByteOutput-based version could be used selectively, based on the proportion of the message made up of large ByteStrings, or when the allocated ByteBuffer isn't large enough. Maybe I'll try to add that.
Up to 1MiB, right? Which doesn't seem too bad. And for larger messages it should be no worse than the status quo.
@@ -54,5 +64,5 @@
  * Releases the buffer, indicating to the {@link WritableBufferAllocator} that
  * this buffer is no longer used and its resources can be reused.
  */
- void release();
+ void close();
Changes in this interface and MessageFramer aren't directly related to the goal of this PR. Also, I'd vote for `close()` as the mirror of `ReadableBuffer`.
I mean changes like s/release/close/g, casting `int` to `byte`, as well as adding the `len > 0` check to the existing write-to-array method in MessageFramer, are sort of excessive with regard to the goal of this PR. They are irrelevant to the things intended to be added. Also, `close()` would be preferable compared to `release()` as it's more abstract and `ReadableBuffer` also uses `close()`.
> `close()` would be preferable compared to `release()` as it's more abstract and `ReadableBuffer` also uses `close()`

You seem to be arguing for both sides of this in the same paragraph? :)
> changes like s/release/close/g, casting `int` to `byte`

I just explained these in the preceding comment; they are in fact linked to this PR.
> adding the `len > 0` check

Sure, I can remove this if you prefer.
int writable = bytebuf.writableBytes();
int written = bytebuf.writeBytes(stream, writable);
if (written == writable && stream.available() != 0) {
  throw new BufferOverflowException();
Do we really need to check writable capacity at this level? Since the WritableBuffer is a wrapper backed by some real buffer, and all the operations are delegated to the underlying buffer, we should just let the underlying buffer handle this case; such an exception should be thrown from underneath.
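To sketch the delegation being suggested, here is the same idea over a plain NIO `ByteBuffer` instead of a Netty `ByteBuf` (whether Netty behaves identically depends on the buffer's max capacity settings, so this is illustrative only): `put` already throws `BufferOverflowException` from the underlying buffer, so the wrapper needs no explicit capacity check.

```java
import java.nio.ByteBuffer;

// Sketch of overflow handling delegated to the underlying buffer:
// the wrapper just writes; a too-small destination makes
// ByteBuffer.put throw BufferOverflowException on its own.
final class DelegatingWrite {
  static void writeAll(ByteBuffer dest, byte[] src) {
    dest.put(src); // overflow surfaces from the underlying buffer
  }
}
```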