Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C# ByteString, CodedInputStream performance improvements desired for messages with large byte array field #4206

Closed
d79ima opened this issue Jan 21, 2018 · 17 comments · Fixed by #7645
Labels

Comments

@d79ima
Copy link

d79ima commented Jan 21, 2018

Currently if you have a message with an array of bytes field that translates into a ByteString immutable object.
ByteString can't be created without incurring a GC memory allocation and an array copy.
When you deserialize a ByteString using CodedInputStream, the same penalty is paid.
Furthermore, when you then go to consume the bytes you received you pay this same penalty yet again because of ByteString.ToByteArray().

In my application these penalties are adding up to excessive GC pressure and memory copies which is reducing my message throughput substantially. This leaves me some unatractive options for my .NET C# application, either use C++ protobuffs communication layer or send my large byte array messages natively without using protobuffs.

It would be really nice if you allowed an option to have more fine grained control over the "bytes" field such that the code generator would spit out just a native byte[] type field or a new class (i.e. UnsafeByteString) which bypasses the above mentioned penalties and user takes on full responsibility for the fact that it is a mutable unsafe byte array.
CodedInputStream also needs to provide an option for user to inject his own byte array allocator so that user application can have fine grained control over byte array memory allocations to minimize GC penalties.

I think this is really important to allow for truly high performance C# protobuf apps.

@jtattermusch
Copy link
Contributor

Somewhat related gRPC issue: grpc/grpc#14378

@wanton7
Copy link

wanton7 commented May 23, 2018

I would want to have support for unsafe ByteString that would support setting it's length as well. When sending chunked data, last chunk is usually smaller.

@d79ima are you are aware that if you allocate 85000 bytes or more in .NET Framework allocation goes directly to LOH (Large Object Heap) and skips GC generations? It will add lot of GC pressure and LOH can only be compacted manually. So if you want to send binary data it should done in less than 85000 byte chunks.

@dpsenner
Copy link

I'm seconding this wish for a garbage free api.

@smflorentino
Copy link

+1, we are observing the same problem. How do we move forward on this?

@jtattermusch
Copy link
Contributor

#5888

@drew-512
Copy link

drew-512 commented Jun 26, 2019

A simple enhancement that would address most of this issue is if the C# gpc library had a way to pool no longer needed ByteString (like Go's sync.Pool) and so then could be internally reused. This would prevent the crazy firehose of ByteStrings constantly being created that the GC has to chase.

If I were implementing it, I'd make 4-8 size ByteStrings pools/buckets internally so the internal faux-allocator chooses from a pool such that waste isn't excessive. And there could be a manual polling call to drain them or a periodic task that slowly drains 1-5% of each bucket every 10-60 secs. That way no timestamps/tracking is needed, it covers the lion's share of cases, and there are no freak/worst-case scenarios. That extension could be added and tested in a couple days by one person. I'd volunteer to do if someone here told me there was a good chance of the enhancement being picked up.

Please help us Protobuf heroes! We absolutely love protobufs and we're astonished they're not used more. I do think that will change as interoperability is recognized as more important.

Drew

@jtattermusch
Copy link
Contributor

I believe the way forward is implementing #5888 (which allows parsing messages directory from a ReadOnlySequence).

For avoiding excessive allocations in ByteString, that's a separate problem and we'd need to come up with a separate design. The main problem of using array pools with ByteString is that ByteString wasn't designed to be disposable, so there's no way of expressing that the rented array can be returned to the pool.

@drew-512
Copy link

drew-512 commented Jun 28, 2019

Hi!

Maybe I'm not following correctly, but my vision was that a gpb consumer would explicitly drop off ByteStrings that are known to be no longer used (Return() in C# parlance). For example, we are processing a firehose of messages all the time loaded w/ ByteStrings, but after we unmarshal them, we would return those (so grpc/gpb could reuse them). Sure, that reuse will waste some here and there (as an alloc request needs 800 and the bucket it draws from the 1024 bucket), but w/ that bucket approach, most waste is addressed and the GC workload would vanish. It's a totally worthy trade imo to drop the GC workload in exchange for ByteStrings wasting part of their allocation. Basically, if we all agree a ByteString is basically a temp/scrap object, then lets treat them that way and assume ppl will extract their data and Return them to the pool if they care about performance.

Sure, I love the idea of 5888, but it seems to be a larger bolder ask of development, whereas adding a ByteString Return() call that funnels out into 6-12 ArrayPool buckets seems doable, no-risk, and poses no burden on people who don't use the enhancement.

Hopefully I'm making sense or maybe I'm missing what you're saying.

@dpsenner
Copy link

Would something similar to Stream.Read(byte[], int, int) work?

https://docs.microsoft.com/en-us/dotnet/api/system.io.stream.read

@jtattermusch
Copy link
Contributor

I'm having trouble imagining some API additions for ByteString that would make it possible to manage them more efficiently:

  • the most efficient future way of parsing protobufs is going to be from "ReadOnlySequence" (as mentioned above), but ReadOnlySequence is an ephemeral data source and in many implementations, it will become invalid once parsing the message is done (so we cannot remember the "reference" for the input data and make ByteString read from there).
  • currently ByteString is not disposable, so renting an array from a pool and returning it back once not needed is problematic, because it's not clear when to return the rented array from the pool and also messages are recursive and they don't have a concept of disposing (and traversing messages to dispose all the byteStrings seems like a very clumsy approach).

Because ReadOnlySequence is ephemeral, the only approach I could see is somehow making ByteString parsing customizable (an advanced API) and optionally make ByteString only remember a Slice of the input ReadOnlySequence (which will become invalid as soon as the input ROS becomes invalid). After that, the user would need to make sure to extract the data from byteString themselves before the input ROS is invalidated. This could be useful in some high-performance scenarios, but it's sound like an api that's very specialized and hard to use, so I'm not a fan.

@JamesNK
Copy link
Contributor

JamesNK commented Dec 19, 2019

I think there are two scenarios for creating ByteString:

  1. Deserialization via CodedInputStream/CodedInputReader. The ByteString is being created from a byte[]/ReadOnlySequence<byte> buffer. Because the buffer is owned by someone else, e.g. web server, and it is likely to be reused, we have to create a copy of the data inside the ByteString. The best way to avoid allocations here is using an array pool, but there is no way to track message lifetime. A possible way to solve this problem is to provide an array lifetime management API to the CodedInputXXX. It will handle providing arrays used by ByteStrings it creates out of a pool. Those arrays are added to a list. At the end of its lifetime scope, e.g. a web request, the lifetime management API will release all the arrays in the list.

  2. Manually creating the ByteString for serialization via CodedOutputStream/CodedOutputWriter. In this situation the developer owns the byte array, and can choose how to create the ByteString. To avoid allocations/copies I think there should be a new way to create ByteString: ByteString.FromMemory(ReadOnlyMemory<byte> bytes). This will simply set an internal ROM<byte> field. There are existing internal methods like this on ByteString, although they are internal:

/// <summary>
/// Unsafe operations that can cause IO Failure and/or other catastrophic side-effects.
/// </summary>
internal static class Unsafe
{
/// <summary>
/// Constructs a new ByteString from the given byte array. The array is
/// *not* copied, and must not be modified after this constructor is called.
/// </summary>
internal static ByteString FromBytes(byte[] bytes)
{
return new ByteString(bytes);
}
}
/// <summary>
/// Internal use only. Ensure that the provided array is not mutated and belongs to this instance.
/// </summary>
internal static ByteString AttachBytes(byte[] bytes)
{
return new ByteString(bytes);
}

To emphasis its unsafe nature, the name could alternatively be ByteString.Unsafe.FromMemory(ReadOnlyMemory<byte> bytes).

Adding this new way of creation would sacrifice the current safety ByteString has of it always being immutable. I don't know if this is important or not. If it is, perhaps ByteString could have a flag on it to identify whether it is an "unsafe" or not?


#1 complex to do.
#2 is pretty simple.

@davidfowl @JunTaoLuo @shirhatti

@davidfowl
Copy link

It's it possible to codegen a different type besides ByteString for the unsafe "optimized" variant?

@kbegiedza
Copy link

Is there any progress / workaround for this issue?

I belive that, as @JamesNK mentioned, new public Unsafe method should solve this problem.

cc @jtattermusch

@JamesNK
Copy link
Contributor

JamesNK commented Oct 6, 2020

I wrote a PR that allows a ByteString to be created without copying the data. It matches what is available in Java - #7645

No progress on merging it right now.

@Alois-xx
Copy link

Alois-xx commented Nov 2, 2020

When will the change arrive? For gRPC messages with large payloads (read images) a GC friendly design is a must.

@kbegiedza
Copy link

@Alois-xx still no ETA, you can check related PR: #7645

@tactical-drone
Copy link

We need this please.

codenjwu pushed a commit to codenjwu/EventDrivenCapture that referenced this issue Feb 28, 2021
…improvement for fixing memiry spikes, looks this is a known issue for bytestring. bytesring is not designed for disposeble, refer to: protocolbuffers/protobuf#4206
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.