Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C#, add ByteString.FromStream(MemoryStream stream) overload, to avoid allocating twice #7191

Closed
frabe1579 opened this issue Feb 8, 2020 · 9 comments
Labels

Comments

@frabe1579
Copy link

Sometimes data is constructed from a MemoryStream, but calling ByteString.FromStream passing that stream allocates a new memory stream. We know that we cannot directly attach a byte array to a ByteString to preserve data isolation and avoid external mutations, but maybe we can create a FromStream overload that accepts a MemoryStream and attach the result of ToArray directly to the ByteStream, like this:
public static ByteString FromStream(MemoryStream stream) { ProtoPreconditions.CheckNotNull(stream, "stream"); return AttachBytes(stream.ToArray()); }
This preserves data isolation and avoids external mutations. As from documentation, MemoryStream.ToArray always returns a new array.

@jtattermusch
Copy link
Contributor

Can you point out the code locations/situations where ByteString is constructed from a memory stream the way you describe? I do know that we do this for tests (where it's completely fine), but I'm not sure if we do it in places where it matters.

@frabe1579
Copy link
Author

I mean in user code, not in source code of this repo.
Consider this example:
message DocumentData { int32 id = 1; bytes pdfData = 2; }
When you want to send that message, you have to create a ByteString and assign it to pdfData.
In a real scenario I create a small pdf at runtime, I write it to a MemoryStream, and then I create a ByteString using FromStream static method, with that memory stream as argument. Finally I assign the new ByteString to property pdfData. ByteString.FromStream uses another MemoryStream, (doubling the allocation of memory.

Another similar problem is with Guid struct mapped to bytes, where Guid.ToByteArray() is used to extract data, which is then passed to ByteArray.CopyFrom, but this will be eventually another issue.

@anon17
Copy link

anon17 commented Feb 13, 2020

Or add an overload to read a stream in chunks:

public static ByteString FromStream(Stream stream, int size)
{
    byte[] buffer = new byte[size];
    int r = stream.Read(buffer, 0, buffer.Length);
    return AttachBytes(buffer);
}

This can be used with any stream.

@anon17
Copy link

anon17 commented Feb 13, 2020

If in that scenario the pdf file becomes bigger than the maximum message size, he will need to split it anyway.

@frabe1579
Copy link
Author

@anon17 I think your code can be a valid solution.

@tactical-drone
Copy link

All we want in C# is: ByteString.From(ReadOnlyMemory<T>) that does not copy.

Can someone explain to me why this cannot be done?

@jtattermusch
Copy link
Contributor

Recently #7645 has added support for creating ByteString from ReadOnlyMemory, but note that this API is considered "unsafe" and needs to used with care (details are in that PR's thread). That said, it seems that it's solving the issue you're describing.
(you can take the buffer for your memory stream and create an ByteString instance from it directly).

@tactical-drone
Copy link

In don't understand.

This function was added as internal? We are not protobuf developers?

I am very confused.

/// <summary>
/// Internal use only. Ensure that the provided memory is not mutated and belongs to this instance.
/// </summary>
internal static ByteString AttachBytes(ReadOnlyMemory<byte> bytes)
{
    return new ByteString(bytes);
}

@tactical-drone
Copy link

Wait, I did not see UnsafeByteOperations.UnsafeWrap

Everything I need is there.

Good job men!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants