Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C# MergeFrom ReadOnlySpan with no allocations #7885

Closed
MaximGurschi opened this issue Sep 11, 2020 · 11 comments
Closed

C# MergeFrom ReadOnlySpan with no allocations #7885

MaximGurschi opened this issue Sep 11, 2020 · 11 comments
Assignees
Labels
c# inactive Denotes the issue/PR has not seen activity in the last 90 days.

Comments

@MaximGurschi
Copy link

Hi,

With the latest release of Grpc.Tools, we are now able to execute code that writes the serialized message directly to Span without incurring allocations. This is based on IBufferMessage. Thank you for that release!

My questions is what is the way to achieve the reverse process? Specifically without allocating a CodedInputStream or any other heap based objects. Is there (or are the plans for) an API that can help us achieve the following:

  1. Given an instantiated (new allocated) Message - M,
  2. Execute M.MergeFrom(ReadOnlySpan)
  3. The above to complete without allocating heap memory (for things like CodedInputStream, etc.)

I have been using a custom solution to achieve the above steps but it involves subclassing Stream and a buffer copy to a preallocated buffer. If there were to be a native solution to achieve the equivalent of IMessage.MergeFrom(ReadOnlySpan) without additional allocation then that would help get rid of the workaround code.

@MaximGurschi
Copy link
Author

MaximGurschi commented Oct 6, 2020

The issue with my solution is that the CodedInputStream eventually hits the state.sizelimit, so i have to allocate a new one every ~2GB of data.

A native solution that does M.MergeFrom(ReadOnlySpan) without allocating any CodedInputStream would be great.

@jtattermusch
Copy link
Contributor

Currently there is

YourMessage.Parser.ParseFrom(ReadOnlySequence<byte>) which allows you to parse without needing a CodedInputStream (and getting a ReadOnlySequence without requiring allocations is relatively easy).

public IMessage ParseFrom(ReadOnlySequence<byte> data)

There is also a MergeFrom variant of this method, but that one is internal.

internal static void MergeFrom(this IMessage message, ReadOnlySequence<byte> data, bool discardUnknownFields, ExtensionRegistry registry)

Does that satisfy your needs?

@MaximGurschi
Copy link
Author

MaximGurschi commented Oct 27, 2020

@jtattermusch Thank you for that. Unfortunately no it does not as that API allocates the holding proto. My goal is to reuse the proto and also not allocate the CIS.

@jtattermusch
Copy link
Contributor

@MaximGurschi alright, so you'd basically need a public API for msg.MergeFrom(ReadOnlySequence data) (something along the lines of the latter internal method I posted), is that correct?

@MaximGurschi
Copy link
Author

@jtattermusch Precisely :)

@jtattermusch
Copy link
Contributor

Feel free to come up with a PR that proposes the addition of new public method then?

@Jensaarai
Copy link
Contributor

Jensaarai commented Dec 4, 2020

@jtattermusch it seems like it would still be useful to have overloads that take ReadOnlySpan, as sometimes that's what you're given. For instance, Confluent.Kafka's IDeserializer provides a ReadOnlySpan, which would then have to be copied to the heap to use the ReadOnlySequence overload :/

T Deserialize(ReadOnlySpan<byte> data, bool isNull, SerializationContext context);

Is there something I'm missing that would allow ReadOnlySpan<byte> to ReadOnlySequence<byte> conversion without allocations or copying?

@jtattermusch
Copy link
Contributor

@jtattermusch it seems like it would still be useful to have overloads that take ReadOnlySpan, as sometimes that's what you're given. For instance, Confluent.Kafka's IDeserializer provides a ReadOnlySpan, which would then have to be copied to the heap to use the ReadOnlySequence overload :/

T Deserialize(ReadOnlySpan<byte> data, bool isNull, SerializationContext context);

Is there something I'm missing that would allow ReadOnlySpan<byte> to ReadOnlySequence<byte> conversion without allocations or copying?

Adding overloads that parse directly from ReadOnlySpan<byte> is definitely feasible (and it's something you can currently achieve using internal APIs) - it's just we want to add new public APIs carefully, to ensure that the public API has small footprint, doesn't degrade after time and will allow us preserve backwards compatibility.

FTR see

internal static void Initialize(ref ReadOnlySpan<byte> buffer, ref ParserInternalState state, out ParseContext ctx)

ParseContext.Initialize(data, out ParseContext ctx);

Feel free to propose a new public API for parsing from ReadOnlySpan<> and open a pull request (note that there are some other enhancement requests around the newly added APIs - please check the github issues for them first).

@jtattermusch
Copy link
Contributor

slightly related:
#8036
#7886

Copy link

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.

This issue is labeled inactive because the last activity was over 90 days ago.

@github-actions github-actions bot added the inactive Denotes the issue/PR has not seen activity in the last 90 days. label Mar 10, 2024
Copy link

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please reopen it.

This issue was closed and archived because there has been no new activity in the 14 days since the inactive label was added.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Mar 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c# inactive Denotes the issue/PR has not seen activity in the last 90 days.
Projects
None yet
Development

No branches or pull requests

4 participants