New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize writing strings across multiple buffers #7663
Optimize writing strings across multiple buffers #7663
Conversation
36f0c26
to
500b722
Compare
Added to this PR is initializing the initial buffer with IBufferWriter. Before:
After:
|
03150ce
to
38c5018
Compare
Rebased to fix merge conflict @jtattermusch Could you please take a look |
{ | ||
if (state.writeBufferHelper.codedOutputStream?.InternalOutputStream != null) | ||
{ | ||
Debug.Assert(sizeHint == 0, "CodedOutputStream does not support sizeHint."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We (.NET team) use Debug.Assert to check for things that shouldn't happen, but aren't worth checking at runtime because they aren't caused by external input. I couldn't find any other uses of Debug.Assert outside of test code.
What do you want to do here:
- Remove this?
- Verify are runtime (check and throw an exception)
- Ok to leave as is
// Encoder will keep state of unwritten data. | ||
if (state.stringEncoder == null) | ||
{ | ||
state.stringEncoder = Encoding.UTF8.GetEncoder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same problem with caching the stringEncoder as in the other PR. Not seeing stringEncoder.Reset() anywhere seems suspicious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed it so that the convert loop will stop when completed
is true. That ensures there is no remaining state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be safer to just call Reset() proactively because it would make the reasoning much easier.
Like this if an exception is thrown, the internal state of encoder will be maintained. I know normally throwing would invalidate the entire write context, so this should be fine in theory, but it's hard to predict all the possible patterns of use so I think relying on this is fragile.
How much more expensive would be to call Reset() proactively each time? (sounds like it should be cheap and if done so there would be absolutely no question if the encoder could be polluted or not).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like this if an exception is thrown, the internal state of encoder will be maintained.
Is CodedOutputStream suppose to be safe to use after it throws an exception? We could put it in a try/catch and reset on an error, but it doesn't seem useful if other state (buffer content, position, written output, etc) is in invalid states.
How much more expensive would be to call Reset() proactively each time? (sounds like it should be cheap and if done so there would be absolutely no question if the encoder could be polluted or not).
I can test, but the completed
result says that the internal buffer has been emptied so Reset would never do anything.
The
completed
output parameter indicates whether all the data in the input buffer was converted and stored in the output buffer.
bdf2448
to
b4b7393
Compare
[Arguments(3)] | ||
[Arguments(4)] | ||
[Arguments(5)] | ||
public void WriteRawVarint32_WriteContextBufferWriter_FirstWrite(int encodedSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the purpose of this benchmark? it looks like this benchmark doesn't logically belong to this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found it, looks like this is related to the WriteBufferHelper.Initialize change below. Ideally would be a separate PR, but since it's already here, let's leave it.
@@ -382,6 +401,23 @@ public void WriteString_WriteContext(int encodedSize) | |||
ctx.CheckNoSpaceLeft(); | |||
} | |||
|
|||
[Benchmark] | |||
[ArgumentsSource(nameof(StringEncodedSizes))] | |||
public void WriteString_WriteContextBufferWriter_LimitBufferSize(int encodedSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit : _WriteContextBufferWriter_
-> can be just _BufferWriter_
as "WriteContext" is redundant information here?
Also below.
@@ -71,7 +72,7 @@ public static void Initialize(IBufferWriter<byte> bufferWriter, out WriteBufferH | |||
{ | |||
instance.bufferWriter = bufferWriter; | |||
instance.codedOutputStream = null; | |||
buffer = default; // TODO: initialize the initial buffer so that the first write is not via slowpath. | |||
buffer = bufferWriter.GetSpan(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: this also means that just initializing a writeContext from IBufferWriter will have some non-zero cost (e.g. potentially allocating a buffer). Not sure if there are any scenarios where users create a write context and then decide not to write anything (e.g. because of an error, or because they are actually writing an empty message) but such scenarios would get more expensive by doing this.
Of course under normal circumstances for non-empty messages, this is a good optimization. So we can do it, but let's be aware of the extra overhead for empty writes.
Btw Ideally this change would belong to a separate PR (together with the corresponding benchmark).
buffer[state.position + i] = (byte)value[i]; | ||
} | ||
state.position += length; | ||
// String doesn't fit in refreshed buffer. Write across multiple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: unfinished sentence in the comment?
if (length == value.Length) // Must be all ASCII... | ||
// String doesn't fit in the remaining buffer. | ||
// Refreshing the buffer could free up enough space to write string. | ||
WriteBufferHelper.RefreshBuffer(ref buffer, ref state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO this optimization is pretty speculative and I think maintaining the current invariant of always writing all the way to the end of the current buffer is better for now.
We've just made a big change to how the serialization works internally and the old code has always filled the current buffer to its end so I think it's better be a bit conservative, wait for things to settle down a little bit and only then change invariants things like this. I'm also not convinced that this is always going to be a performance benefit.
Looks like this optimization is pretty much independent of all the other logic in the PR, so I'd like to consider it separately (and also benchmark it separately).
@@ -47,6 +47,7 @@ internal static class WritingPrimitives | |||
{ | |||
// "Local" copy of Encoding.UTF8, for efficiency. (Yes, it makes a difference.) | |||
internal static readonly Encoding Utf8Encoding = Encoding.UTF8; | |||
private const int MaximumBytesPerUtf8Char = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds right, but it would be good to put some evidence for this in a comment. (e.g. link to the spec or e.g. a similar constant in UtfEncoding.GetMaxByteCount's implementation)
// Encoder will keep state of unwritten data. | ||
if (state.stringEncoder == null) | ||
{ | ||
state.stringEncoder = Encoding.UTF8.GetEncoder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be safer to just call Reset() proactively because it would make the reasoning much easier.
Like this if an exception is thrown, the internal state of encoder will be maintained. I know normally throwing would invalidate the entire write context, so this should be fine in theory, but it's hard to predict all the possible patterns of use so I think relying on this is fragile.
How much more expensive would be to call Reset() proactively each time? (sounds like it should be cheap and if done so there would be absolutely no question if the encoder could be polluted or not).
// Refresh the buffer with a minimum size of the maximum unicode char byte size. | ||
// A minimum buffer size is required because at least one unicode character must | ||
// be written when Encoder.Convert is called. | ||
WriteBufferHelper.RefreshBuffer(ref buffer, ref state, sizeHint: MaximumBytesPerUtf8Char); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of introducing the need for having a sizeHint in refreshbuffer for just this specific corner case.
It sounds like having less than MaximumBytesPerUtf8Char bytes left would be a pretty rare condition, so perhaps if that happens, we could write those bytes into a stackalloc'd buffer and then write the resulting bytes into the destination using WriteRawBytes(span)?
I'm going to close this request given that there was no response to the last round of comments, but feel free to open again and respond! |
Cherry picked from #7576 (comment)
Before:
After (not caching Encoder):
After (caching Encoder):