Restore outbound queue coalescing #4093
Conversation
Force-pushed from bf8486b to e3f9b1f
server/client.go (Outdated)
@@ -302,8 +302,7 @@ type outbound struct {
 	stc chan struct{} // Stall chan we create to slow down producers on overrun, e.g. fan-in.
 }
 
-const nbPoolSizeSmall = 512 // Underlying array size of small buffer
Still think this might have been a good idea.
I can add it back in but it didn't bench any better than just 4K+64K buffers. With coalescing the benefit of smaller buffers is reduced because we can just stuff lots of messages into a single buffer.
Could you test with small payloads, 32, 64 or 128 bytes and watch RSS?
Interestingly, on NATS Core publishes there seems to be virtually no difference between two and three buffer size classes, but on JetStream publishes some of the benchmarks show better results with three. I've added the small buffer back as a result.
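For context, a tiered buffer pool like the one being discussed can be sketched as below. This is a hypothetical simplification, not the server's actual implementation: `nbPoolSizeSmall` appears in the diff above, but the 4K/64K constants and the `nbPoolGet`/`nbPoolPut` helper names are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Size classes: the small 512-byte pool from the diff above, plus the
// assumed 4K and 64K pools mentioned in the discussion.
const (
	nbPoolSizeSmall  = 512
	nbPoolSizeMedium = 4 * 1024
	nbPoolSizeLarge  = 64 * 1024
)

var pools = [...]struct {
	size int
	pool sync.Pool
}{
	{size: nbPoolSizeSmall},
	{size: nbPoolSizeMedium},
	{size: nbPoolSizeLarge},
}

// nbPoolGet returns a zero-length buffer whose capacity is the smallest
// size class that fits n bytes. Oversized requests are not pooled.
func nbPoolGet(n int) []byte {
	for i := range pools {
		if n <= pools[i].size {
			if b, ok := pools[i].pool.Get().([]byte); ok {
				return b[:0]
			}
			return make([]byte, 0, pools[i].size)
		}
	}
	return make([]byte, 0, n)
}

// nbPoolPut recycles a buffer into the pool matching its capacity;
// buffers that don't match a size class are left for the GC.
func nbPoolPut(b []byte) {
	for i := range pools {
		if cap(b) == pools[i].size {
			pools[i].pool.Put(b[:0])
			return
		}
	}
}

func main() {
	b := nbPoolGet(100)
	fmt.Println(cap(b)) // smallest class that fits 100 bytes: 512
	nbPoolPut(b)
}
```

The RSS question above follows from this design: with only 4K+64K classes, a burst of 32–128 byte messages that doesn't coalesce well can pin 4K of backing array per message.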
Force-pushed from e3f9b1f to f04336e
Originally I thought there was a race condition happening here, but it turns out it is safe after all and the race condition I was seeing was due to other problems in the WebSocket code. Signed-off-by: Neil Twigg <neil@nats.io>
Force-pushed from f04336e to 2206f9e
@@ -1447,7 +1442,8 @@ func (c *client) flushOutbound() bool {
 	// "nb" will be reset back to its starting position so it can be modified
 	// safely by queueOutbound calls.
 	c.out.wnb = append(c.out.wnb, collapsed...)
-	orig := append(net.Buffers{}, c.out.wnb...)
+	var _orig [1024][]byte
+	orig := append(_orig[:0], c.out.wnb...)
Where is this used?
L1479-1485, to return consumed buffers to the pool. The net.Buffers.WriteTo() function modifies the receiver slice by trimming sent buffers off the beginning, so we need to keep a copy of those slice references in order to recycle them afterwards.
LGTM!
This PR effectively reverts part of #4084, which removed the coalescing from the outbound queues because I initially thought it was the source of a race condition.
Further investigation has proven that not only was that untrue (the race actually came from the WebSocket code; all coalescing operations happen under the client lock), but removing the coalescing also worsened performance.
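The coalescing being restored can be sketched roughly as follows. This is a hypothetical simplification of what queueOutbound-style coalescing does, not the server's actual code; the function name and the fixed 512-byte overflow buffer are assumptions for illustration.

```go
package main

import "fmt"

// queueCoalesced appends data to the pending buffer list nb. If the tail
// buffer has spare capacity, the message is copied into it rather than
// queued as its own buffer -- this is the coalescing: many small messages
// share one backing array, and WriteTo later sends them as one chunk.
func queueCoalesced(nb [][]byte, data []byte) [][]byte {
	if n := len(nb); n > 0 {
		last := nb[n-1]
		if free := cap(last) - len(last); free > 0 {
			m := min(free, len(data))
			nb[n-1] = append(last, data[:m]...)
			data = data[m:]
		}
	}
	if len(data) > 0 {
		// Overflow: start a new fixed-size buffer (pooling omitted here).
		b := make([]byte, 0, 512)
		nb = append(nb, append(b, data...))
	}
	return nb
}

func main() {
	var nb [][]byte
	nb = queueCoalesced(nb, []byte("msg1"))
	nb = queueCoalesced(nb, []byte("msg2"))
	nb = queueCoalesced(nb, []byte("msg3"))
	// All three messages share one buffer and one eventual write.
	fmt.Println(len(nb), string(nb[0])) // 1 msg1msg2msg3
}
```

Without coalescing, each small publish occupies its own slice entry and its own pooled array, which is consistent with the benchmark regression described above.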
Signed-off-by: Neil Twigg <neil@nats.io>