
Outbound queue improvements #4084

Merged: 3 commits merged into main from neil/outboundqueuesfixes on Apr 21, 2023

Conversation

@neilalexander (Member) commented Apr 21, 2023

This extends the previous work in #3733 with the following:

  1. Remove buffer coalescing, as it could result in a race condition during the writev syscall in rare circumstances.
  2. Add a third buffer size, so that without coalescing we aren't allocating more than we need for mid-sized payloads.
  3. Refactor buffer handling in the WebSocket code to reduce allocations and to ensure owned buffers aren't incorrectly returned to the pool, which would result in further race conditions (see the ownership sketch below).

Fixes nats-io/nats.ws#194.
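To illustrate the ownership concern behind point 3, here is a generic sketch. The `queueFrame`/`flusher` split and the `getBuf`/`putBuf` helpers are illustrative stand-ins (sketched with sync.Pool further down this thread), not the server's actual WebSocket code: the idea is simply that a pooled buffer may only be returned to the pool by whichever goroutine holds the last reference to it.

```go
// Illustrative only: getBuf/putBuf stand in for sync.Pool-backed helpers
// (see the pool sketches later in this thread).
func queueFrame(out chan<- []byte, payload []byte) {
	buf := append(getBuf(len(payload)), payload...) // buf is backed by a pooled array
	out <- buf
	// Do NOT putBuf(buf) here: the flusher goroutine still references it.
	// Pooling a buffer that someone else still owns lets the pool hand the
	// same memory to a new writer mid-flush, which is a data race.
}

func flusher(w io.Writer, out <-chan []byte) {
	for buf := range out {
		w.Write(buf) // error handling elided in this sketch
		putBuf(buf)  // safe: the flusher held the last reference
	}
}
```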

Signed-off-by: Neil Twigg <neil@nats.io>

@neilalexander requested a review from a team as a code owner, April 21, 2023 14:34
@derekcollison (Member) left a comment:

Have we tested against the setup from yesterday?

```go
func nbPoolPut(b []byte) {
	switch cap(b) {
	case nbPoolSizeSmall:
		b := (*[nbPoolSizeSmall]byte)(b[0:nbPoolSizeSmall])
		nbPoolSmall.Put(b)
	case nbPoolSizeMedium:
		b := (*[nbPoolSizeMedium]byte)(b[0:nbPoolSizeMedium])
```
@derekcollison (Member) commented on the diff:

Why do we have these?

@neilalexander (Member, Author) replied:

Without the coalescing, we have to allocate more buffers while traffic is in flight, so I added an extra pool size so that there's a middle ground between small and large messages.
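As a rough sketch of what a middle pool tier buys (the sizes and names below are assumptions for illustration, not the server's actual nbPool constants): without coalescing, a mid-sized payload would otherwise have to take either many small buffers or one needlessly large one.

```go
package bufpool // hypothetical package for the sketch

import "sync"

// Assumed sizes for illustration only.
const (
	poolSizeSmall  = 512
	poolSizeMedium = 4 * 1024
	poolSizeLarge  = 64 * 1024
)

// Buffers are pooled as pointers to fixed-size arrays (see the later comment
// in this thread for why).
var (
	poolSmall  = sync.Pool{New: func() any { return &[poolSizeSmall]byte{} }}
	poolMedium = sync.Pool{New: func() any { return &[poolSizeMedium]byte{} }}
	poolLarge  = sync.Pool{New: func() any { return &[poolSizeLarge]byte{} }}
)

// getBuf returns an empty slice backed by the smallest pooled array that can
// hold sz bytes; callers append into it and later hand it back with putBuf.
func getBuf(sz int) []byte {
	switch {
	case sz <= poolSizeSmall:
		return poolSmall.Get().(*[poolSizeSmall]byte)[:0]
	case sz <= poolSizeMedium:
		return poolMedium.Get().(*[poolSizeMedium]byte)[:0]
	default:
		return poolLarge.Get().(*[poolSizeLarge]byte)[:0]
	}
}
```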

@derekcollison (Member) replied:

I meant the casting before placing back into the pool. We cast when we pull it out, so it seems we might not need to cast again here?

@neilalexander (Member, Author) replied:

Ah, the code was doing this before too. The reason is that sync.Pool is most efficient when the contents are pointerised; otherwise it creates lots of copies of the slice header on the heap as they get re-boxed passing through the interface{} boundary.
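A minimal sketch of that point, continuing the illustrative pool names from the earlier sketch (not the server's actual nbPool code): a []byte stored directly in a sync.Pool has its three-word slice header boxed into an interface value on every Put, which allocates; a *[N]byte fits in the interface's single data word, so nothing extra escapes to the heap.

```go
// putBuf returns a buffer's backing array to the matching pool. Converting
// the slice back to a pointer-to-array (a Go 1.17+ conversion) keeps the
// pooled value pointer-shaped, so Put does not heap-allocate a copy of the
// slice header the way Put(b) with a plain []byte would.
func putBuf(b []byte) {
	switch cap(b) {
	case poolSizeSmall:
		poolSmall.Put((*[poolSizeSmall]byte)(b[0:poolSizeSmall]))
	case poolSizeMedium:
		poolMedium.Put((*[poolSizeMedium]byte)(b[0:poolSizeMedium]))
	case poolSizeLarge:
		poolLarge.Put((*[poolSizeLarge]byte)(b[0:poolSizeLarge]))
	default:
		// Not one of our pooled sizes: let the GC reclaim it.
	}
}
```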

@delaneyj commented:

Testing now

@delaneyj commented:

The previous race issue in Go and the WebSocket parse/connection issues were not found when testing the live app! :shipit:

PS: the browser app is now MUCH faster on the full 2.9.16.

@derekcollison self-requested a review, April 21, 2023 16:58
@derekcollison (Member) left a comment:

LGTM!

@derekcollison merged commit 01041ca into main, Apr 21, 2023
2 checks passed
@derekcollison deleted the neil/outboundqueuesfixes branch, April 21, 2023 17:06
neilalexander added a commit that referenced this pull request Apr 25, 2023
This PR effectively reverts part of #4084, which removed the coalescing
from the outbound queues, as I initially thought it was the source of a
race condition.

Further investigation has proven that not only was that untrue (the race
actually came from the WebSocket code; all coalescing operations happen
under the client lock), but removing the coalescing also worsens
performance.

Signed-off-by: Neil Twigg <neil@nats.io>
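For context on what is being reinstated, here is a minimal sketch of tail-buffer coalescing in an outbound queue. The field names, the `getBuf` helper, and the lock discipline described in the comments are assumptions for illustration, not the server's actual code:

```go
// outbound holds pending writes; nb is ultimately flushed as a net.Buffers
// (writev), so fewer, fuller buffers mean fewer iovecs per syscall.
type outbound struct {
	nb [][]byte
}

// queueOutbound appends data to the pending queue, filling any spare
// capacity in the tail buffer before starting a new pooled one. The caller
// is assumed to hold the client lock, so nb is never mutated concurrently.
func (o *outbound) queueOutbound(data []byte) {
	for len(data) > 0 {
		// Coalesce into the tail buffer if it still has room.
		if n := len(o.nb); n > 0 && len(o.nb[n-1]) < cap(o.nb[n-1]) {
			tail := o.nb[n-1]
			c := copy(tail[len(tail):cap(tail)], data)
			o.nb[n-1] = tail[:len(tail)+c]
			data = data[c:]
			continue
		}
		// Otherwise start a new buffer from the pool (getBuf as sketched above).
		buf := getBuf(len(data))
		take := len(data)
		if take > cap(buf) {
			take = cap(buf)
		}
		buf = append(buf, data[:take]...)
		o.nb = append(o.nb, buf)
		data = data[take:]
	}
}
```

Coalescing like this is what keeps the buffer count, and hence the allocation and writev overhead, down, which is why removing it worsened performance.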
Development

Successfully merging this pull request may close these issues.

reader closed during delivery tests
3 participants