Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADDED] Multiple routes and ability to have per-account routes #4001

Merged
merged 6 commits into from Apr 3, 2023

Conversation

kozlovic
Copy link
Member

New configuration fields:

cluster {
   ...
   pool_size: 5
   accounts: ["A", "B"]
}

The configuration pool_size in the example above means that this
server will create 5 routes to a remote server, assuming that that
server has the same pool_size setting.

Accounts (which are not part of the accounts[] configuration)
are assigned a specific route in this pool, and this will be the
same route on all servers in the cluster.

Accounts that are defined in the accounts field will each have
a dedicated route connection. This will allow suppression of the
account name in some of the route protocols, reducing bytes transmitted
which may increase performance.

Signed-off-by: Ivan Kozlovic ivan@synadia.com

@kozlovic kozlovic requested a review from a team as a code owner March 29, 2023 21:52
Copy link
Member

@derekcollison derekcollison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good! Few minor comments. Thanks again!

server/events_test.go Outdated Show resolved Hide resolved
server/jetstream.go Show resolved Hide resolved
server/opts.go Outdated Show resolved Hide resolved
server/route.go Outdated Show resolved Hide resolved
server/route.go Show resolved Hide resolved
server/route.go Outdated Show resolved Hide resolved
server/route.go Show resolved Hide resolved
server/server.go Outdated Show resolved Hide resolved
server/server.go Outdated Show resolved Hide resolved
server/server.go Outdated Show resolved Hide resolved
@kozlovic kozlovic force-pushed the route_pooling_and_per_account branch from 5ae156b to b907112 Compare March 30, 2023 15:43
@kozlovic
Copy link
Member Author

@derekcollison I have updated the PR with 2 commits, one addressing your comments, the second fixes some tests that missed some cleanup code that caused possibly other tests to fail.

However, I find that the following test is often failing (even on my machine, macOS bare metal):

=== RUN   TestNoRaceJetStreamSuperClusterStreamMoveLongRTT
    norace_test.go:5588: require no error, but got: context deadline exceeded
--- FAIL: TestNoRaceJetStreamSuperClusterStreamMoveLongRTT (23.24s)

This is true even in dev and main branches so I am confident that this is unrelated to changes in this PR.

More worrisome, is this test:

=== RUN   TestNoRaceJetStreamInterestStreamCheckInterestRaceBug
    norace_test.go:7043: Ackfloor not correct yet
--- FAIL: TestNoRaceJetStreamInterestStreamCheckInterestRaceBug (24.89s)

It passes 10/10 on main but fails 10/10 on dev (so again, unrelated to this PR). I can't get the build green because of these 2.

@derekcollison
Copy link
Member

@derekcollison I have updated the PR with 2 commits, one addressing your comments, the second fixes some tests that missed some cleanup code that caused possibly other tests to fail.

However, I find that the following test is often failing (even on my machine, macOS bare metal):

=== RUN   TestNoRaceJetStreamSuperClusterStreamMoveLongRTT
    norace_test.go:5588: require no error, but got: context deadline exceeded
--- FAIL: TestNoRaceJetStreamSuperClusterStreamMoveLongRTT (23.24s)

This is true even in dev and main branches so I am confident that this is unrelated to changes in this PR.

More worrisome, is this test:

=== RUN   TestNoRaceJetStreamInterestStreamCheckInterestRaceBug
    norace_test.go:7043: Ackfloor not correct yet
--- FAIL: TestNoRaceJetStreamInterestStreamCheckInterestRaceBug (24.89s)

It passes 10/10 on main but fails 10/10 on dev (so again, unrelated to this PR). I can't get the build green because of these 2.

I am working on the RaceBug test and why it is failing on dev, I am close to having a PR so do not worry about that one.
The other one is timing, there was a bug that would replay leadership xfers after restart that caused tests to have better timing then they should.

Copy link
Member

@derekcollison derekcollison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

For the test fixes, possibly to do those on main and we can inherit them or do separate PR for them on main?

@kozlovic
Copy link
Member Author

For the tests, done in: #4005

@kozlovic kozlovic force-pushed the route_pooling_and_per_account branch from e4d78a0 to bf0df0d Compare March 30, 2023 21:48
@kozlovic
Copy link
Member Author

@derekcollison I cherry-picked the tests fixes an submitted the PR to main, which was approved by @wallyqs and merged into main, then into dev and I have rebased this PR out of dev, which dropped 3 commits that were now upstream.

@derekcollison
Copy link
Member

@kozlovic could you fix conflicts? Apologies was most likely from my fix to dev branch just now.

As soon as its cleared will merge.

New configuration fields:
```
cluster {
   ...
   pool_size: 5
   accounts: ["A", "B"]
}
```

The configuration `pool_size` in the example above means that this
server will create 5 routes to a remote server, assuming that that
server has the same `pool_size` setting.

Accounts (which are not part of the `accounts[]` configuration)
are assigned a specific route in this pool, and this will be the
same route on all servers in the cluster.

Accounts that are defined in the `accounts` field will each have
a dedicated route connection. This will allow suppression of the
account name in some of the route protocols, reducing bytes transmitted
which may increase performance.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
One should not access s.opts directly but instead use s.getOpts().
Also, server lock needs to be released when performing an account
lookup (since this may result in server lock being acquired).
A function was calling s.LookupAccount under the client lock, which
technically creates a lock inversion situation.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Issues could manifest with subscription interest not properly
propagated.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Because of the lack of `defer ac.Cleanup()` in some tests, the
accounts would still try to send conn updates, which was possibly
causing data races with some of the tests that change the
eventsHBInterval global variable.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
@kozlovic kozlovic force-pushed the route_pooling_and_per_account branch from bf0df0d to 83c5c01 Compare April 3, 2023 15:32
@kozlovic
Copy link
Member Author

kozlovic commented Apr 3, 2023

@derekcollison I have rebased and fixed conflicts and run on Travis is green.

@derekcollison derekcollison merged commit 1ae51b2 into dev Apr 3, 2023
2 checks passed
@derekcollison derekcollison deleted the route_pooling_and_per_account branch April 3, 2023 22:33
kozlovic added a commit that referenced this pull request May 8, 2023
This is a fix for PR #4001.
If a server has an s2_auto configuration, the compression level
needs to be updated based on the RTT, however, this should not
happen if a particular route is actually not using compression,
either because it is a connection to an older server or the other
side has explicitly configure compression to be "off".

Extended a test that would have caught this issue.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
kozlovic added a commit that referenced this pull request May 19, 2023
This is a rework of incorrect changes made in PR #4001.
This affects only the `dev` branch.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
kozlovic added a commit that referenced this pull request May 19, 2023
This is a rework of incorrect changes made in PR #4001.
This affects only the `dev` branch.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
derekcollison added a commit that referenced this pull request May 19, 2023
This is a rework of incorrect changes made in PR #4001. Changes did not
work for changes to export permissions. Test was modified to add export
changes.

This affects only the `dev` branch.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants