
Collect server latencies and implement "prefer faster hosts" when creating connections #3178

Open
vlsi opened this issue Mar 24, 2024 · 11 comments


@vlsi
Member

vlsi commented Mar 24, 2024

Describe the issue

Currently, pgjdbc connects to hosts in sequential order; this becomes suboptimal when one of the hosts experiences issues.

For instance, connecting to host1 might take 20 seconds while connecting to host2 takes 0.2 seconds.
The driver could account for that and prefer connecting to host2.

Sample implementation:

  1. Collect an exponential moving average of the connect time and login time for each host
  2. When running in "load-balance" mode, use a weighted shuffle of the hosts so the faster ones are preferred (see the sketch after this list)
  3. Additionally, we could collect an exponential moving average for simple queries like "commit" and "isvalid", so we account better for "server health"
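
A minimal sketch of steps 1 and 2, assuming a hypothetical HostLatencyTracker class (not a pgjdbc API). The weighted shuffle uses Efraimidis-Spirakis keys, so a host's chance of sorting first is proportional to its weight (here, the inverse of its average connect time):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch, not pgjdbc code.
final class HostLatencyTracker {
    private static final double ALPHA = 0.2; // EMA smoothing factor (assumed value)
    private final Map<String, Double> emaConnectMillis = new ConcurrentHashMap<>();

    /** Fold a new connect-time observation into the per-host moving average. */
    void recordConnectTime(String host, double millis) {
        emaConnectMillis.merge(host, millis,
                (old, observed) -> old + ALPHA * (observed - old));
    }

    /** Weighted shuffle: each host gets a random key -ln(u)/w, then we sort ascending. */
    List<String> weightedShuffle(List<String> hosts) {
        Map<String, Double> sortKey = new HashMap<>();
        for (String host : hosts) {
            double ema = emaConnectMillis.getOrDefault(host, 1.0); // unknown hosts look fast
            double weight = 1.0 / ema; // faster hosts get larger weights
            double u = ThreadLocalRandom.current().nextDouble();
            sortKey.put(host, -Math.log(u) / weight);
        }
        List<String> shuffled = new ArrayList<>(hosts);
        shuffled.sort(Comparator.comparingDouble((String h) -> sortKey.get(h)));
        return shuffled;
    }
}
```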

Cons:

  • Since we can't "reconnect", creating more connections to the faster host might skew the workload toward it

Connection rebalance:

  • We can't migrate a connection to a different host transparently; however, we can "unconnect" by returning false from isValid
  • An alternative would be to disconnect after a certain amount of time, or after a certain amount of time spent executing queries.
    Unfortunately, "treat an idle connection as expired after some time" will not be fully transparent to the application, since the application might use custom_variable_classes for connection-level variables, which would be lost on disconnect. However, we should be fine to (optionally) disconnect when the connection is returned to the connection pool (a minimal wrapper sketch follows).
@davecramer
Member

A wrapper driver would be able to reconnect. I like the idea.

@rob-bygrave

Is this addressing an actual issue that has been observed to occur? Does this issue occur frequently? Is this expected to be put into a separate optional module/jar, or is it expected to be part of the core driver?

@vlsi
Member Author

vlsi commented Mar 25, 2024

The actual issue is that a single misbehaving replica might cause noticeable issues when creating connections.

The most pronounced case is primary switchover.

  • Imagine there are 6 replicas and the primary.
  • Imagine the connect timeout is 10 seconds (the current default). Remember that it does not include the TLS handshake + login sequence

The driver would attempt each replica in sequence until it detects the primary host, which might take 6*10 = 60 sec.
HikariCP sets the default login timeout to 30 seconds (it passes loginTimeout=30 to pgjdbc), so the connection attempt fails no matter how many times we retry (see the example below).
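
For concreteness, a sketch of such a configuration (the host names are made up; targetServerType, connectTimeout, and loginTimeout are real pgjdbc connection parameters):

```java
// Worst case: 6 replicas * connectTimeout of 10 s each = 60 s before the
// primary is reached, while HikariCP's loginTimeout=30 aborts the whole
// attempt after 30 s.
String url = "jdbc:postgresql://replica1,replica2,replica3,replica4,replica5,replica6,primary/appdb"
        + "?targetServerType=primary&connectTimeout=10&loginTimeout=30";
```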


Currently the driver caches "ok, we know this was the primary" information, which lets us connect to the primary faster. However, we do not cache "ok, we know this host times out, let's try the other hosts to see if they are any better". So I thought we could lower the priority of servers that are known to have issues (e.g. connect timeouts), as sketched below.
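
A tiny sketch of that negative caching, with a hypothetical BadHostCache class (pgjdbc's actual host-status bookkeeping is separate): remember recent connect failures and sort those hosts to the end of the candidate list.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not pgjdbc code.
final class BadHostCache {
    private static final long DECAY_MILLIS = 60_000; // forget failures after 1 min (assumed)
    private final Map<String, Long> lastFailure = new ConcurrentHashMap<>();

    void recordFailure(String host) {
        lastFailure.put(host, System.currentTimeMillis());
    }

    /** Hosts with a recent failure sort last; ties keep the configured order. */
    List<String> order(List<String> hosts) {
        long now = System.currentTimeMillis();
        List<String> ordered = new ArrayList<>(hosts);
        ordered.sort(Comparator.comparingInt(
                (String host) -> now - lastFailure.getOrDefault(host, 0L) < DECAY_MILLIS ? 1 : 0));
        return ordered;
    }
}
```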

@davecramer
Member

Is there no topology information? Most systems provide some topology somewhere?
FWIW, I still think this is better in a wrapper. This https://github.com/awslabs/aws-advanced-jdbc-wrapper is obviously targeted at RDS Aurora, but it has a plugin for Patroni (where topology info is available).

@vlsi
Member Author

vlsi commented Mar 25, 2024

The issue is that the current implementation does not handle this case well, and it does not say "please do not use it, use something else instead". So users end up with half-workable systems :-/

@rbygrave
Contributor

Imagine there are 6 replicas and the primary.

I'm not using that scenario where the driver itself performs master failover, looping to find the next master. Master failover is either via DNS or it's AWS RDS. Hmmm.

@davecramer
Member

One of the topics I actually want to bring up in Vancouver at the PostgreSQL dev conference is finding a way to put topology data in the server. Presumably it would be a hook of some kind to accommodate various technologies: Patroni being one, RDS has a few, and I presume Azure and Google have theirs.

@rbygrave
Contributor

Random thoughts ...

Imagine there are 6 replicas and the primary.

An option might be to start an ExecutorService with 6 threads and test each replica in parallel? (Something along the lines of testing the likely candidates in parallel; a rough sketch follows.)
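
A rough sketch of that, using plain JDBC and ExecutorService.invokeAny (a hypothetical helper, not pgjdbc internals):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ParallelProbe {
    /** Races connection attempts against all candidate URLs, keeps the first winner. */
    static Connection firstToConnect(List<String> jdbcUrls, String user, String pass)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(jdbcUrls.size());
        try {
            List<Callable<Connection>> attempts = jdbcUrls.stream()
                    .map(url -> (Callable<Connection>)
                            () -> DriverManager.getConnection(url, user, pass))
                    .toList();
            // invokeAny returns the first successful result and cancels the rest.
            // Caveat: an attempt that also succeeds before cancellation leaks its
            // connection; a real implementation would track and close those.
            return pool.invokeAny(attempts);
        } finally {
            pool.shutdownNow();
        }
    }
}
```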

topology data in the server

Yeah, it looks like in the Aurora case, in order to speed up master failover, it can connect to a replica instance, read the updated topology to find the new master defined there, and then connect to the new master. It would be really nice if there could be some standardisation of the topology (common pg views that drivers can read, etc.).

@davecramer
Member

topology data in the server

Yeah, it looks like in the Aurora case, in order to speed up master failover, it can connect to a replica instance, read the updated topology to find the new master defined there, and then connect to the new master. It would be really nice if there could be some standardisation of the topology (common pg views that drivers can read, etc.).

Well, not every situation is the same, so it will be hard to standardize. The AWS wrapper driver uses plugins to deal with this.

@vlsi
Member Author

vlsi commented Mar 25, 2024

An option might be to start an ExecutorService with 6 threads and test each replica in parallel? (Something along the lines of testing the likely candidates in parallel.)

Fair point.

Imagine the app starting with a connection pool of 50 connections.
50 connections * 6 threads == 300 concurrent requests :)


A side note: the default mode is loadBalanceHosts=false.
It might be hard to verify, but my guess is that the majority of users who configure several hostnames actually want loadBalanceHosts=true.

@sehrope
Member

sehrope commented Mar 25, 2024

One of the topics I actually want to bring up in Vancouver at the PostgreSQL dev conference is finding a way to put topology data in the server.

This is a really good topic, as there's only so much you can do without some server changes. And a couple of small changes would go a long way.

Imagine the app starting with a connection pool of 50 connections.
50 connections * 6 threads == 300 concurrent requests :)

This is why I like stuff like this being decoupled from the driver itself. Determining which node to connect to could be done once by a higher-level pool, which would then reuse that information for the entire set of N connections in the pool (see the sketch after this comment).

Having it done by every JDBC connection on its own produces both the thundering-herd situation you're describing and potential inconsistency in how each connection makes its determination.

This is a pretty complicated topic, and I think we should focus on providing the building blocks to build solutions on top of.
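
A minimal sketch of that pool-level approach, assuming a hypothetical CachedHostChooser helper (none of these names are pgjdbc or pool APIs): resolve the preferred host once, cache it with a TTL, and let every pooled connection reuse the same decision.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch, not a pgjdbc or HikariCP API.
final class CachedHostChooser {
    private record Entry(String host, long expiresAtNanos) {}

    private final Supplier<String> resolver; // e.g. probe hosts or read topology
    private final long ttlNanos;
    private final AtomicReference<Entry> cache = new AtomicReference<>();

    CachedHostChooser(Supplier<String> resolver, long ttlMillis) {
        this.resolver = resolver;
        this.ttlNanos = ttlMillis * 1_000_000L;
    }

    /** Returns the cached choice, re-resolving at most once per TTL window. */
    String currentHost() {
        Entry e = cache.get();
        long now = System.nanoTime();
        if (e == null || now >= e.expiresAtNanos) {
            e = new Entry(resolver.get(), now + ttlNanos);
            cache.set(e); // a racy refresh is fine: any fresh value is acceptable
        }
        return e.host();
    }
}
```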
