Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pool imbalance when using DNS for contact points #1576

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

mpenick
Copy link
Contributor

@mpenick mpenick commented Aug 17, 2021

When a hostname is used for contact points it's resolved initially as
part of the initialization process then HostInfo objects are created
with the original hostname and the resolved connectAddress.
hostname is then used to dial the pool connections when causes another
DNS resolve which could result in a different IP then the original
connectAddress because A records can change order for each resolve.
This results in a connection pool for a given IP address containing
connections to multiple different IP addresses.

This patch removes the second resolve when dialing by setting the
hostname member to the resolved IP in the initialization step.

Resolves #1575

When a hostname is used for contact points it's resolved initially as
part of the initialization process then `HostInfo` objects are created
with the original `hostname` and the resolved `connectAddress`.
`hostname` is then used to dial the pool connections when causes another
DNS resolve which could result in a different IP then the original
`connectAddress` because A records can change order for each resolve.
This results in a connection pool for a given IP address containing
connections to multiple different IP addresses.

This patch removes the second resolve when dialing by setting the
`hostname` member to the resolved IP in the initialization step.

Resolves gocql#1575
@martin-sucha
Copy link
Contributor

Looks like the fix for this won't be as simple as this one line change.
We currently don't have tests for TLS connections, but this change could potentially break TLS if we don't store the hostname. Hostname is used for SNI, see martin-sucha@43a8f9b (although I'm not sure if there are users that actually use this).

@mpenick
Copy link
Contributor Author

mpenick commented Aug 18, 2021

Thanks for the heads up. I can take a look today at a fix that doesn't break TLS.

The original hostname used by the contact point will be used to verify
the CN/SAN when using TLS.
@mpenick
Copy link
Contributor Author

mpenick commented Aug 19, 2021

Pushed up another fix that keeps the original hostname used in the contact points. It should be noted that hosts discovered through the peers table won't have valid hostnames because that table only contains IP addresses so they cannot be verified using a hostname in the CN/SANs. To support the case where a contact point contains a hostame (to be verified) it would mean certificates would need to contain a valid CN/SAN for the hostname and a host specific IP address in another SAN.

@@ -254,14 +253,9 @@ func (s *Session) dialWithoutObserver(ctx context.Context, host *HostInfo, cfg *
// be modified after being used.
tlsConfig := cfg.tlsConfig
if !tlsConfig.InsecureSkipVerify && tlsConfig.ServerName == "" {
colonPos := strings.LastIndex(addr, ":")
if colonPos == -1 {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check might not have been necessary because addr := host.HostnameAndPort() would have always added the :.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Resolving DNS when connecting pool connections can lead to connection imbalances
2 participants