Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set Cassandra connect timeout, not just regular timeout #12903

Merged
merged 2 commits into from Oct 22, 2021

Conversation

ncabatoff
Copy link
Contributor

@ncabatoff ncabatoff commented Oct 22, 2021

It looks like Cassandra sessions have both a "connect timeout" that governs timeouts when initiating connections, and a regular timeout used post-initialization. Our connection producer also has a "connect timeout", but only uses it for setting the latter. I think we should use it for both timeouts.

I think this will help our recurring test failures: from what I can see in the Cassandra code, if you don't set a connect timeout explicitly, it will use 600ms as the default, even if you have a regular timeout configured. Consider this set of test runs from https://app.circleci.com/pipelines/github/hashicorp/vault/22034/workflows/f814993d-7326-4ab5-8ced-a088ab2087f2/jobs/313877/parallel-runs/2

PASS plugins/database/cassandra.TestSelfSignedCA/pem_json/bad_ca (0.13s)
PASS plugins/database/cassandra.TestSelfSignedCA/pem_json/missing_ca (0.01s)
PASS plugins/database/cassandra.TestSelfSignedCA/no_cert_data/tls=true (0.01s)
=== RUN   TestSelfSignedCA/no_cert_data/insecure_tls
2021/10/19 20:41:20 gocql: unable to dial control conn 172.18.0.3:9042: gocql: no response to connection startup within timeout
    connection_producer_test.go:170: no error expected, got: failed to initialize: error verifying connection: error creating session: gocql: unable to create session: control: unable to connect to initial hosts: gocql: no response to connection startup within timeout
    --- FAIL: TestSelfSignedCA/no_cert_data/insecure_tls (0.76s)
FAIL plugins/database/cassandra.TestSelfSignedCA/no_cert_data/insecure_tls (0.76s)
=== RUN   TestSelfSignedCA/pem_json/ca_only
2021/10/19 20:41:21 gocql: unable to dial control conn 172.18.0.3:9042: gocql: no response to connection startup within timeout
    connection_producer_test.go:170: no error expected, got: failed to initialize: error verifying connection: error creating session: gocql: unable to create session: control: unable to connect to initial hosts: gocql: no response to connection startup within timeout
    --- FAIL: TestSelfSignedCA/pem_json/ca_only (0.77s)
FAIL plugins/database/cassandra.TestSelfSignedCA/pem_json/ca_only (0.77s)
PASS plugins/database/cassandra.TestSelfSignedCA/pem_bundle/ca_only (2.12s)

These tests aren't using t.Parallel, so I believe this is all happening in sequence, all against the same Cassandra docker instance. We have failures bracketed by successes, suggesting that the container came up fine, we just weren't always able to connect to it within 600ms.

…overns timeouts when initiating connections, and a regular timeout used post-initialization. Our connection producer also has a "connect timeout", but only uses it for setting the latter. I think we should use it for both timeouts.
@ncabatoff ncabatoff requested a review from a team October 22, 2021 14:17
@@ -189,6 +189,7 @@ func (c *cassandraConnectionProducer) createSession(ctx context.Context) (*gocql
}

clusterConfig.Timeout = c.connectTimeout
clusterConfig.ConnectTimeout = c.connectTimeout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me. I see that ClusterConfig.Timeout and ClusterConfig.ConnectTimeout are set to the same value in NewCluster().

I suppose we could easily decouple this in the future.

@vercel vercel bot temporarily deployed to Preview – vault-storybook October 22, 2021 14:34 Inactive
@ncabatoff ncabatoff merged commit 8f7dafe into main Oct 22, 2021
@ncabatoff ncabatoff deleted the fix-cassandra-test-timeouts branch October 22, 2021 15:02
qk4l pushed a commit to qk4l/vault that referenced this pull request Feb 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants