[receiver/redisreceiver] Flaky cluster integration test #30411

Open
hughesjj opened this issue Jan 11, 2024 · 4 comments
Labels
bug Something isn't working flaky test a test is flaky needs triage New item requiring triage receiver/redis Redis related issues Stale

Comments

@hughesjj
Contributor

hughesjj commented Jan 11, 2024

Component(s)

receiver/redisreceiver

What happened?

This test fails intermittently; most (?) of the time it passes.

Context

Cluster role assignment is non-deterministic, and roles can be reassigned after initialization. We use socat to port-forward to some replica node, so there's a chance that the node stops being a replica after we grab its port.
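One way to narrow that race window is to confirm the node's role has settled before forwarding to it. The sketch below is only an illustration, not what the test currently does: `parseRole` reads the `role:` field that Redis reports in `INFO replication`, and `waitForStableRole` is a hypothetical helper that requires the expected role on several consecutive polls. The `fetch` callback stands in for whatever client call the test would use, and the canned responses in `main` are made up.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
	"time"
)

// parseRole extracts the "role" field from the text returned by
// Redis `INFO replication` (the line looks like "role:slave").
func parseRole(info string) (string, error) {
	sc := bufio.NewScanner(strings.NewReader(info))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if role, ok := strings.CutPrefix(line, "role:"); ok {
			return role, nil
		}
	}
	return "", errors.New("no role field in INFO output")
}

// waitForStableRole (hypothetical helper) polls fetch until it reports
// the wanted role on `needed` consecutive polls, or gives up after
// maxPolls attempts. Any error or other role resets the streak.
func waitForStableRole(fetch func() (string, error), want string, needed, maxPolls int, interval time.Duration) error {
	streak := 0
	for i := 0; i < maxPolls; i++ {
		if i > 0 {
			time.Sleep(interval)
		}
		role := ""
		if info, err := fetch(); err == nil {
			role, _ = parseRole(info)
		}
		if role == want {
			streak++
		} else {
			streak = 0
		}
		if streak >= needed {
			return nil
		}
	}
	return fmt.Errorf("role %q not stable after %d polls", want, maxPolls)
}

func main() {
	// Made-up responses simulating a node that starts as a master
	// and then settles into the replica role.
	responses := []string{
		"# Replication\r\nrole:master\r\n",
		"# Replication\r\nrole:slave\r\n",
	}
	idx := 0
	fetch := func() (string, error) {
		r := responses[idx]
		if idx < len(responses)-1 {
			idx++
		}
		return r, nil
	}

	err := waitForStableRole(fetch, "slave", 3, 10, 10*time.Millisecond)
	fmt.Println("replica role stable:", err == nil)
}
```

This only shrinks the window; a role change after the check can still break the test, so it is not a fix for the underlying non-determinism.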

Background

This may also be related to past issues with kafka and flink, and to some flakiness in testcontainers-go itself. Regardless, it would be nice to investigate further. This is mostly a suspicion from my experience with the implementation, but since they've removed network etc., it's less likely, imo, to be the root cause.

Regardless, we have a few receivers that query clusters and whose tests were flaky and have now been removed. We should come up with a way to test such situations more consistently, maybe even adding a new build tag for cluster-specific tests so that we can easily keep them from blocking builds until/unless we figure out a more stable methodology.
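As a rough sketch of the build-tag idea, a cluster-specific test file could carry a Go build constraint so that it is compiled only when the tag is passed explicitly. The tag name `cluster_integration`, the file name, and the test name below are all hypothetical; nothing in the repository is confirmed to use them.

```go
//go:build cluster_integration

// File: cluster_integration_test.go (hypothetical name).
// Compiled only when the cluster_integration tag is set, so a plain
// `go test ./...` skips it and it cannot block regular builds.
package redisreceiver

import "testing"

// TestClusterIntegration is a placeholder for the flaky cluster test.
func TestClusterIntegration(t *testing.T) {
	t.Log("cluster-specific assertions would go here")
}
```

A separate, non-blocking CI job could then opt in with `go test -tags cluster_integration ./...`.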

Alternatively, we could "clusterize" this receiver more, as a given receiver instance mostly goes off a single node's reporting. As-is, a customer would have to add a receiver for every node in a cluster to get "full" visibility, which is a lot of configuration and data. Doing this may obviate the need for any given

Collector version

[main, not yet released post v0.92]

Environment information

Environment

github actions

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response

@hughesjj hughesjj added bug Something isn't working needs triage New item requiring triage labels Jan 11, 2024
@crobert-1 crobert-1 added receiver/redis Redis related issues flaky test a test is flaky labels Jan 11, 2024
Contributor

Pinging code owners for receiver/redis: @dmitryax @hughesjj. See Adding Labels via Comments if you do not have permissions to add labels yourself.

Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@hughesjj
Contributor Author

Following up with testcontainers-go

Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label May 20, 2024
3 participants