
connectivity tests: wait functions improvements and refactoring #1758

Merged (3 commits) on Jun 27, 2023

Conversation

giorio94 (Member):

Please refer to the individual commit messages for additional details.

@giorio94 requested a review from a team as a code owner June 23, 2023 16:57
@giorio94 requested a review from derailed June 23, 2023 16:57
@giorio94 temporarily deployed to ci June 23, 2023 16:57 (GitHub Actions)
giorio94 (Member, Author):

Converting back to draft to check failures.

@giorio94 marked this pull request as draft June 23, 2023 17:10
@giorio94 temporarily deployed to ci June 26, 2023 06:42 (GitHub Actions)
giorio94 (Member, Author):

All checks are green (it appears that the previous failures were unrelated to these changes).
Marking as ready for review again.

@giorio94 marked this pull request as ready for review June 26, 2023 07:49
derailed (Contributor) left a comment:


@giorio94 Nice work Marco! A few picks and perhaps another potential refactor.

connectivity/check/deployment.go (resolved review thread, outdated)

// WaitForDeployment waits until the specified deployment becomes ready.
func WaitForDeployment(ctx context.Context, log Logger, client *k8s.Client, namespace string, name string) error {
log.Logf("⌛ [%s] Waiting for deployment %s/%s to become ready...", client.ClusterName(), namespace, name)
derailed (Contributor):


This is a common pattern across all the subsequent waitFor calls. It seems like a prime candidate for abstracting out these calls.
Also, do these need to be exported?

giorio94 (Member, Author):


> This is a common pattern across all the subsequent waitFor calls. It seems like a prime candidate for abstracting out these calls.

I agree that the WaitFor functions look pretty similar to each other, but I'm not seeing many possible improvements. Extracting the common parts of the logic (I was thinking mainly about the for loop) would mean either dropping the customized error messages or just passing them as parameters, which wouldn't be much better. WDYT?

> Also, do these need to be exported?

Yes, to enable reusing them when the cilium-cli packages are used externally.

derailed (Contributor):


Thanks Marco, I see. I figured this is common logic and best to abstract with a closure.
Personally, I am not keen on making logging decisions deep in the stack, as it should be the responsibility of the call site to determine whether to handle the error, log it, etc. If multiple errors can potentially be surfaced, then we should return them accordingly. If this is outside the scope of this review, I am good with that, but I figured I'd flag it.
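To make the closure idea concrete, here is a rough sketch (not the code from this PR) of how the shared polling loop could be factored out, with the caller supplying the condition and a description used in the messages; the helper name, signature and one-second retry interval are assumptions:

```go
package check

import (
	"context"
	"fmt"
	"time"
)

// waitFor is a hypothetical helper illustrating the abstraction discussed
// above: it repeatedly invokes the supplied condition until it succeeds or
// the context expires. The name, signature and retry interval are
// assumptions, not the actual implementation in this PR.
func waitFor(ctx context.Context, log Logger, what string, condition func(context.Context) error) error {
	log.Logf("⌛ Waiting for %s...", what)
	for {
		err := condition(ctx)
		if err == nil {
			return nil
		}

		// The per-retry debug log mentioned later in this thread would go here.
		select {
		case <-time.After(time.Second):
		case <-ctx.Done():
			return fmt.Errorf("timeout reached waiting for %s (last error: %w)", what, err)
		}
	}
}
```

With such a helper, WaitForDeployment would reduce to passing a closure that performs the actual readiness lookup, keeping the customized message as the what parameter.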

giorio94 (Member, Author):


I appreciate the clarification and I see your point. I'd personally prefer not to change this further in this PR, to avoid modifying the pre-existing behavior, especially when it comes to the logs, given that the CLI used to output a debug log for every retry in case of errors.

connectivity/check/wait.go (resolved review thread)
defer cancel()

for {
err := validateIPCache(ctx, agent, pods)
derailed (Contributor):


nit: this could be a one-liner check.
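For reference, the one-liner form the nit refers to is the usual Go idiom of combining the call and the check in the if statement; whether the PR adopted it is not shown here:

```go
// One-liner form: the error is scoped to the if statement.
if err := validateIPCache(ctx, agent, pods); err == nil {
	return nil
}
```

Note that this scopes err to the if statement, so the longer form remains preferable if the error still needs to be logged before the next retry.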

connectivity/check/wait.go (resolved review thread)
asauber (Member) commented Jun 26, 2023:

Is part of this PR an attempt to quiet down this error message that we have been seeing in the multicluster workflow?

2023-06-26T06:50:33.155740215Z E0626 06:50:33.155593      26 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request

Fine if not, we can address that elsewhere.

giorio94 (Member, Author):

> Is part of this PR an attempt to quiet down this error message that we have been seeing in the multicluster workflow?
>
> 2023-06-26T06:50:33.155740215Z E0626 06:50:33.155593      26 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
>
> Fine if not, we can address that elsewhere.

Nope, we'll need another PR for that.

This commit introduces a logger interface which abstracts the logging
functionality implemented by the test suite, individual tests and
actions.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
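As a rough illustration of the shape such an interface might take (only Logf appears in the WaitForDeployment snippet quoted above; the Debugf method and its name are assumptions based on the debug-log discussion in the review):

```go
// Logger abstracts the logging functionality shared by the test suite,
// individual tests and actions. Logf matches the usage in the quoted
// WaitForDeployment snippet; Debugf is an illustrative assumption.
type Logger interface {
	// Logf logs a formatted progress message.
	Logf(format string, a ...interface{})
	// Debugf logs a formatted message only when debug output is enabled.
	Debugf(format string, a ...interface{})
}
```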
This commit extracts, unifies and generalizes the waitFor* functions
used by the connectivity checks, for better separation and to allow them
to be reused outside of this package.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
This commit introduces an additional check that waits until all test
services have been completely synchronized by the Cilium agents on the
nodes hosting the clients before running the tests, to prevent spurious
failures. This applies in particular to the multi-cluster tests, which
are affected by additional propagation latency.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
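A hypothetical sketch of what such a pre-test check could look like, composed from the waitFor helper sketched earlier in this thread; the Pod and Service types and the serviceVisibleToAgent lookup are placeholders, not the identifiers used by the PR (imports omitted):

```go
// waitForServiceSync blocks until every test service is visible to the
// Cilium agent co-located with each client pod. All identifiers below
// (Pod, Service, serviceVisibleToAgent) are hypothetical placeholders.
func waitForServiceSync(ctx context.Context, log Logger, clients []Pod, services []Service) error {
	for _, client := range clients {
		for _, svc := range services {
			what := fmt.Sprintf("service %s to be synchronized by the agent on node %s", svc.Name, client.NodeName)
			condition := func(ctx context.Context) error {
				return serviceVisibleToAgent(ctx, client, svc)
			}
			if err := waitFor(ctx, log, what, condition); err != nil {
				return err
			}
		}
	}
	return nil
}
```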
giorio94 (Member, Author):

Rebased onto main to make CI happy.

michi-covalent (Contributor):

All the feedback comments have been addressed, either in this pull request or via offline discussions. Time to 🚢
