connectivity tests: wait functions improvements and refactoring #1758
Conversation
Converting back to draft to check failures.

All checks are green (it appears that the previous failures were unrelated to these changes).
@giorio94 Nice work Marco! A few nitpicks and perhaps another potential refactor.
```go
// WaitForDeployment waits until the specified deployment becomes ready.
func WaitForDeployment(ctx context.Context, log Logger, client *k8s.Client, namespace string, name string) error {
	log.Logf("⌛ [%s] Waiting for deployment %s/%s to become ready...", client.ClusterName(), namespace, name)
```
This is a common pattern across all subsequent waitFor calls. It seems like a prime candidate for abstracting out these calls.
Also, do these need to be exported?
> This is a common pattern across all subsequent waitFor calls. It seems like a prime candidate for abstracting out these calls.

I agree that the `WaitFor*` functions look pretty similar to each other, but I'm not seeing many possible improvements. Extracting the common parts of the logic (I was thinking mainly about the `for` loop) would mean either dropping the customized error messages or just passing them as parameters, which wouldn't be much better. WDYT?
> Also, do these need to be exported?

Yes, to enable them to be reused when the cilium-cli packages are used externally.
Thanks Marco, I see. I figured this is common logic and best to abstract with a closure.
Personally, I am not keen on making logging decisions deep in the stack, as it should be the responsibility of the call site to determine whether to handle the error, log it, etc. If multiple errors can potentially be surfaced, then we should return them accordingly. If this is outside the scope of this review, I am good with that, but figured I'd flag it.
I appreciate the clarification, and I see your point. I'd personally not change it further in this PR, to avoid modifying the pre-existing behavior, especially considering the logs, given that the CLI used to output a debug log for every retry in case of errors.
```go
	defer cancel()

	for {
		err := validateIPCache(ctx, agent, pods)
```
nit: this could be a one-liner check.
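For readers unfamiliar with the idiom the nit refers to: Go lets you scope the error to the `if` statement itself. A minimal sketch, where `validate` is a hypothetical stand-in for the `validateIPCache` call quoted above:

```go
package main

import (
	"errors"
	"fmt"
)

// validate is a hypothetical stand-in for the validateIPCache call above.
func validate() error { return errors.New("ipcache not yet synchronized") }

func main() {
	// One-liner check: err is scoped to the if statement.
	if err := validate(); err != nil {
		fmt.Println("retrying:", err) // Prints: retrying: ipcache not yet synchronized
	}
}
```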
Is part of this PR an attempt to quiet down the error message that we have been seeing in the multicluster workflow? Fine if not; we can address that elsewhere.
Nope, we'll need another PR for that.
This commit introduces a logger interface which abstracts the logging functionalities implemented by the test suite, individual tests and actions. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
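A minimal sketch of what such a logger interface might look like, inferred from the `log.Logf` call in the snippet quoted earlier; the concrete interface in the commit may carry more methods, and `stdoutLogger` is purely illustrative:

```go
package main

import "fmt"

// Logger abstracts the logging functionality shared by the test suite,
// individual tests, and actions. The single Logf method is inferred from
// the log.Logf call quoted earlier; the real interface may differ.
type Logger interface {
	Logf(format string, a ...interface{})
}

// stdoutLogger is a hypothetical implementation that writes to stdout.
type stdoutLogger struct{}

func (stdoutLogger) Logf(format string, a ...interface{}) {
	fmt.Printf(format+"\n", a...)
}

func main() {
	var log Logger = stdoutLogger{}
	log.Logf("⌛ [%s] Waiting for deployment %s/%s to become ready...", "cluster", "default", "echo")
}
```

With an interface like this, helpers such as `WaitForDeployment` can accept any logger rather than depending on a concrete test-suite type.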
This commit extracts, unifies, and generalizes the waitFor* functions used by the connectivity checks, for better separation and to allow them to be reused outside of this package. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
This commit introduces an additional check to wait until all test services have been completely synchronized by the Cilium agents on the nodes hosting the clients before running the tests, to prevent spurious failures. This applies in particular to the multicluster tests, as they are affected by additional propagation latency. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Rebased onto main to make CI happy.
All the feedback comments have been addressed, either in this pull request or via offline discussions. Time to 🚢
Please refer to the individual commit messages for additional details.