New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🐛 Fix cache sync timeout functionality #1428
🐛 Fix cache sync timeout functionality #1428
Conversation
@@ -104,40 +106,21 @@ var _ = Describe("controller", func() { | |||
close(done) | |||
}) | |||
|
|||
It("should wait for each informer to sync", func(done Done) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only thing this test ever tested was that controller.Start
with a cancelled context doesn't error. With the changes in this PR however it will always error because the context passed from the Controller to WaitForSync is cancelled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case do you think it's worth having a test case that checks that an error is always returned if the context is cancelled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO the only thing someone who starts a Controller with a cancelled context can realistically expect is that it ends quickly, if it errors or not along the way ends up being an implementation detail. It is not going to result in anything useful.
@@ -90,6 +90,7 @@ var _ = Describe("Source", func() { | |||
}, | |||
}, q) | |||
Expect(err).NotTo(HaveOccurred()) | |||
Expect(instance.WaitForSync(context.Background())).NotTo(HaveOccurred()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to call WaitForSync here because before that finished, the Informer won't have the EventHandler attached. Not waiting makes the test racy.
apidiff fails because:
I don't think Kind being comparable is something we consider part of the API, WDYT? |
3489854
to
6407322
Compare
So far, timing out the cache sync in most realistic scenarios didn't work, bceause source.Kind gets an already started cache from the Manager. A cache that is already started will block forever on GetInformer which we called in source.Kinds start and not its WaitForSync. The context passed to Start however defines the entire lifecycle of the Source, not the Start timeout. This change makes source.Kind call GetInformer in Start but in a new go routine and WaitForSync just wait for that to finish or for its context to be cancelled. This preserves the existing semantic of starting in Start but waint for ready in WaitForSync.
6407322
to
4d059e8
Compare
@estroz addressed your comment, ptal. I've also renamed the pull as the issue wasn't with the cache sync per se but with the cache sync timeout. Just to mention that, another possible approach to this would have been to make the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alvaroaleman, vincepri The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold for @estroz |
@alvaroaleman that depends, can you think of any reason The current approach is fine with me. /lgtm |
/hold cancel |
/retest |
1 similar comment
/retest |
@alvaroaleman: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
So far, timing out the cache sync in most realistic scenarios didn't work,
bceause source.Kind gets an already started cache from the Manager. A
cache that is already started will block forever on GetInformer which we
called in source.Kinds start and not its WaitForSync. The context passed
to Start however defines the entire lifecycle of the Source, not the
Start timeout.
This change makes source.Kind call GetInformer in Start but in a new
go routine and WaitForSync just wait for that to finish or for its
context to be cancelled. This preserves the existing semantic of
starting in Start but waint for ready in WaitForSync.
/cc @varshaprasad96 @estroz @vincepri
Fixes #1346