-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
service-cache: fix memory leak and missed updates on scale to zero events in rare circumstances #24619
Merged
julianwiedmann
merged 2 commits into
cilium:master
from
giorio94:mio/service-cache-endpoints
Mar 30, 2023
Merged
service-cache: fix memory leak and missed updates on scale to zero events in rare circumstances #24619
julianwiedmann
merged 2 commits into
cilium:master
from
giorio94:mio/service-cache-endpoints
Mar 30, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Currently, a service is reported as ready if the local Endpoint resource has been found, or it has at least one endpoint in remote clusters. This commit changes slightly the logic, reporting the service ready also if any remote service has been found (even though with 0 endpoints), to prevent that an update is possibly missed on scale to zero events. In particular, the issue can be triggered in case the local service has no selector (hence k8s creates neither an Endpoint nor an EndpointSlice object), while the remote one is standard. When the deployment targeted by the remote cluster is scaled to 0, the service entry in the local cluster is not correctly cleared. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Previously, upon detecting the deletion of a global service in a remote cluster, we removed the corresponding external endpoints. Still, we did not delete the map associated with that service when no remote endpoints were left. This commit fixes this, and also ensures that the service entry is deleted if no longer ready. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
giorio94
added
kind/bug
This is a bug in the Cilium logic.
release-note/bug
This PR fixes an issue in a previous release of Cilium.
area/clustermesh
Relates to multi-cluster routing functionality in Cilium.
affects/v1.11
This issue affects v1.11 branch
affects/v1.12
This issue affects v1.12 branch
needs-backport/1.13
This PR / issue needs backporting to the v1.13 branch
affects/v1.13
This issue affects v1.13 branch
labels
Mar 29, 2023
giorio94
changed the title
service-cache: fix memory leak and missed updates on scale to zero events
service-cache: fix memory leak and missed updates on scale to zero events in rare circumstances
Mar 29, 2023
/test |
ConformanceK8sKind hit new flake #24622 |
/test-1.24-5.4 Hit #15455 |
aanm
approved these changes
Mar 29, 2023
/test-1.26-net-next Hit #24573 |
borkmann
approved these changes
Mar 30, 2023
youngnick
approved these changes
Mar 30, 2023
maintainer-s-little-helper
bot
added
the
ready-to-merge
This PR has passed all tests and received consensus from code owners to merge.
label
Mar 30, 2023
Note that if this was meant to close a related issue, you will need to close it manually :) |
Nope, I just forgot it there from the template. |
jibi
added
backport-pending/1.13
The backport for Cilium 1.13.x for this PR is in progress.
and removed
needs-backport/1.13
This PR / issue needs backporting to the v1.13 branch
labels
Apr 3, 2023
maintainer-s-little-helper
bot
moved this from Needs backport from master
to Backport pending to v1.13
in 1.13.2
Apr 3, 2023
jibi
added
the
backport-done/1.13
The backport for Cilium 1.13.x for this PR is done.
label
Apr 7, 2023
jibi
removed
the
backport-pending/1.13
The backport for Cilium 1.13.x for this PR is in progress.
label
Apr 7, 2023
maintainer-s-little-helper
bot
moved this from Backport pending to v1.13
to Backport done to v1.13
in 1.13.2
Apr 7, 2023
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
affects/v1.11
This issue affects v1.11 branch
affects/v1.12
This issue affects v1.12 branch
affects/v1.13
This issue affects v1.13 branch
area/clustermesh
Relates to multi-cluster routing functionality in Cilium.
backport-done/1.13
The backport for Cilium 1.13.x for this PR is done.
kind/bug
This is a bug in the Cilium logic.
ready-to-merge
This PR has passed all tests and received consensus from code owners to merge.
release-note/bug
This PR fixes an issue in a previous release of Cilium.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR fixes two issues which could affect the endpoints handling in the service cache, concerning remote services:
Please, refer to the respective commit messages for more details.