endpoint: create endpoint without labels #30170

Merged
merged 1 commit into from Feb 21, 2024

Contributor

@oblazek oblazek commented Jan 10, 2024

When the endpoint API is used to create an endpoint and the EndpointChangeRequest contains labels, no identity is allocated to the endpoint.

During NewEndpointFromChangeModel() the labels are stored in the endpoint model, so the follow-up call to ep.UpdateLabels() does not bump the revision within this single createEndpoint() call. As a result, the follow-up call to e.runIdentityResolver() never happens, and the endpoint ends up without an identity, in state waiting-for-identity.

ENDPOINT   IDENTITY        LABELS (source:key[=value])               IPv4         STATUS

3236       <no label id>   k8s:app=incubator-mynetns3                10.247.1.1   waiting-for-identity
                           k8s:io.cilium.k8s.policy.cluster=default
                           k8s:io.kubernetes.pod.namespace=default

This should not happen. Otherwise the only workaround is to omit the labels from the createEndpoint() call and set them in a follow-up patchEndpoint() call, which then triggers regeneration and identity allocation.

Instead, epTemplate.Labels does not need to be set when the endpoint is constructed. The issue above then never occurs, because ep.UpdateLabels() bumps the revision and regeneration is triggered.

With these changes one can createEndpoint() with labels in a single call:

ENDPOINT   IDENTITY   LABELS (source:key[=value])                IPv4         STATUS

331        21864      k8s:app=incubator-mynetns3                 10.247.1.1   ready
                      k8s:io.cilium.k8s.policy.cluster=default
                      k8s:io.kubernetes.pod.namespace=default
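
The mechanism described above can be reproduced with a small toy model. This is only a sketch, not Cilium's code: the `endpoint` struct, `canonical`, `create`, and the placeholder identity numbers are all hypothetical. The one rule taken from the description is that the identity resolver runs only when replacing the labels bumps the revision.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// endpoint is a toy stand-in for Cilium's Endpoint. All names here are
// hypothetical and only mirror the behaviour described in this PR.
type endpoint struct {
	labels   string // canonical form of the identity labels
	revision int    // bumped whenever the identity labels change
	identity int    // 0 means "no identity allocated yet"
}

// canonical returns an order-independent string form of a label set.
func canonical(lbls []string) string {
	sorted := append([]string(nil), lbls...)
	sort.Strings(sorted)
	return strings.Join(sorted, ",")
}

// newEndpointFromModel mimics storing the request labels at construction time.
func newEndpointFromModel(lbls []string) *endpoint {
	return &endpoint{labels: canonical(lbls)}
}

// replaceIdentityLabels returns the new revision, or 0 if nothing changed.
func (e *endpoint) replaceIdentityLabels(lbls []string) int {
	c := canonical(lbls)
	if c == e.labels {
		return 0
	}
	e.labels = c
	e.revision++
	return e.revision
}

// updateLabels runs the (toy) identity resolver only when the revision was bumped.
func (e *endpoint) updateLabels(lbls []string) {
	if rev := e.replaceIdentityLabels(lbls); rev != 0 {
		e.identity = 1000 + rev // placeholder for a real allocation
	}
}

// state reports the endpoint state as it would appear in the listings above.
func (e *endpoint) state() string {
	if e.identity == 0 {
		return "waiting-for-identity"
	}
	return "ready"
}

// create models createEndpoint(). keepLabelsInTemplate selects the old
// behaviour (true) or the behaviour proposed in this PR (false).
func create(lbls []string, keepLabelsInTemplate bool) *endpoint {
	templateLabels := lbls
	if !keepLabelsInTemplate {
		templateLabels = nil
	}
	ep := newEndpointFromModel(templateLabels)
	if len(lbls) > 0 {
		ep.updateLabels(lbls)
	}
	return ep
}

func main() {
	lbls := []string{"k8s:app=incubator-mynetns3"}
	fmt.Println(create(lbls, true).state())  // prints "waiting-for-identity"
	fmt.Println(create(lbls, false).state()) // prints "ready"
}
```

In this model the only difference between the two runs is whether the constructor already holds the labels, which matches the before and after listings shown in the description.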

Fixes: #29776

endpoint: fix inability to create endpoint with labels in a single API call

Signed-off-by: Ondrej Blazek <ondrej.blazek@firma.seznam.cz>

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 10, 2024
@oblazek
Contributor Author

oblazek commented Jan 10, 2024

/test

@oblazek
Contributor Author

oblazek commented Jan 10, 2024

/test

@oblazek oblazek marked this pull request as ready for review January 10, 2024 15:25
@oblazek oblazek requested a review from a team as a code owner January 10, 2024 15:25
@aditighag
Member

I'm not familiar with this API as I was only aware of the custom health endpoint being created within the agent, so I'm removing my review assignment. /cc @cilium/sig-agent and @christarazi from the linked issue for review.

@aditighag aditighag requested review from christarazi and removed request for aditighag January 11, 2024 00:11
@squeed
Contributor

squeed commented Jan 11, 2024

I'm not sure this is the right solution. Wouldn't it just make more sense to unconditionally start the identity resolver? I don't know why we skip it when rev = 0.

@oblazek
Contributor Author

oblazek commented Jan 11, 2024

> I'm not sure this is the right solution. Wouldn't it just make more sense to unconditionally start the identity resolver? I don't know why we skip it when rev = 0.

tbh I have no idea, very good point

@aanm aanm self-requested a review January 11, 2024 16:01
@oblazek
Contributor Author

oblazek commented Jan 12, 2024

Looking closely at runIdentityResolver(), it seems to me it can safely be run unconditionally, as https://github.com/cilium/cilium/blob/main/pkg/endpoint/endpoint.go#L2067 will make sure the ID allocation only runs when really needed.

@oblazek
Contributor Author

oblazek commented Jan 12, 2024

diff --git a/pkg/endpoint/endpoint.go b/pkg/endpoint/endpoint.go
index 616c2020e6..a6e2764abe 100644
--- a/pkg/endpoint/endpoint.go
+++ b/pkg/endpoint/endpoint.go
@@ -1743,11 +1743,7 @@ func (e *Endpoint) UpdateLabels(ctx context.Context, identityLabels, infoLabels
        // replace identity labels and update the identity if labels have changed
        rev := e.replaceIdentityLabels(identityLabels)
        e.unlock()
-       if rev != 0 {
-               return e.runIdentityResolver(ctx, rev, blocking)
-       }
-
-       return false
+       return e.runIdentityResolver(ctx, rev, blocking)
 }
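
To illustrate the idea behind this diff, here is a toy model of an update path that always calls the resolver and relies on a guard inside it. It is a sketch under assumptions: the guard is taken to be "skip when the endpoint already holds an identity for its current labels", which is a simplification for illustration and not a statement of what the linked line does. All type, field, and method names are hypothetical, and the `rev` and `blocking` parameters from the diff are dropped for brevity.

```go
package main

import "fmt"

// endpoint is a toy model; field and method names are hypothetical.
type endpoint struct {
	labels      string // current identity labels
	revision    int    // bumped when labels change
	identity    int    // 0 means no identity yet
	idLabels    string // labels the current identity was allocated for
	allocations int    // how many times an allocation actually ran
}

// replaceIdentityLabels returns the new revision, or 0 if nothing changed.
func (e *endpoint) replaceIdentityLabels(lbls string) int {
	if lbls == e.labels {
		return 0
	}
	e.labels = lbls
	e.revision++
	return e.revision
}

// runIdentityResolver can be called on every update in this model: it
// returns early when the endpoint already holds an identity for its labels.
func (e *endpoint) runIdentityResolver() bool {
	if e.identity != 0 && e.idLabels == e.labels {
		return false // nothing to do
	}
	e.allocations++
	e.identity = 1000 + e.allocations // placeholder for a real allocation
	e.idLabels = e.labels
	return true
}

// updateLabels calls the resolver unconditionally, ignoring the revision.
func (e *endpoint) updateLabels(lbls string) bool {
	e.replaceIdentityLabels(lbls)
	return e.runIdentityResolver()
}

func main() {
	// Labels are already stored on the endpoint, as after construction from a template.
	ep := &endpoint{labels: "k8s:app=demo"}

	first := ep.updateLabels("k8s:app=demo")
	fmt.Println(first, ep.allocations) // prints "true 1"

	second := ep.updateLabels("k8s:app=demo")
	fmt.Println(second, ep.allocations) // prints "false 1"
}
```

In this model the first call allocates an identity even though the labels did not change, and the repeated call is a no-op. The resolver is still entered on every update, though, so the guard only saves the allocation work, not the call itself.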

@oblazek oblazek changed the title from "endpoint: set labels outside NewEndpointFromChange" to "endpoint: runIdentityResolver unconditionally" Jan 16, 2024
@oblazek
Contributor Author

oblazek commented Jan 16, 2024

/test


This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Feb 16, 2024
@oblazek
Contributor Author

oblazek commented Feb 16, 2024

/test

@github-actions github-actions bot removed the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Feb 17, 2024
Member

@aanm aanm left a comment

Thanks for the PR. Sorry for taking so long to review this, but now I understand the full picture. I think this patch would be more appropriate. We should avoid executing runIdentityResolver unnecessarily.

diff --git a/daemon/cmd/endpoint.go b/daemon/cmd/endpoint.go
index d995eda77f..ed064bec20 100644
--- a/daemon/cmd/endpoint.go
+++ b/daemon/cmd/endpoint.go
@@ -376,6 +376,16 @@ func (d *Daemon) createEndpoint(ctx context.Context, owner regeneration.Owner, e
                "sync-build":                 epTemplate.SyncBuildEndpoint,
        }).Info("Create endpoint request")
 
+       // We don't need to create the endpoint with the labels. This might cause
+       // the endpoint regeneration to not be triggered further down, with the
+       // ep.UpdateLabels or the ep.RunMetadataResolver, because the regeneration
+       // is only triggered in case the labels are changed, which they might not
+       // change because NewEndpointFromChangeModel would contain the
+       // epTemplate.Labels, the same labels we would be calling ep.UpdateLabels or
+       // the ep.RunMetadataResolver.
+       apiLabels := labels.NewLabelsFromModel(epTemplate.Labels)
+       epTemplate.Labels = nil
+
        ep, err := endpoint.NewEndpointFromChangeModel(d.ctx, owner, d, d.ipcache, d.l7Proxy, d.identityAllocator, epTemplate)
        if err != nil {
                return invalidDataError(ep, fmt.Errorf("unable to parse endpoint parameters: %s", err))
@@ -414,7 +424,6 @@ func (d *Daemon) createEndpoint(ctx context.Context, owner regeneration.Owner, e
                return invalidDataError(ep, err)
        }
 
-       apiLabels := labels.NewLabelsFromModel(epTemplate.Labels)
        infoLabels := labels.NewLabelsFromModel([]string{})
 
        if len(apiLabels) > 0 {
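
The ordering in the patch above matters: the labels have to be read from the template before they are cleared, otherwise `apiLabels` would be empty and the `len(apiLabels) > 0` branch would be skipped. A minimal sketch of that "capture, then clear" step follows. The `changeRequest` type and the `detachLabels` helper are hypothetical stand-ins, not part of Cilium's API.

```go
package main

import "fmt"

// changeRequest is a hypothetical stand-in for the API request model.
type changeRequest struct {
	Labels []string
}

// detachLabels returns a copy of the request labels and clears them on the
// request, so that a constructor reading req.Labels afterwards sees none.
func detachLabels(req *changeRequest) []string {
	detached := append([]string(nil), req.Labels...)
	req.Labels = nil
	return detached
}

func main() {
	req := &changeRequest{Labels: []string{"k8s:app=demo"}}

	// Capture first, then clear. Reversing the order would lose the labels.
	apiLabels := detachLabels(req)

	fmt.Println(len(apiLabels), len(req.Labels)) // prints "1 0"
}
```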

When an endpoint is created and the `EndpointChangeRequest`
contains labels, endpoint regeneration might not be triggered,
because it is only triggered when the labels change.
Unfortunately, the labels do not change when the endpoint is created
from epTemplate.Labels, since those are the same labels as in the
`EndpointChangeRequest`.

This commit fixes the above issue by not setting epTemplate.Labels.

Fixes: cilium#29776

Signed-off-by: Ondrej Blazek <ondrej.blazek@firma.seznam.cz>
@oblazek
Contributor Author

oblazek commented Feb 21, 2024

yeah that totally works!!

@aanm aanm added affects/v1.13 This issue affects v1.13 branch affects/v1.14 This issue affects v1.14 branch affects/v1.15 This issue affects v1.15 branch release-note/bug This PR fixes an issue in a previous release of Cilium. and removed release-note/misc This PR makes changes that have no direct user impact. labels Feb 21, 2024
@aanm aanm enabled auto-merge February 21, 2024 15:36
@aanm aanm added the kind/regression This functionality worked fine before, but was broken in a newer release of Cilium. label Feb 21, 2024
@aanm aanm added this pull request to the merge queue Feb 21, 2024
Merged via the queue into cilium:main with commit cb15333 Feb 21, 2024
62 checks passed
@aanm aanm added the needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch label Feb 21, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.14.8 Feb 21, 2024
@YutaroHayakawa YutaroHayakawa mentioned this pull request Feb 27, 2024
9 tasks
@YutaroHayakawa YutaroHayakawa added backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. and removed needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch labels Feb 27, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.15 in 1.15.2 Feb 27, 2024
@YutaroHayakawa YutaroHayakawa mentioned this pull request Feb 27, 2024
5 tasks
@YutaroHayakawa YutaroHayakawa added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Feb 27, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.14 in 1.14.8 Feb 27, 2024
@github-actions github-actions bot added backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. and removed backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. labels Feb 28, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport pending to v1.15 in 1.15.2 Feb 28, 2024
@github-actions github-actions bot added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Mar 1, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Backport done to v1.15 in 1.15.2 Mar 1, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport pending to v1.14 in 1.14.8 Mar 1, 2024
Labels
affects/v1.13 This issue affects v1.13 branch affects/v1.14 This issue affects v1.14 branch affects/v1.15 This issue affects v1.15 branch backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. kind/regression This functionality worked fine before, but was broken in a newer release of Cilium. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
1.15.2
Backport done to v1.15
Development

Successfully merging this pull request may close these issues.

Endpoint transition from waiting-to-identity state to waiting-to-regenerate not supported
5 participants