[bug]: Webhook installation is not idempotent #525
Labels
bug
Something isn't working
Comments
I think I'm hitting this as well...
$ kubectl logs my-operator-8657dd45c6-f5xsp
Defaulted container "operator" out of: operator, webhook-installer (init)
Error from server (BadRequest): container "operator" in pod "my-operator-8657dd45c6-f5xsp" is waiting to start: PodInitializing
$ kubectl logs my-operator-8657dd45c6-f5xsp -c webhook-installer
info: ApplicationStartup[0]
Registered validation webhook.
Download cfssl / cfssljson for linux.
Make unix binaries executable.
Generating server certificate.
2023/02/14 16:54:26 [INFO] generate received request
2023/02/14 16:54:26 [INFO] received CSR
2023/02/14 16:54:26 [INFO] generating key: ecdsa-256
2023/02/14 16:54:27 [INFO] encoded CSR
2023/02/14 16:54:27 [INFO] signed certificate
Files in /certs:
/certs/server-key.pem
/certs/ca.pem
/certs/server.csr
/certs/server.pem
Create service.
Create validator definition.
Unhandled exception. k8s.Autorest.HttpOperationException: Operation returned an invalid status code 'Conflict'
at k8s.Kubernetes.SendRequestRaw(String requestContent, HttpRequestMessage httpRequest, CancellationToken cancellationToken)
at k8s.AbstractKubernetes.k8s.ICustomObjectsOperations.CreateClusterCustomObjectWithHttpMessagesAsync(Object body, String group, String version, String plural, String dryRun, String fieldManager, Nullable`1 pretty, IReadOnlyDictionary`2 customHeaders, CancellationToken cancellationToken)
at k8s.GenericClient.CreateAsync[T](T obj, CancellationToken cancel)
at KubeOps.KubernetesClient.KubernetesClient.Create[TResource](TResource resource)
at KubeOps.Operator.Commands.Management.Webhooks.Install.OnExecuteAsync(CommandLineApplication app)
at McMaster.Extensions.CommandLineUtils.Conventions.ExecuteMethodConvention.InvokeAsync(MethodInfo method, Object instance, Object[] arguments)
at McMaster.Extensions.CommandLineUtils.Conventions.ExecuteMethodConvention.OnExecute(ConventionContext context, CancellationToken cancellationToken)
at McMaster.Extensions.CommandLineUtils.Conventions.ExecuteMethodConvention.<>c__DisplayClass0_0.<<Apply>b__0>d.MoveNext()
--- End of stack trace from previous location ---
at McMaster.Extensions.CommandLineUtils.CommandLineApplication.ExecuteAsync(String[] args, CancellationToken cancellationToken)
at Program.<Main>$(String[] args) in /operator/Program.cs:line 34
at Program.<Main>(String[] args)
Even after clearing the resources with the command below, I'm unable to reinstall the operator.
$ kubectl delete -k src/config/install
The package versions I'm using:
<PackageReference Include="KubeOps" Version="7.0.7" />
<PackageReference Include="KubeOps.KubernetesClient" Version="7.0.7" />
<PackageReference Include="KubernetesClient" Version="10.0.31" />
Running on minikube:
$ kubectl version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.26.1
Kustomize Version: v4.5.7
Server Version: v1.26.1
Describe the bug
Webhooks are typically installed in the init container of the operator pod. The webhook installation can fail, e.g., due to a lost connection to the Kubernetes API. In such cases, the init container is restarted until it succeeds.
However, if the ValidatingWebhookConfiguration or MutatingWebhookConfiguration has already been created, the init container throws an exception (see below) that points to a conflict with an existing resource in the cluster.
IMHO this is because the method KubernetesClient.Save (https://github.com/buehler/dotnet-operator-sdk/blob/master/src/KubeOps.KubernetesClient/KubernetesClient.cs#L116) is not implemented correctly: to decide whether to create or update a resource in the cluster, it checks whether the uid of the resource passed as the argument is null. In the webhook installation (and similar places in the framework), the resources passed to the Save method are always freshly created, so the uid is always null, regardless of whether the resource already exists in the cluster. A proper implementation of the Save method would instead check whether the resource exists in the cluster.
Another option would be to use the same pattern for the WebhookConfigurations as for the service, i.e., delete the existing resource in the cluster first. I am not sure whether there is a reason why Save was used instead.
To reproduce
MutatingWebhookConfiguration of the right name beforehand).
ValidatingWebhookConfiguration already got created)
Expected behavior
The webhook installation should succeed eventually.
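To make the two suggested fixes concrete, here is a minimal, language-neutral sketch in Python. It does not use the KubeOps or KubernetesClient APIs: FakeCluster, Resource, Conflict, the three helper functions, and the resource name "validators.my-operator" are all hypothetical stand-ins invented for illustration. It only models the decision logic described above: deciding by uid fails on a retry, while deciding by existence in the cluster (or deleting first) does not.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


class Conflict(Exception):
    """Stands in for the HTTP 409 'Conflict' answer to a duplicate create."""


@dataclass
class Resource:
    name: str
    spec: dict = field(default_factory=dict)
    uid: Optional[str] = None  # only set once the cluster has stored the object


class FakeCluster:
    """Hypothetical in-memory stand-in for the Kubernetes API, keyed by name."""

    def __init__(self) -> None:
        self._store: Dict[str, Resource] = {}

    def get(self, name: str) -> Optional[Resource]:
        return self._store.get(name)

    def create(self, res: Resource) -> Resource:
        if res.name in self._store:
            raise Conflict(res.name)
        stored = Resource(res.name, dict(res.spec), uid=str(uuid.uuid4()))
        self._store[res.name] = stored
        return stored

    def update(self, res: Resource) -> Resource:
        existing = self._store[res.name]  # KeyError if the object is missing
        stored = Resource(res.name, dict(res.spec), uid=existing.uid)
        self._store[res.name] = stored
        return stored

    def delete(self, name: str) -> None:
        self._store.pop(name, None)  # deleting a missing object is a no-op


def save_by_uid(cluster: FakeCluster, res: Resource) -> Resource:
    """Decision as described in the issue: a null uid always means 'create'."""
    return cluster.create(res) if res.uid is None else cluster.update(res)


def save_by_existence(cluster: FakeCluster, res: Resource) -> Resource:
    """First suggested fix: ask the cluster whether the name already exists."""
    if cluster.get(res.name) is None:
        return cluster.create(res)
    return cluster.update(res)


def delete_then_create(cluster: FakeCluster, res: Resource) -> Resource:
    """Second suggested fix: remove any existing object, then create."""
    cluster.delete(res.name)
    return cluster.create(res)


if __name__ == "__main__":
    cluster = FakeCluster()
    name = "validators.my-operator"  # hypothetical name
    save_by_uid(cluster, Resource(name))  # first init container run
    try:
        save_by_uid(cluster, Resource(name))  # restarted init container
    except Conflict:
        print("uid-based save: Conflict on retry")
    save_by_existence(cluster, Resource(name))
    delete_then_create(cluster, Resource(name))
    print("existence check and delete-then-create both survive a retry")
```

Note that the existence check is still a read followed by a write, so against a real API server two concurrent installers could race between the two calls; a real implementation would probably still want to handle a Conflict response.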
Screenshots
The exception thrown by the webhook installer:
Additional Context
Kubernetes: v1.23
KubeOps: 7.0.6