A user reported that they ran an update that deleted resources from Pulumi's state but left the actual resources running on the Kubernetes cluster. This was apparently due to a network partition that occurred between the preview and update steps, so no warning was shown beforehand.
Although the state was recoverable from the previous update, this was a bad user experience, and we should reconsider how state is handled for unreachable clusters.
Related: #491 #881
Updating (development):
Type Name Status Info
pulumi:pulumi:Stack azure-kubernetes-cluster-development
└─ osimis:AzureKubernetesCluster osimis-lify-k8s
- ├─ kubernetes:helm.sh:Chart ingress-gloo deleted
- │ ├─ kubernetes:core:ServiceAccount ingress-gloo-qq119b7f/discovery deleted 1 warning
- │ ├─ kubernetes:core:ServiceAccount ingress-gloo-qq119b7f/gloo deleted 1 warning
- │ ├─ kubernetes:core:ConfigMap ingress-gloo-qq119b7f/ingress-envoy-config deleted 1 warning
- │ ├─ kubernetes:core:ConfigMap ingress-gloo-qq119b7f/gloo-usage deleted 1 warning
- │ ├─ kubernetes:core:Service ingress-gloo-qq119b7f/ingress-proxy deleted 1 warning
- │ ├─ kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding gloo-role-binding-ingress-ingress-gloo-qq119b7f deleted 1 warning
- │ ├─ kubernetes:apps:Deployment ingress-gloo-qq119b7f/discovery deleted 1 warning
- │ ├─ kubernetes:rbac.authorization.k8s.io:ClusterRole gloo-role-ingress deleted 1 warning
- │ ├─ kubernetes:apps:Deployment ingress-gloo-qq119b7f/ingress deleted 1 warning
- │ ├─ kubernetes:core:Service ingress-gloo-qq119b7f/gloo deleted 1 warning
- │ ├─ kubernetes:apps:Deployment ingress-gloo-qq119b7f/ingress-proxy deleted 1 warning
- │ └─ kubernetes:apps:Deployment ingress-gloo-qq119b7f/gloo deleted 1 warning
- ├─ kubernetes:core:Namespace ingress-gloo deleted 1 warning
- └─ kubernetes:cert-manager.io:ClusterIssuer nginx-cluster-issuer deleted 1 warning
Diagnostics:
kubernetes:apps:Deployment (ingress-gloo-qq119b7f/discovery):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
kubernetes:core:ConfigMap (ingress-gloo-qq119b7f/gloo-usage):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
kubernetes:core:Namespace (ingress-gloo):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
kubernetes:core:ServiceAccount (ingress-gloo-qq119b7f/gloo):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
kubernetes:core:ConfigMap (ingress-gloo-qq119b7f/ingress-envoy-config):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
kubernetes:cert-manager.io:ClusterIssuer (nginx-cluster-issuer):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
kubernetes:core:ServiceAccount (ingress-gloo-qq119b7f/discovery):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding (gloo-role-binding-ingress-ingress-gloo-qq119b7f):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
kubernetes:rbac.authorization.k8s.io:ClusterRole (gloo-role-ingress):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
kubernetes:apps:Deployment (ingress-gloo-qq119b7f/gloo):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
kubernetes:apps:Deployment (ingress-gloo-qq119b7f/ingress):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
kubernetes:core:Service (ingress-gloo-qq119b7f/gloo):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
kubernetes:apps:Deployment (ingress-gloo-qq119b7f/ingress-proxy):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
kubernetes:core:Service (ingress-gloo-qq119b7f/ingress-proxy):
warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get https://<mycluster>.azmk8s.io:443/openapi/v2?timeout=32s: net/http: TLS handshake timeout
Outputs:
<outputs redacted>
Resources:
- 15 deleted
69 unchanged
Reproducing the issue
1. Create a stack with resources deployed to Kubernetes.
2. Remove a resource from the program and run pulumi up.
3. After the preview, but before confirming the update, make the cluster inaccessible (change the kubeconfig or similar, out of band; one way to do this is sketched below).
4. Apply the update.
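One way to carry out step 3, assuming the stack uses the ambient kubeconfig (the cluster name below is a placeholder, and the unroutable address merely simulates the partition):

# Start the update and stop at the confirmation prompt:
pulumi up

# In a second terminal, point the kubeconfig's cluster entry at an unroutable
# address so the API server can no longer be reached:
kubectl config set-cluster my-cluster --server=https://10.255.255.1:6443

# Confirm the update in the first terminal; the provider now reports the
# cluster as unreachable and deletes the affected resources from the state.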
Suggestions for a fix
The intent of the current behavior was to unblock users who had inadvertently deleted their Kubernetes cluster before cleaning up the resources deployed to it. If the cluster is unreachable prior to the preview, a descriptive warning is shown before the resources are deleted from the state.
Rather than defaulting to deleting resources from the state, it would be better to require an explicit force-delete option for users who need to fix invalid state; a hypothetical interaction is sketched below.
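As a purely hypothetical sketch of that behavior (no such flag or error message exists today; this only illustrates the suggestion):

$ pulumi up
error: configured Kubernetes cluster is unreachable: unable to load schema information from the API server
       refusing to delete ingress-gloo-qq119b7f/gloo from the state;
       re-run with an explicit force-delete option if the cluster has been deliberately deleted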
This is still happening. All my resources were deleted because Pulumi could not connect to the cluster.
I literally only ran a pulumi refresh, but since it spits out a lot of warnings about Kubernetes deprecations, I missed the cluster-unreachable warnings and applied the changes.
I reverted the stack state and did a refresh only on the KubernetesCluster resource, which fetched a new kubeconfig, but for some reason the provider doesn't use that kubeconfig. (Note that I'm doing the status.apply() call just like the published examples that are supposed to cover this scenario.)
After that, everything is broken, because we can't even run commands due to Pulumi's integrity checking.
Any updates on this one? Or at least a workaround?
I tried doing a refresh on the Provider; that didn't work either.
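For reference, a targeted refresh of a single resource is normally run like this (the URN is a placeholder; whether it actually helps in this situation is exactly what this issue is about):

# List the URNs of the resources in the stack:
pulumi stack --show-urns

# Refresh only the cluster resource:
pulumi refresh --target <urn-of-the-kubernetes-cluster>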
You should be able to revert to any previous checkpoint state with the following: pulumi stack export --version=<previous-version-number> > out
followed by pulumi stack import --file=out
That should get your stack back into a good state so you can resume updates as normal.
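Concretely, the recovery sequence looks like this (the version number is a placeholder; pulumi stack history lists the available versions):

# Export the last known-good checkpoint:
pulumi stack export --version=<previous-version-number> > out

# Import it as the current state:
pulumi stack import --file=out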