Confusion with do-loadbalancer-hostname #698

Open
charlesg99 opened this issue Mar 13, 2024 · 3 comments

Comments

@charlesg99

I'm not sure I've read enough on the subject, but I still don't get how "service.beta.kubernetes.io/do-loadbalancer-hostname" is used by the ingress controller. I guess having a domain instead of an IP forces a DNS request that exits the cluster? I just fixed a cert-manager problem with this annotation, and I also don't get why it happened: my other domains on the same load balancer/cluster didn't have this "pod-pod" network issue when creating SSL certificates.

My real issue is that I don't know whether setting this annotation will prevent my other domains from correctly renewing their SSL certificates. Can you clarify this and mention this use case in the documentation?

@timoreimann
Collaborator

Hey @charlesg99 👋

Technically speaking, the annotation really only serves a single need, which is to return a hostname from the LB status (the related code is fairly straightforward) that will later be injected into the LoadBalancer-typed Service object. This, in turn, causes Kubernetes to not do hairpinning and instead route via the external LB IP address.
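To make the mechanism above concrete, here is a minimal Python sketch of it. This is a simplified model, not the actual CCM or kube-proxy code: the function names `load_balancer_ingress` and `is_short_circuited` are hypothetical, and `203.0.113.10` is a documentation-range placeholder IP. It assumes that with the annotation set the Service status carries only a hostname, and that in-cluster short-circuiting only applies when the status carries an IP.

```python
HOSTNAME_ANNOTATION = "service.beta.kubernetes.io/do-loadbalancer-hostname"


def load_balancer_ingress(annotations, lb_ip):
    """Model the status.loadBalancer.ingress entries reported for a Service.

    Assumption: when the hostname annotation is set, only the hostname is
    reported; otherwise the external LB IP is reported.
    """
    hostname = annotations.get(HOSTNAME_ANNOTATION, "")
    if hostname:
        return [{"hostname": hostname}]
    return [{"ip": lb_ip}]


def is_short_circuited(ingress):
    """Model the in-cluster behavior for pod -> LB traffic.

    Assumption: traffic to an IP recorded in the Service status can be
    handled inside the cluster, while a hostname-only status leaves no IP
    to intercept, so requests go out via the external LB.
    """
    return any("ip" in entry for entry in ingress)


without_annotation = load_balancer_ingress({}, "203.0.113.10")
with_annotation = load_balancer_ingress(
    {HOSTNAME_ANNOTATION: "mydomain.com"}, "203.0.113.10"
)
```

In this model, `is_short_circuited(without_annotation)` is true and `is_short_circuited(with_annotation)` is false, which is the difference the annotation is meant to produce.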

I don't immediately see how the annotation / the related Kubernetes limitation could be related to your cert-manager problem: unless your setup is somehow specific / unusual, cert-manager should just talk to the API server and possibly public endpoints (e.g., to get certificates renewed). Neither should require routing through pods via a managed LB. I'm wondering if adding the annotation had some kind of side effect that addressed your specific issue, but wasn't directly tied to the technical functionality in CCM described above.
If you still have data from when cert-manager failed for you (e.g., logs, error messages, events) that could be helpful in doing root cause analysis. Otherwise, you could try to force a certificate renewal on a test setup and troubleshoot based on that.

@charlesg99
Author

Thanks for the answer. All I know is that when it failed, the http01 ACME challenge was accessible from outside the cluster but cert-manager failed to resolve it. Same issue as this.

I have many domains that all use the same automated deployment (same ingress resource for all of them; the only change is the domain, which is set through helm values) and never had this certificate issue before. After all the basic checks (DNS and such), I ended up reading that this seems to be a common issue with an external load balancer in front of the cluster's ingress.

cert-manager/cert-manager#3238 (comment)
kubernetes/kubernetes#66607 (comment)
digitalocean/Kubernetes-Starter-Kit-Developers#205 (comment)

As I said, I don't mind adding the annotation with one of the domains that resolves to the loadbalancer, but I just want to make sure that if I set "mydomain.com", it won't prevent the certificate renewal of "myotherdomain.com" down the line.

Since I added the annotation with domain "X" yesterday, I installed a different domain "Y" and its certificate generated correctly so it doesn't seem to affect it 🤞. Would be nice to have a confirmation though and I would have liked to get it from reading the documentation :)
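For reference, a minimal sketch of the setup described above: a LoadBalancer-typed Service for the ingress controller carrying the annotation with one of the domains. It is written in Python and emits JSON, which `kubectl apply -f` accepts. The helper name `ingress_service_manifest`, the Service name and namespace, the selector labels, and the ports are all assumptions for illustration and will differ per installation.

```python
import json

HOSTNAME_ANNOTATION = "service.beta.kubernetes.io/do-loadbalancer-hostname"


def ingress_service_manifest(name, namespace, hostname, selector):
    """Build a LoadBalancer Service manifest carrying the hostname annotation.

    All argument values are supplied by the caller; nothing here is specific
    to a particular ingress controller.
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {HOSTNAME_ANNOTATION: hostname},
        },
        "spec": {
            "type": "LoadBalancer",
            "selector": selector,
            "ports": [
                {"name": "http", "port": 80, "targetPort": 80},
                {"name": "https", "port": 443, "targetPort": 443},
            ],
        },
    }


# Hypothetical names: adjust to match the ingress controller in the cluster.
manifest = ingress_service_manifest(
    name="ingress-nginx-controller",
    namespace="ingress-nginx",
    hostname="mydomain.com",
    selector={"app.kubernetes.io/name": "ingress-nginx"},
)

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
```

Note that the annotation takes a single hostname, here "mydomain.com", even when several domains such as "myotherdomain.com" resolve to the same load balancer.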

@timoreimann
Collaborator

(Apologies, I thought I had responded to this one some time ago but apparently I hadn't 🤦 )

AFAIU, the comments from the first two linked issues seem more related to the routing problem that the hostname annotation is supposed to address.

The third one does go more into the problem you're facing. I'm no expert in Let's Encrypt and the http01 challenge in particular, but what I could imagine happening is that cert-manager is actually executing the self-check. If that's the case and the domain name is pointing at the DO LB, then the in-cluster requests would be bypassing the LB and thereby possibly breaking the LE / validation flow? That'd explain why the hostname annotation fixes the issue as it forces requests to leave the cluster.
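As a rough illustration of what such a self-check amounts to, here is a hedged Python sketch. It is not cert-manager's implementation; the function names are hypothetical. It only relies on the standard http01 convention that the challenge is served at `/.well-known/acme-challenge/<token>` over plain HTTP and that the response body must equal the expected key authorization.

```python
from urllib.request import urlopen


def challenge_url(domain, token):
    """Return the well-known HTTP-01 challenge URL for a domain and token."""
    return f"http://{domain}/.well-known/acme-challenge/{token}"


def default_fetch(url):
    """Fetch a URL and return (status, body). Raises OSError on failure."""
    with urlopen(url, timeout=10) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


def self_check(domain, token, expected_key_auth, fetch=default_fetch):
    """Return (ok, reason) for an HTTP-01 style reachability check.

    `fetch` is injectable so the check can be exercised without network
    access; by default it performs a real HTTP GET.
    """
    try:
        status, body = fetch(challenge_url(domain, token))
    except OSError as exc:  # covers URLError, HTTPError, and timeouts
        return False, f"request failed: {exc}"
    if status != 200:
        return False, f"unexpected status {status}"
    if body.strip() != expected_key_auth:
        return False, "response body does not match the expected key authorization"
    return True, "ok"
```

Running an equivalent request from a pod inside the cluster and again from outside would show whether the in-cluster routing path is the part that fails.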

I don't think it'd affect any other domain you might have given that the annotation should only impact the routing path from a pod towards the LB IP address. If anything, I'd argue that setting it should lead to a more "natural" behavior for most use cases as requests take a full roundtrip.

Let me know if that makes sense to you.
