Our minio tenant pods fail regularly whenever the internal certificate is renewed by the operator. It appears that minio doesn't pick up changes made to the CA files when the certs are renewed.
In our specific deployment we've combined the requestAutoCert: true setting with an externalCertSecret, a Let's Encrypt certificate issued by an external cert-manager instance, which we use to achieve E2E encryption via a passthrough Ingress object. I'm not sure if this contributes to the issue.
When curling the endpoint manually, you can see that it's already serving the renewed certificate (valid from Mar 28 16:51:34 2024 GMT):
$ curl -vik https://miniotenant-ss-0-1.miniotenant-hl.cld-1225.svc.cluster.local:9000/
* Trying 10.129.4.20...
* TCP_NODELAY set
* Connected to miniotenant-ss-0-1.miniotenant-hl.cld-1225.svc.cluster.local (10.129.4.20) port 9000 (#0)
<...>
* Server certificate:
* subject: O=system:nodes; CN=system:node:*.miniotenant-hl.cld-1225.svc.cluster.local
* start date: Mar 28 16:51:34 2024 GMT
* expire date: Apr 15 23:29:11 2024 GMT
* issuer: CN=kube-csr-signer_@1710631750
* <...>
which matches the renewed CA within the container:
I suspect this is the CA that minio should be using. However, minio still seems to verify requests to its cluster members against an outdated CA held in memory, as the error message tls: failed to verify certificate: x509: certificate signed by unknown authority suggests. It looks like the renewed certificate file itself was re-read from disk, but the CA file wasn't.
Of course, the /tmp/certs/CAs/ directory also contains the CA certificate of the Let's Encrypt R3 authority (externalCertSecret), but that will only become an issue when that specific certificate is renewed in a couple of weeks, so we'll ignore it for now.
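To illustrate the suspected gap, here is a minimal sketch of how a long-running process could notice that the files in a CA directory have changed and rebuild its trust store. This is not MinIO's implementation; `ca_dir_digest`, `CAReloader` and the `reload` callback are hypothetical names introduced only for this example.

```python
import hashlib
from pathlib import Path


def ca_dir_digest(ca_dir):
    """Return a SHA-256 digest over the names and contents of all files in ca_dir."""
    digest = hashlib.sha256()
    for path in sorted(Path(ca_dir).iterdir()):
        if path.is_file():
            digest.update(path.name.encode())
            digest.update(b"\0")
            digest.update(path.read_bytes())
            digest.update(b"\0")
    return digest.hexdigest()


class CAReloader:
    """Hypothetical helper: calls `reload(ca_dir)` when the CA directory changes."""

    def __init__(self, ca_dir, reload):
        self.ca_dir = ca_dir
        self.reload = reload
        # Remember what the directory looked like when the process started.
        self.last_digest = ca_dir_digest(ca_dir)

    def poll(self):
        """Re-hash the directory; trigger a reload and return True if it changed."""
        current = ca_dir_digest(self.ca_dir)
        if current == self.last_digest:
            return False
        self.last_digest = current
        self.reload(self.ca_dir)
        return True
```

A process that only reloads the leaf certificate but never runs something like `poll()` for the CA directory would show exactly the symptom described here: the new certificate is served, while peers are still verified against the old CA.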
Expected Behavior
Minio should automatically process certificate renewals and all tenant pods must process changes made to the CA files as well.
Current Behavior
This issue can be resolved temporarily (until next certificate renewal) by restarting the minio service with the mc CLI:
mc admin service restart <tenant>
This proves that this isn't an issue with the certificates themselves, but with minio not processing the renewed certs correctly, as this command doesn't alter the certificate files at all. They are probably just re-read from disk when the service is restarted.
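Until the underlying problem is fixed, the restart could be scripted. The sketch below is an assumption-laden example, not an official procedure: it compares the certificate's start date (as printed by curl) with the time of the last restart and only then runs the `mc admin service restart` command mentioned above. The function names and the `run` parameter are hypothetical.

```python
import ssl
import subprocess


def renewed_since(cert_not_before, last_restart_epoch):
    """True if the certificate became valid after the last service restart."""
    # cert_time_to_seconds parses dates like "Mar 28 16:51:34 2024 GMT".
    return ssl.cert_time_to_seconds(cert_not_before) > last_restart_epoch


def restart_command(alias):
    """Build the restart command from the issue for the given mc alias."""
    return ["mc", "admin", "service", "restart", alias]


def restart_if_renewed(cert_not_before, last_restart_epoch, alias, run=subprocess.run):
    """Restart the tenant only when a renewal happened after the last restart."""
    if not renewed_since(cert_not_before, last_restart_epoch):
        return False
    run(restart_command(alias), check=True)
    return True
```

How the start date and the last restart time are obtained (for example from curl output and pod start times) is left open here, since that depends on the environment.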
Possible Solution
n.a.
Steps to Reproduce (for bugs)
1. Wait until the certificate and CA are renewed by the Operator.
2. Check the log for requests to other cluster members. They should fail with the error message mentioned above, and the Tenant object's /status/healthMessage is Service Unavailable.
Context
Regression
Your Environment
Version used (minio-operator): minio-operator v5.0.13
Environment name and version (e.g. kubernetes v1.17.2): Red Hat OpenShift v4.12
Server type and version: minio/minio:RELEASE.2024-02-17T01-15-57Z
Operating System and version (uname -a): n.a.
Link to your deployment file: see "Possible Solution" above
Could you please try disabling it and use our example, or something similar? This will still require manual steps while performing this process, but at least you won't rely on Operator certificates anymore, only on cert-manager. Once we have a working solution for the rotation, this shouldn't cause any further problems.
Also, if you get a chance, please try the ideas from the following PRs and let us know if they work for you in OpenShift:
The reason we combine the externalCertSecret (issued by a cert-manager instance) with requestAutoCert (issued by the MinIO Operator) is that you cannot request certificates for cluster-internal domain names (.svc.cluster.local) via cert-manager when it uses a public ACME issuer such as Let's Encrypt. The order fails with:
"The certificate request has failed to complete and will be retried: Failed to wait for order resource "minio-console-cld-1225.apps.<openshift-cluster>.com-kwmf5-4021417717" to become ready: order is in "errored" state: Failed to create Order: 400 urn:ietf:params:acme:error:rejectedIdentifier: Error creating new order :: Cannot issue for "*.cld-1225.svc.cluster.local": Domain name does not end with a valid public suffix (TLD) (and 2 more problems. Refer to sub-problems for more information.); subproblems:\n\turn:ietf:params:acme:error:malformed: [dns: *.cld-1225.svc.cluster.local] Error creating new order :: Domain name does not end with a valid public suffix (TLD)\n\turn:ietf:params:acme:error:malformed: [dns: *.minio.cld-1225.svc.cluster.local] Error creating new order :: Domain name does not end with a valid public suffix (TLD)\n\turn:ietf:params:acme:error:malformed: [dns: *.miniotenant-hl.cld-1225.svc.cluster.local] Error creating new order :: Domain name does not end with a valid public suffix (TLD)"
So in order to get valid certificates for inter-pod communication (meaning between multiple MinIO cluster member pods) we need requestAutoCert: true.
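The split between the two certificate sources can be sketched as a simple suffix check on the requested DNS names. This is only an illustration: `split_dns_names` is a hypothetical helper, `minio.example.com` is a placeholder public name, and a plain suffix match is a simplification of the public-suffix validation an ACME CA actually performs.

```python
def split_dns_names(names, internal_suffix=".svc.cluster.local"):
    """Separate cluster-internal names from names a public CA could issue for."""
    internal, public = [], []
    for name in names:
        # Ignore case and a trailing dot when comparing the suffix.
        normalized = name.rstrip(".").lower()
        if normalized.endswith(internal_suffix):
            internal.append(name)  # needs an in-cluster CA (requestAutoCert)
        else:
            public.append(name)    # candidate for the cert-manager certificate
    return internal, public


names = [
    "*.miniotenant-hl.cld-1225.svc.cluster.local",
    "minio.example.com",  # placeholder, not from the deployment
    "*.cld-1225.svc.cluster.local",
]
internal, public = split_dns_names(names)
```

Here the two `.svc.cluster.local` wildcards end up in `internal` and would have to be covered by the Operator-issued certificate, while only the public name is suitable for Let's Encrypt.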