Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Add build-in TLS support #3368

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

pjbgf
Copy link
Member

@pjbgf pjbgf commented Dec 4, 2022

Based on the discussion instigated by @rashedkvm.

More context can be found on the recordings from Sept 8th Flux Dev Meeting.

@pjbgf pjbgf added area/rfc Feature request proposals in the RFC format area/security Security related issues and pull requests labels Dec 4, 2022
@pjbgf pjbgf requested a review from rashedkvm December 4, 2022 09:34
Comment on lines +123 to +126
The TLS support could be expanded to the Notification controller webhooks endpoint
to enable secure in-cluster communications.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main issue I see with supporting this is the hybrid scenario in which we need to support in-cluster and external (via ingress). In such scenarios, the ingress would need to trust the CA used by Flux, which can be problematic from a security stand-point.

@pjbgf pjbgf changed the title rfc: Add build-in TLS support draft RFC-NNNN: Add build-in TLS support draft Dec 4, 2022
@pjbgf pjbgf changed the title RFC-NNNN: Add build-in TLS support draft RFC-NNNN: Add build-in TLS support Dec 4, 2022
@rashedkvm
Copy link
Member

Thank you @pjbgf. Will review shortly.

Copy link
Member

@rashedkvm rashedkvm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pjbgf thank you for restarting the encrypted communication discussion for Flux components. Am I understanding this correctly that this feature should not be used in a service mesh setup?

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
@pjbgf
Copy link
Member Author

pjbgf commented Dec 5, 2022

@pjbgf thank you for restarting the encrypted communication discussion for Flux components. Am I understanding this correctly that this feature should not be used in a service mesh setup?

@rashedkvm The use of CNI or Service Mesh are alternatives, which could still be used regardless of the built-in support. However, I cannot see why users would use a Service Mesh to protect Flux inter-component comms when the built-in TLS is enabled, as IMO they are mutually exclusive. Although I could see cases in which some deployments may have CNI encryption + Flux built-in TLS.

Comment on lines +120 to +121
Controllers will watch the TLS secret for changes, once detected it will trigger a
pod restart to refresh the certificates being used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/certwatcher/certwatcher.go and reload the cert without a pod restart. Especially for source-controller, a restart is very expensive and should be avoided.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea. This would cause all mid-flight requests to be dropped, which is a minor impact compared to rebuilding the storage. Besides, the mid-fight requests would be automatically retried within seconds anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, definitely that will be a considerable saving over the expensive restarts.

@rashedkvm
Copy link
Member

rashedkvm commented Dec 5, 2022

@rashedkvm The use of CNI or Service Mesh are alternatives, which could still be used regardless of the built-in support. However, I cannot see why users would use a Service Mesh to protect Flux inter-component comms when the built-in TLS is enabled, as IMO they are mutually exclusive.

that is my understanding too. Thank you for the explanation!

@pjbgf pjbgf marked this pull request as ready for review December 6, 2022 10:48
- Change how CA trust is established across Flux containers.
- Define TLS rotation mechanisms.

## Proposal
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the proposed architecture be compatible with SPIFFE/SPIRE? I can envisage a scenario where I would like to manage identity using SPIFFE SVIDs delivered via SPIRE rather than via Kubernetes TLS secrets. In practice I think this would mean adding a unix domain socket flag to the controllers with an accompanying listener to receive SVID bundles from the SPIRE agent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question @phoban01. I see the built-in TLS support as orthogonal to the support of SPIFFE/SPIRE. The idea here is that users would be able to provision Flux with secure comms without requiring any additional components.

On that point, some service meshes should be able to implement that (mTLS based on SPIFFE/SPIRE) transparently on top of Flux, without any changes required to the source code. However, if there are good use cases for built-in support, that should be a RFC on its own right.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spire support is nice, but will require for the Spire itself to be managed outside Flux. I think using native PKI is a good way of avoiding this as mentioned in the Alternatives section.

@pjbgf pjbgf changed the title RFC-NNNN: Add build-in TLS support [RFC] Add build-in TLS support Dec 7, 2022
@stefanprodan stefanprodan marked this pull request as draft December 9, 2022 09:35
## Proposal

Controllers hosting web servers will introduce a new flag to configure the
[TLS secret]: `--tls-secret`. The Secret type must be `kubernetes.io/tls`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why take dependency on a specific K8s secret hosting approach directly in the components? Can you please use a file reference instead. Then it will be easier to support other secret hosting approaches such as cloud key vaults.
For your information in our enterprise environments hosting any kind of secrets as K8s secrets is prohibited.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are you using Flux then? All our APIs rely on Kubernetes Secrets, you can’t connect to Git/OCI/Helm repositories unless you store the auth in a secret.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hosting any kind of secrets as K8s secrets is prohibited

So you're using Secrets Store CSI Driver and mounting them directly into containers? I see why this might be a reasonable policy but as Stefan said, Flux doesn't work without Secrets at the moment and likely never will. Otherwise you'd have to mount all the Secrets any user might ever need into the Flux controller containers.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@makkes - correct, we use Secret CSI drivers, or you can create a wrapper/starter script to launch the binaries and this starter script pulls the secrets from key vaults and places them on tempfs volumes.

@stefanprodan - based on the documentation I see that Flux can use cloud identities to access artifact storages. We can still use K8s secrets as a config store for not secret material - such as public keys, application IDs, etc. But we cannot use K8s secrets for storing secret material like private keys.
Most Kubernetes components consume secrets as file references - then if someone wants to use K8s secrets to store secrets they can always mount them as files in the volume.

Consuming secret material directly from K8s secret resources is a gap that must be fixed for Flux to be used in different deployments that rely on consuming secrets as files.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consuming secrets as files breaks Flux multi-tenancy. We rely on Kubernetes RBAC to isolate tenant, reading files from the container FS is not subject to RBAC while reading secrets from the API is. Have you considered enabling encryption at rest for etcd?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be confused here. But wouldn't the TLS k8s secret also mounted as a file in the pod. The certwatcher project suggested allows file based certs only.
https://github.com/kubernetes-sigs/controller-runtime/blob/a784ee78d04bdf216006adfbc313328a0aded88a/pkg/certwatcher/certwatcher.go#L36

Thus not sure how giving an option for a TLS on file-based key inside the pod can be less secure as even with secrets we need to do the same.

I understand going default with secret to not add any additional dependencies for the default secure path. But if the customer is willing to do the work to use a Secret CSI drivers to mount a file directly, shouldn't that be allowed.

Copy link
Member

@makkes makkes Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Secrets aren't mounted into the Flux Pods but read into memory through the Kubernetes API at runtime.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add a flag for pointing to a cert on disk, then it's up to the Flux admin to figure out the mounts and so on. With this flag, we'll read the cert at startup only, the Flux admin can bump some pod annotation in Git to restart the controllers to reload the cert. This flag e.g. --tls-file should be mutually exclusive with --tls-secret, if both are set, the controllers will panic and crash loop.

Copy link

@anagg929 anagg929 Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks , this should resolve the cases where admin want to mount the cert on their own.
A bit of query, why not use use the certwatcher on the --tls-file also to avoid the restart ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use use the certwatcher on the --tls-file also to avoid the restart ?

Ah I haven't looked into that, if it supports it, then sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/rfc Feature request proposals in the RFC format area/security Security related issues and pull requests
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

None yet

8 participants