
Workload Cluster API #31

Open
johnbelamaric wants to merge 1 commit into main

Conversation

johnbelamaric
Member

Nephio needs APIs to enable provisioning and management of workload clusters and the workloads we deliver to them. Several different APIs are needed for provisioning and consuming clusters. This PR begins defining some of those APIs.

@nephio-prow nephio-prow bot requested review from henderiw and s3wong May 22, 2023 21:54
@nephio-prow
Contributor

nephio-prow bot commented May 22, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from johnbelamaric by writing /assign @johnbelamaric in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@johnbelamaric
Member Author

cc @henderiw

Just getting started here. I am building from the bottom up, and then we can think about refactoring as needed. But I think it's easier to discuss with real code.

This first CRD would be used to "configure" the nephio-workload-cluster package, pointing it at different upstreams and a downstream management cluster. I think it could be useful if a particular deployment wants to use customized versions of the Nephio cluster packages, for example. We would include one of these with the default values in the nephio-workload-cluster package, as a config injection point.

Not sure it's worth it, though, as we already have one level of indirection through Porch repo names. Still thinking on this.
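
To make the discussion concrete, here is a very rough sketch of what such a type could look like; the type and field names are just my illustration, not what this PR actually defines:

```go
// Hypothetical sketch only: these type and field names are illustrative,
// not the actual API proposed in this PR.
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// WorkloadClusterConfig "configures" the nephio-workload-cluster package,
// pointing it at upstream package repositories and the downstream
// management cluster repository.
type WorkloadClusterConfig struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec WorkloadClusterConfigSpec `json:"spec,omitempty"`
}

type WorkloadClusterConfigSpec struct {
	// UpstreamRepositories names the Porch repositories that hold the
	// Nephio cluster packages (for example, a customized fork).
	UpstreamRepositories []string `json:"upstreamRepositories,omitempty"`

	// ManagementRepository names the downstream management cluster
	// repository that specialized packages are pushed to.
	ManagementRepository string `json:"managementRepository,omitempty"`
}
```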

/hold

@johnbelamaric
Member Author

We can add another resource that accepts more or less these inputs (except that we may also want some concept of a NodePool):

https://github.com/nephio-project/nephio-example-packages/tree/main/cluster-capi-kind#description

But we should keep it to a minimum, and basically say that anything beyond the very basics needs to be done by tweaking the provider-specific resources in the package itself.

We want just enough that we can make basic tweaks in a simple UI panel, or via a simple injected resource. Anything fancy we punt on and have users do in the package (and those provider-specific resources would be editable in all the same ways as our resource).
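
For illustration only, a deliberately minimal sketch of what that resource could carry; none of these field names come from this PR:

```go
// Illustrative sketch only: a deliberately minimal set of provisioning
// inputs; anything beyond this would be done by editing the
// provider-specific resources inside the package itself.
package v1alpha1

type ClusterSpec struct {
	// ClusterClass selects the provider-specific cluster class to use.
	ClusterClass string `json:"clusterClass,omitempty"`

	// KubernetesVersion of the control plane.
	KubernetesVersion string `json:"kubernetesVersion,omitempty"`

	// NodePools captures only the most basic worker-node knobs.
	NodePools []NodePool `json:"nodePools,omitempty"`
}

type NodePool struct {
	Name string `json:"name"`

	// Replicas is the number of worker nodes in this pool.
	Replicas int32 `json:"replicas,omitempty"`

	// MachineType is the provider-specific machine flavor.
	MachineType string `json:"machineType,omitempty"`
}
```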

@johnbelamaric
Member Author

And then possibly a third resource (if needed) that tells us how to put workloads on the cluster (i.e., how to find its repo). We want to differentiate this one, because we may not have provisioned the cluster! We could probably do it via the status of this resource, and allow an empty spec (for example, for a discovered cluster). Requires more thought.
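
A hedged sketch of that shape, again with invented names, assuming the repo reference lives in status so an unprovisioned (discovered) cluster can have an empty spec:

```go
// Illustrative sketch only: the spec can stay empty (for example, for a
// cluster Nephio did not provision), with the repository reference
// surfaced through status.
package v1alpha1

// WorkloadClusterSpec is intentionally minimal and may be empty for a
// discovered cluster.
type WorkloadClusterSpec struct {
}

type WorkloadClusterStatus struct {
	// Repository is the repository where workloads for this cluster are
	// delivered.
	Repository string `json:"repository,omitempty"`

	// Ready indicates the cluster is reachable and accepting workloads.
	Ready bool `json:"ready,omitempty"`
}
```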

@henderiw
Contributor

On the first part, the packages: what I saw is that these packages have dependencies, which you can actually discover from the RBAC rules. E.g. capi main uses cert-manager. So we could build an installer controller that handles the installation based on those references. Here is the thought: let's say we provide a catalog of packages, and the user selects a bunch and says deploy. We can use this first as an inventory of what is installed, and second the deployer resolves the dependency map to install them using Porch.
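
Purely as an illustration of the dependency idea (the package names and function below are made up, not an existing controller), the deployer could walk a declared dependency map and produce an install order to apply via Porch:

```go
// Illustrative sketch only: resolve a declared dependency map (for example,
// the capi package depending on cert-manager) into an install order that a
// deployer could then apply via Porch. All names here are made up.
package main

import "fmt"

// deps maps each catalog package to the packages it requires first.
var deps = map[string][]string{
	"cluster-capi": {"cert-manager"},
	"cert-manager": {},
}

// installOrder returns pkg's dependencies (depth-first) followed by pkg,
// skipping anything already visited.
func installOrder(pkg string, seen map[string]bool) []string {
	if seen[pkg] {
		return nil
	}
	seen[pkg] = true
	var order []string
	for _, dep := range deps[pkg] {
		order = append(order, installOrder(dep, seen)...)
	}
	return append(order, pkg)
}

func main() {
	fmt.Println(installOrder("cluster-capi", map[string]bool{}))
	// Prints: [cert-manager cluster-capi]
}
```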

@henderiw
Contributor

When I looked at clusters some time ago, I came up with the following aspects that such an API would define:

I split the CR into 2 parts:

  • cluster: which is mainly the control plane
  • node pool: which is mainly the worker nodes

For the control plane, these are the elements I was thinking of; this came from a survey across different cloud providers and on-premises scenarios:

  • class -> pointing to a provider
  • region/location info
  • control plane version
  • networking: CIDR for pod/services
  • automation: autoscaling
  • security: private/public access
  • metadata: labeling
  • features: this is mainly the packages you already have which is more advanced

Node pool:

  • networking: CNI
  • automation: autoscaling
  • security: private/public access
  • metadata: labeling/affinity/etc

Just sharing what I assembled; a rough sketch of how these aspects could map to types is below.
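
Rough sketch only, with field names that are my guesses rather than a concrete proposal:

```go
// Rough sketch only: mapping the aspects above onto types; the field names
// are guesses, not a concrete proposal.
package v1alpha1

type ControlPlaneSpec struct {
	Class         string            `json:"class,omitempty"`         // points to a provider
	Region        string            `json:"region,omitempty"`        // region/location info
	Version       string            `json:"version,omitempty"`       // control plane version
	PodCIDR       string            `json:"podCIDR,omitempty"`       // networking
	ServiceCIDR   string            `json:"serviceCIDR,omitempty"`   // networking
	Autoscaling   bool              `json:"autoscaling,omitempty"`   // automation
	PrivateAccess bool              `json:"privateAccess,omitempty"` // security
	Labels        map[string]string `json:"labels,omitempty"`        // metadata
	Features      []string          `json:"features,omitempty"`      // packages to layer on
}

type NodePoolSpec struct {
	CNI           string            `json:"cni,omitempty"`           // networking
	Autoscaling   bool              `json:"autoscaling,omitempty"`   // automation
	PrivateAccess bool              `json:"privateAccess,omitempty"` // security
	Labels        map[string]string `json:"labels,omitempty"`        // metadata
	Affinity      map[string]string `json:"affinity,omitempty"`      // scheduling hints
}
```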

@henderiw
Contributor

Besides this, it is important to provide a feedback mechanism for the status of the cluster. It would be nice if other controllers could use this as a generic watch, rather than having to look at all the different implementations.
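
One way this could look, sketched under the assumption that the common resource carries standard Kubernetes conditions (not something this PR defines yet):

```go
// Illustrative sketch only: if every implementation surfaced standard
// metav1 Conditions on one common cluster resource, consuming controllers
// could watch that resource generically instead of knowing each provider.
package workload

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// ClusterReady reports whether the common "Ready" condition is True.
func ClusterReady(conditions []metav1.Condition) bool {
	return meta.IsStatusConditionTrue(conditions, "Ready")
}
```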
