Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Provide a truly lazy restmapper #2116

Merged
merged 2 commits into from Feb 6, 2023

Conversation

Fedosin
Copy link
Contributor

@Fedosin Fedosin commented Jan 2, 2023

This PR adds a rest mapper that will lazily query the provided client for discovery information to do REST mappings. This allows you to significantly reduce the number of requests to the API server, as currently we do a request for each registered GroupVersion at startup. It also makes the startup time of the controller more predictable and independent of the number of CRDs installed in the system.

Fixes: #1603

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 2, 2023
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 2, 2023
@Fedosin
Copy link
Contributor Author

Fedosin commented Jan 2, 2023

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 2, 2023
Copy link
Contributor

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a worthwile improvement to the controller-runtime project, with large numbers of CRDs, we've seen rest mapping take a long time and cache large amounts of data that isn't really necessary.

I think there are a few improvements that could be made to make this a little cleaner, and one logical optimisation, see my comments and let me know what you think

pkg/client/apiutil/lazyrestmapper.go Outdated Show resolved Hide resolved
pkg/client/apiutil/lazyrestmapper.go Outdated Show resolved Hide resolved
pkg/client/apiutil/lazyrestmapper.go Outdated Show resolved Hide resolved
pkg/client/apiutil/lazyrestmapper.go Outdated Show resolved Hide resolved
pkg/client/apiutil/lazyrestmapper.go Outdated Show resolved Hide resolved
Copy link
Member

@lobziik lobziik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mainly agree with @JoelSpeed comments. Have just an extra nit about docstring for filteredServerGroupsAndResources.

pkg/client/apiutil/lazyrestmapper.go Outdated Show resolved Hide resolved
pkg/client/apiutil/lazyrestmapper.go Outdated Show resolved Hide resolved
@Fedosin Fedosin force-pushed the lazy_loading branch 2 times, most recently from d06b559 to 2967f3b Compare January 4, 2023 15:42
@Fedosin Fedosin force-pushed the lazy_loading branch 3 times, most recently from 7a302ea to cc59ac4 Compare January 5, 2023 15:43
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 9, 2023
@Fedosin
Copy link
Contributor Author

Fedosin commented Jan 9, 2023

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 9, 2023
Comment on lines 31 to 33
// LazyRESTMapper is a RESTMapper that will lazily query the provided
// client for discovery information to do REST mappings.
type LazyRESTMapper struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious if we though actually about making this RESTMapper the default implementation, truthfully it sounds like that the benefit of making less calls, and only making them when needed are something that a lot of users might want.

If we decide to go down this path, we should however provide a feature gate of sorts that enables this as an experiment first (opt in) -> on by default -> then enable it and remove the gate.

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey! Initially I didn't plan to make it default but rather to use it either for clusters with a slow API server, or for very crd-pumped ones (ahem, OpenShift), when we have to do more than a hundred requests to initialize the mapper.

But after multiple optimizations I could say there is no worsening relative to the current implementation, only improvements. So, it's reasonable to make it default after some testing.

Do you have an example of such feature gating in this project, so I can mark this feature as experimental?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, we test this implementation with OpenShift here: openshift/cluster-cloud-controller-manager-operator#213

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd need to introduce a feature gate, which we currently don't have support for. For this PR, we can focus immediately to do some sort of split-logic between how things were done in the default RESTMapper, and this way.

If we feel comfortable though and have enough testing coverage all around, we can mark this a breaking change (⚠️) and add extensive release notes and ask for feedback.

@alvaroaleman wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to simply add this restmapper for now and backport it into 0.14. 0.15 onwards (or later, depending on if we find issues) we can simply make it the default but leave the option to provide a custom one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have the implementation of this PR inlined with the current one, but all of it behind a feature gate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you featuregate this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new DynamicRESTMapperOption that sets the flag could work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like: WithExperimentalLazyMapper

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

@alvaroaleman
Copy link
Member

Sorry, this is on my queue but I am a bit swamped right now

Copy link
Member

@alvaroaleman alvaroaleman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fedosin one comment to optimize this even further, other than that this looks good to me. Thanks a ton for your work and sorry it took so long.

pkg/client/apiutil/lazyrestmapper/lazyrestmapper.go Outdated Show resolved Hide resolved
Comment on lines 31 to 33
// LazyRESTMapper is a RESTMapper that will lazily query the provided
// client for discovery information to do REST mappings.
type LazyRESTMapper struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging the two for next release sounds good to me.

I think the new lazy restmapper does everything the current dynamic one does btw. The current dynamic one will on a resource not found error reload the entire restmapping (with some ratelimiting). Since the new one will simply try to load the restmapping on-demand if it doesn't have it cached, that should solve the same problem just as well.

@alvaroaleman
Copy link
Member

/cherrypick release-0.14

@k8s-infra-cherrypick-robot

@alvaroaleman: once the present PR merges, I will cherry-pick it on top of release-0.14 in a new PR and assign it to you.

In response to this:

/cherrypick release-0.14

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

This commit adds a rest mapper that will lazily query the provided
client for discovery information to do REST mappings.
Instead of creating the instance of the mapper directly, we will use
WithExperimentalLazyMapper option for Dynamic Restmapper.
@Fedosin
Copy link
Contributor Author

Fedosin commented Feb 6, 2023

@alvaroaleman @vincepri hey folks! I updated this PR as you asked - now there is a "featuregate" called WithExperimentalLazyMapper that enables the Lazy. PTAL :)

Copy link
Member

@alvaroaleman alvaroaleman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks you so much!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 6, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, Fedosin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 6, 2023
@k8s-ci-robot k8s-ci-robot merged commit f127c11 into kubernetes-sigs:master Feb 6, 2023
@k8s-infra-cherrypick-robot

@alvaroaleman: new pull request created: #2179

In response to this:

/cherrypick release-0.14

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

func (m *lazyRESTMapper) findAPIGroupByName(groupName string) (metav1.APIGroup, error) {
// Ensure that required info about existing API groups is received and stored in the mapper.
// It will make 2 API calls to /api and /apis, but only once.
if m.apiGroups == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry very late :)

Q: We only ever seem to write m.apiGroups here. What if the apiGroups change after that? Or do we assume that's not relevant as the CRDs a controller uses must be deployed before the controller?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on the question above, as far as I can see the apiGroups is never refreshed which might cause issues for groups that were installed after the mapper gets created AND if someone calls RESTMapping without a version

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!
Q: does it make sense to track this in an issue and makes sure this has an answer before pushing further the adoption of this mapper?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, the code should work for 97% of uses within controller runtime; but the above is definitely an issue we should fix

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference this has been addressed in #2208 and released in v0.14.5

chlunde added a commit to sparebank1utvikling/provider-kubernetes that referenced this pull request May 8, 2023
provider-kubernetes initialized a new kubernetes client for each
reconcile. The REST mapper in controller-runtime used to fetch
information about every CRD in the cluster.

controller-runtime introduced a lazy restmapper which means we don't
have to introduce any complex caching to get a significant performance
boost in provider-kubernetes:

kubernetes-sigs/controller-runtime#2116

This seems to become the default in the next release:

kubernetes-sigs/controller-runtime#2296

But this is so significant that we want to update now:

* CPU reduced from constant throttling at 0.4 cores to 0.04 cores

* CloudWatch / EKS audit log costs reduced significantly (55% for our
  cluster, with a lot of provider-kubernetes resources)

Signed-off-by: Carl Henrik Lunde <chlunde@ifi.uio.no>
chlunde added a commit to sparebank1utvikling/provider-kubernetes that referenced this pull request May 8, 2023
provider-kubernetes initialized a new kubernetes client for each
reconcile. The REST mapper in controller-runtime used to fetch
information about every CRD in the cluster.

controller-runtime introduced a lazy restmapper which means we don't
have to introduce any complex caching to get a significant performance
boost in provider-kubernetes:

kubernetes-sigs/controller-runtime#2116

This seems to become the default in the next release:

kubernetes-sigs/controller-runtime#2296

But this is so significant that we want to update now:

* CPU reduced from constant throttling at 0.4 cores to 0.04 cores

* CloudWatch / EKS audit log costs reduced significantly (55% for our
  cluster, with a lot of provider-kubernetes resources)

Signed-off-by: Carl Henrik Lunde <chlunde@ifi.uio.no>
chlunde added a commit to sparebank1utvikling/provider-kubernetes that referenced this pull request Jun 26, 2023
provider-kubernetes initialized a new kubernetes client for each
reconcile. The REST mapper in controller-runtime used to fetch
information about every CRD in the cluster.

controller-runtime introduced a lazy restmapper which means we don't
have to introduce any complex caching to get a significant performance
boost in provider-kubernetes:

kubernetes-sigs/controller-runtime#2116

This seems to become the default in the next release:

kubernetes-sigs/controller-runtime#2296

But this is so significant that we want to update now:

* CPU reduced from constant throttling at 0.4 cores to 0.04 cores

* CloudWatch / EKS audit log costs reduced significantly (55% for our
  cluster, with a lot of provider-kubernetes resources)

Signed-off-by: Carl Henrik Lunde <chlunde@ifi.uio.no>
@Fedosin Fedosin deleted the lazy_loading branch July 15, 2023 18:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide a truly lazy restmapper
9 participants