
Enable a user-friendly means of the controller-manager self-identifying with a distinct ID with HTTP calls #1215

Open
akutz opened this issue Oct 12, 2020 · 17 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@akutz
Contributor

akutz commented Oct 12, 2020

Summary

This story tracks enabling a user-friendly option for telling controller-runtime to self-identify with a distinct ID when making HTTP calls to the Kubernetes API server.

Context

The actor that causes a raw Kubernetes event to occur is identified by a field named manager, for example:

{
    "type": "MODIFIED",
    "object": {
        "apiVersion": "vmoperator.vmware.com/v1alpha1",
        "kind": "VirtualMachine",
        "metadata": {
            "annotations": {
                "vsphere-cluster-module-group": "tkg7-cluster7-workers-0",
                "vsphere-tag": "WorkerVmVmAATag"
            },
            "creationTimestamp": "2020-10-08T11:41:50Z",
            "finalizers": [
                "virtualmachine.vmoperator.vmware.com"
            ],
            "generation": 19,
            "labels": {
                "capw.vmware.com/cluster.name": "tkg7-cluster7",
                "capw.vmware.com/cluster.role": "node"
            },
            "managedFields": [
                {
                    "apiVersion": "vmoperator.vmware.com/v1alpha1",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": { ... }
                    "manager": "manager",
                    "operation": "Update",
                    "time": "2020-10-08T14:11:49Z"
                }
            ],
            "name": "tkg7-cluster7-workers-whph9-6ff8cdb6fb-fq2b6",
            "namespace": "tkg7",
            ...
        },
        ...
}

These raw Kubernetes events are useful for root cause analysis (RCA). The field object.metadata.managedFields[0].manager identifies the actor that caused this MODIFIED event to occur, but its value here is simply manager, which is not exactly helpful in identifying the component in question.

The reason, if the reader will permit this author to make a broad guess, is that the majority of projects based on controller-runtime and client-go have adopted the pattern of naming the manager binary simply manager for a few reasons:

  • It simplifies Makefiles
  • It simplifies CI
  • It enables a standard pattern for managing/working on these projects 

So, good reasons. Unfortunately, this has the side effect that all such components send the same user agent to the API server when updating/patching resources. This happens because:

  • When controller-runtime creates a *rest.Config, it creates a user-agent string for HTTP calls
  • This user-agent string is based on a call to client-go's DefaultKubernetesUserAgent() string, which builds part of the string from os.Args[0]; in other words, the name of the controller manager binary.
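For illustration, here is a minimal, self-contained sketch of how that default string is assembled. It mirrors the format produced by client-go's unexported helper ("command/version (os/arch) kubernetes/commit"); the function names here are stand-ins, not the real client-go API:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
)

// buildUserAgent mirrors the format used by client-go's unexported helper:
// "command/version (os/arch) kubernetes/commit".
func buildUserAgent(command, version, goos, goarch, commit string) string {
	return fmt.Sprintf("%s/%s (%s/%s) kubernetes/%s",
		command, version, goos, goarch, commit)
}

func main() {
	// The first segment comes from the binary name, so every project whose
	// binary is called "manager" produces an identical user-agent prefix.
	command := filepath.Base(os.Args[0])
	fmt.Println(buildUserAgent(command, "v0.0.0", runtime.GOOS, runtime.GOARCH, "unknown"))
}
```

This is why renaming the binary (or overriding the first segment) is the only lever that distinguishes one controller-manager from another in the default string.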

Suggestions

Therefore, instead of updating the name of the manager binary for all projects based on controller-runtime, this issue tracks providing an easy way to assign the manager a unique identifier. Some suggested ideas are:

  • Standardizing on some environment variable (ex. MANAGER_NAME), flag (ex. --manager-name), or manager option used to override the name given to the user agent

  • Update sig.k8s.io/controller-runtime/pkg/manager.Options with a field like so:

    // Options are the arguments for creating a new Manager
    type Options struct {
        //
        // ...
        //
    
        // Name is an optional identifier to assign to the Manager.
        // This value is part of the User-Agent string used with HTTP calls to the API server,
        // and can help identify this controller-manager as the actor responsible for lifecycle
        // changes in resources, ex. ADDED, MODIFIED, DELETED.
        // Defaults to the name of the controller manager binary, ex. os.Args[0].
        Name string
    
        //
        // ...
        //
    }

Thoughts, concerns?

@vincepri
Member

cc @DirectXMan12 @alvaroaleman

I'm +1 for this proposal; we can also leverage #891 to achieve this

@alvaroaleman
Member

Providing a convenient way to set a custom user agent 👍

Wouldn't it make more sense to do this in a more granular fashion? E.g. if my manager contains multiple controllers, I most likely want a custom user agent per controller.

@akutz
Contributor Author

akutz commented Oct 12, 2020

Providing a convenient way to set a custom user agent 👍

Wouldn't it make more sense to do this in a more granular fashion? E.g. if my manager contains multiple controllers, I most likely want a custom user agent per controller.

100%! Except I'd really want a lot of things, optionally, per controller (like the ability to stop a specific controller). The question is whether to put a moratorium on introducing functionality at the controller-manager scope until we figure out how to create parity at the controller scope, or to address each of these on a case-by-case basis.

I haven't looked at 0.7.x yet, but does it enable specifying a distinct *rest.Config per controller? That would be one way to do it. Another way to introduce support for controller-specific identification could be something like this:

// sigs.k8s.io/controller-runtime/pkg/context
package context

// ClientName is used to set a key in the Go context. When this key is present, its
// value will be part of the user-agent sent along with HTTP calls to the API server.
type ClientName string

Or just use the name of the Controller (which could also be auto-injected into the context if the above key is not already present).

@alvaroaleman
Member

I haven't looked at 0.7.x yet, but does it enable specifying a distinct *rest.Config per controller?

No it doesn't. My suggestion would be something along the lines of:

  • Add an option to the Manager to set the user agent
  • Add a wrapper that clones a Client (and re-uses its restmapper to save on discovery) but with a changed *rest.Config
  • Extend the builder to have a WithUserAgent(ua string) method to allow using this conveniently
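The second step could be sketched roughly as follows, using a stand-in struct rather than the real client-go type (which offers rest.CopyConfig for exactly this kind of clone):

```go
package main

import "fmt"

// restConfig is a stand-in for client-go's rest.Config, reduced to the
// fields relevant to this sketch.
type restConfig struct {
	Host      string
	UserAgent string
}

// withUserAgent returns a copy of base with only the user agent replaced;
// the original config is left untouched, so the shared RESTMapper and
// discovery results can still be reused by the wrapping client.
func withUserAgent(base restConfig, ua string) restConfig {
	cp := base // value copy; real code would use rest.CopyConfig
	cp.UserAgent = ua
	return cp
}

func main() {
	base := restConfig{Host: "https://api.example.test", UserAgent: "manager/v0.0.0"}
	derived := withUserAgent(base, "virtualmachine-controller/v0.0.0")
	fmt.Println(base.UserAgent, "->", derived.UserAgent)
}
```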

@alvaroaleman
Member

Another way would be introducing support for controller-specific identification could be something like this:

Yeah, that's a nice idea as well. Do you think there are use cases for wanting to use more than one UserAgent within the same controller?

@akutz
Contributor Author

akutz commented Oct 12, 2020

  • Extend the builder to have a WithUserAgent(ua string) method to allow to use this conveniently

I'm with you all the way until this last part. I don't want to replace the entire user-agent, just the first parameter from:

// DefaultKubernetesUserAgent returns a User-Agent string built from static global vars.
func DefaultKubernetesUserAgent() string {
	return buildUserAgent(
		adjustCommand(os.Args[0]),
		adjustVersion(version.Get().GitVersion),
		gruntime.GOOS,
		gruntime.GOARCH,
		adjustCommit(version.Get().GitCommit))
}

I get we can replace the user-agent to achieve the same result, but I like the default pattern. The above function is in client-go (linked in the description). So perhaps we could introduce WithUserAgentID(userAgentID string) string instead, and simply duplicate the above function into controller-manager like so?

func WithUserAgentID(userAgentID string) string {
	if userAgentID == "" {
		userAgentID = adjustCommand(os.Args[0])
	}
	return buildUserAgent(
		userAgentID,
		adjustVersion(version.Get().GitVersion),
		gruntime.GOOS,
		gruntime.GOARCH,
		adjustCommit(version.Get().GitCommit))
}

(obviously there are some private functions from client-go, like buildUserAgent, etc., we'd have to copy as well, but you get the idea)

See what I mean?

@akutz
Contributor Author

akutz commented Oct 12, 2020

Do you think there are use cases for wanting to use more than one UserAgent within the same controller?

For better or worse it is what we do today. I view the problem of having controller-specific IDs as a different issue than creating a solution for an easily settable ID for the controller-manager. An iterative problem if you will.

@alvaroaleman
Copy link
Member

I'm with you all the way until this last part. I don't want to replace the entire user-agent, just the first parameter from:

Yeah, that sounds good as well

For better or worse it is what we do today. I view the problem of having controller-specific IDs as a different issue than creating a solution for an easily settable ID for the controller-manager. An iterative problem if you will.

Sure. But setting it on the cm today just means adjusting the *rest.Config. Adding an option on the cm makes it easier to discover but you can already do this today.

What is your use-case for wanting multiple user-agents in the same controller (not controller-manager)?

@akutz
Contributor Author

akutz commented Oct 12, 2020

Sure. But setting it on the cm today just means adjusting the *rest.Config. Adding an option on the cm makes it easier to discover but you can already do this today.

Not entirely accurate. Setting it today for users means supplying a custom user-agent. I am saying that I want to use what's there, for the most part, and just replace the one piece of information used on the server-side for that ID that's part of raw events.

It's the last part that I want to simplify for consumers of controller-runtime -- a built-in helper that takes a single ID and inserts it into the existing, default user-agent.

What is your use-case for wanting multiple user-agents in the same controller (not controller-manager)?

I don't? Perhaps we confused one another? I was simply responding to your initial response with ideas of how to do that. I was not necessarily wanting to do so. I'm not against either.

Again, what's important to me is making it easy to identify the actors behind the Kubernetes raw events. If we can increase the granularity to specific controllers, great. But right now there's no easy way to even identify controller-managers. I want to start there. If we go further, that's wonderful. I just don't want going further to block taking the first step.

@christopherhein
Member

christopherhein commented Oct 12, 2020

Thanks for @'ing me, @vincepri. I also have a tracking issue (#895) for, once #891 is done, starting to integrate the ComponentConfig standard for rest.Config, which currently uses https://github.com/kubernetes/component-base/blob/master/config/v1alpha1/types.go#L69. I wonder if we could advocate for having a standard mechanism for user agents added into that type. @stealthybox @mtaufen

(edit based on Slack convo w/ @akutz) Because it’s not actually the whole UA that should change, it might not make sense to put this in the ComponentConfig type.

@DirectXMan12
Contributor

FWIW, manager is not actually for identifying the source of the change, precisely -- it's for keeping track of which actor owns which particular field for server-side apply. As you noted above, it's not uncommon for each controller to want its own manager, because each controller is a separate "actor" that cares about different field sets.

When we're writing documentation, we should keep in mind that this is the actual reason for this field existing.

FWIW, we currently have support for custom managers, but it's opt-in:

client.Update(ctx, &myObj, client.FieldOwner("my-controller"))

This mechanism doesn't care about the controller (or whatever) you're running from, so if you really want to have different managers per controller, it's something you can do. That being said, there are implications to that -- SSA will treat those as separate agents for the purposes of doing merges.

I agree it'd be nice to have better defaults.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 11, 2021
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 10, 2021
@alvaroaleman
Member

/remove-lifecycle rotten
/lifecycle frozen

Addressing this would be great

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Feb 11, 2021
@akutz
Contributor Author

akutz commented Sep 2, 2021

@vincepri You were originally in favor of this, but it kind of got mired in a larger discussion. Is there any harm in implementing this as proposed and handling a client-go change separately?

@vincepri
Member

Works for me!

@m-messiah
Member

Hi! I created #2506 to summarize and implement what was discussed here. Please check and add any comments if something is missing.

Reading the thread, and in my personal opinion, I don't think it makes sense to have different controllers with different UserAgents, as all of them run in the same Manager and already have FieldOwner options, so the change is done for manager.New only, rather than in the builders.

8 participants