Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve logging #1826

Closed
sbueringer opened this issue Mar 4, 2022 · 11 comments · Fixed by #1827
Closed

Improve logging #1826

sbueringer opened this issue Mar 4, 2022 · 11 comments · Fixed by #1827

Comments

@sbueringer
Copy link
Member

sbueringer commented Mar 4, 2022

I would like to propose to adjust CR logging.

Today, controller-runtime logs have the following format: (when JSON logging is used)

{
  "ts": 1646398972336.9634,
  "caller": "cluster/cluster_controller.go:105",
  "controller": "cluster",
  "reconciler group": "cluster.x-k8s.io",
  "reconciler kind": "Cluster",
  "msg": "controller/cluster: Waiting for DockerCluster",
  "v": 0,
  "name": "capi-quickstart",
  "namespace": "default"
}

I would like to propose the following format instead:

{
  "ts": 1646398972336.9634,
  "caller": "cluster/cluster_controller.go:105",
  "controller": "cluster",
  "controllerGroup": "cluster.x-k8s.io",
  "controllerKind": "Cluster",
  "msg": "Waiting for DockerCluster",
  "v": 0,
  "name": "capi-quickstart",
  "namespace": "default"
  "cluster": {
    "name": "capi-quickstart",
    "namespace": "default"
  },
  "reconcileID": "56e044eb38fb76741ffe41cd83e7bed5"
}

The delta is:

-  "msg": "controller/cluster: Waiting for DockerCluster",
+  "msg": "Waiting for DockerCluster",
+  "cluster": {
+     "name": "capi-quickstart",
+     "namespace": "default"
+   },
-  "reconciler group": "cluster.x-k8s.io",
-  "reconciler kind": "Cluster",
+  "controllerGroup": "cluster.x-k8s.io",
+  "controllerKind": "Cluster",
+  "reconcileID": "56e044eb38fb76741ffe41cd83e7bed5",

The ideas behind that are:

Drop the message prefix

  • I think the message prefix in every message is not necessary and just makes the logs unnecessarily verbose.
  • With structured logging it's easily possible to just filter on / show the controller k/v pair instead.

Log "<kind>": "<namespace>/<name>" instead of "namespace" and "name"

Add a reconcileID

With a reconcileID it is very easy to identify logs belong to the same reconcile operation.

Change reconciler kind/ group to controllerKind and controllerGroup

Given that Reconciler only seems to be the "inner" reconciler provided by the user, I think we should rather use controller (e.g. we're setting the name and object on the NewControllerManagedBy Builder )

Also to align with "casing" best practices outlined here: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/migration-to-structured-logging.md#name-arguments

I've opened a PR to show how a potential solution would look like: ⚠ logging: align to Kubernetes structured logging, add reconcileID

P.S.

  • This is an alternative to: Enable per-reconcile logger customization #1811 where we considered making logging customizable.
  • I know I made a related change recently to just add the controller k/v pair. We discussed this a bit and now want to propose to align CR logging overall.
@sbueringer
Copy link
Member Author

@vincepri
Copy link
Member

vincepri commented Mar 4, 2022

+1 from me to standardize and be consistent with upstream Kubernetes

@alvaroaleman
Copy link
Member

I am not a fan of the nested namespace and name, that makes querying harder. What is the advantage of that?

@sbueringer
Copy link
Member Author

sbueringer commented Mar 4, 2022

To be honest, I'm not sure. I just saw in the Kubernetes structured logging KEP that they are rendering it like that in the JSON log format (and thus they codified it accordingly in klog).
For us in our PoCs it's not a problem as we can just parse and transform the nested object with promtail. (so eventually we'll end up with namespace/name again)

I don't really have a strong opinion on having the nested object, the only argument from my side is "consistency would be nice".

@pohly If you have some time to chime in, do you know by any chance why that decision was made?

@alvaroaleman
Copy link
Member

alvaroaleman commented Mar 4, 2022

Maybe the reason is to allow finding all logs related to a given object, which might not necesarily come from the controller that owns it. For example, a deployment controller will create a replicaset and the replicaset controller pods and if you want to find all logs related to a given replicaset, you can filter for replicaset.namespace == X && replicaset.name =Y and you would get logs from both controllers.

Maybe we can simply do both the nested/typed namespace+name and a top-level namespace+name?

@pohly
Copy link

pohly commented Mar 4, 2022

If you have some time to chime in, do you know by any chance why that decision was made?

The decision to use {"name":"...","namespace":"..."} instead of a "<namespace>/<name>" string pre-dates my involvement with structured logging. But at first glance it makes sense to me: JSON is able to represent objects and I am assuming that whatever tools are able to parse JSON then also will have the ability to access the fields on those objects. That seems more flexible than storing just a string, because such a string would need to be parsed when the individual fields are needed.

@alvaroaleman
Copy link
Member

@pohly the question was why the namespace and name are in a subobject for the type, i.E.

"cluster": {"namespace":..., "name":...}

rather than a toplevel namespace and name field (which would then be common for all logs)

@pohly
Copy link

pohly commented Mar 4, 2022

Because the {"namespace":..., "name":...} part is the value for a certain key, in this case cluster. There is no support for attaching values like that directly to the top level message, i.e. for recording a value it without a key.

@sbueringer
Copy link
Member Author

sbueringer commented Mar 7, 2022

The question was why the namespace and name are in a subobject for the type

Oh I misunderstood the question. I thought it was about having namespace and name separately.

But as you said, having them in a nested object has the advantage that we can filter them across controller logs.
E.g. in CAPI we have the following resources: Cluster, DockerCluster, Machine, DockerMachine (and more). Controllers usually reconcile a specific resource but this resource is related to a few other resources. So we add all of them as nested k/v pairs to the log.

By having namespace/name for all of them nested we can simply filter with cluster.name = "<cluster-name>" across logs of multiple controllers. Otherwise it would be very hard to take the namespace and name from every controller correlate them to the kind they belong to and then filter across controllers.

Because the {"namespace":..., "name":...} part is the value for a certain key, in this case cluster. There is no support for attaching values like that directly to the top level message, i.e. for recording a value it without a key.

It is possible, we currently do this in controller-runtime, but we don't use klog.Obj to do it. We just simple add one top-level "namespace": "<object-namespace>" and one top-level "name": "<object-name>" k/v pair.

Maybe we can simply do both the nested/typed namespace+name and a top-level namespace+name?

Sounds like a good compromise if we don't mind the duplicate information. For me it would be fine.

@pohly
Copy link

pohly commented Mar 7, 2022

It is possible, we currently do this in controller-runtime, but we don't use klog.Obj to do it. We just simple add one top-level "namespace": "" and one top-level "name": "" k/v pair.

That's a bit different because you now add two different key/value pairs instead of a single one. The downside is that plain-text output becomes a bit more verbose. Passing the result of KObj has the advantage that the logging backend can pick a suitable representation for the value.

@sbueringer
Copy link
Member Author

sbueringer commented Mar 8, 2022

I've updated the issue description to also align reconciler group/kind.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants