It is too easy to create duplicate caches in different projections #1660

Closed
maelvls opened this issue Sep 10, 2021 · 12 comments
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@maelvls
Contributor

maelvls commented Sep 10, 2021

Since v0.7.0, it has become possible to use a "metadata-only" cache (#1174). One caveat that is (I think) not highlighted enough in the Go comments is that you may end up writing racy code when mixing multiple types of clients.

For example, the following snippet watches the same type (v1.Pod) using two watchers:

  • a metadata-only client used by the watcher,
  • a concrete client used in the reconciliation function.

A simple use-case could be the one of reconciling ReplicaSets whenever a Pod changes.

ctrl.NewControllerManagedBy(mgr).
	For(&appsv1.ReplicaSet{}).
	Watches(
		&source.Kind{Type: &corev1.Pod{}},
		&handler.EnqueueRequestForOwner{OwnerType: &corev1.ReplicaSet{}, IsController: true},
		builder.OnlyMetadata, // 🔥
	).
	Complete(reconcile.Func(func(_ context.Context, r reconcile.Request) (reconcile.Result, error) {
		// This is the ReplicaSet reconciliation function. Let's imagine that a pod
		// has to be fetched in this reconciliation.

		nsAndName := findPodAssociatedWithRS(r.NamespacedName)
		pod := &corev1.Pod{}
		_ = mgr.GetClient().Get(context.Background(), nsAndName, pod) // 🔥

		// ...
	}))

When a Pod changes, the apiserver sends the MODIFIED event twice since two watchers are running (one watching metav1.PartialObjectMetadata, the other v1.Pod). This means that the metadata cache may get updated first, triggering the reconciliation, while the change has not yet been propagated to the concrete cache.

This race may lead to a "not found" error in the Get call above. The issue #1454 also talks about this unexpected behavior.
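For reference, a minimal sketch of how the Get could stay in the metadata projection (assuming the reconciler only needs the Pod's metadata), so that the informer started by builder.OnlyMetadata is reused instead of a second, concrete cache being created:

// Sketch only: fetch the Pod through the metadata projection so the
// same metadata-only informer backs both the watch and this Get.
podMeta := &metav1.PartialObjectMetadata{}
podMeta.SetGroupVersionKind(corev1.SchemeGroupVersion.WithKind("Pod"))
if err := mgr.GetClient().Get(context.Background(), nsAndName, podMeta); err != nil {
	// handle the error (e.g. requeue on transient failures)
}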

Update: I had no motivation or purpose in mixing the metadata and concrete projections. I did not know that I was mixing things; I just wanted to use the metadata projection.

Does it make sense to document this caveat?

@alvaroaleman
Member

When a Pod changes, the apiserver sends the MODIFIED event twice since two watchers are running (one watching metav1.PartialObjectMetadata, the other v1.Pod)

Sorry, I don't follow. The code you are showing results in two watches, one for ReplicaSets and one for Pods in the metadata projection, but not in two watches for pods in different projections.

I am not really sure what you are trying to do here. Yes, resources in the apiserver are independent of each other, which includes watches, so no one can guarantee that if you observe a change in resource A (say, a ReplicaSet) you will also observe an expected change in resource B (say, a Pod).

Watching the same API in different projections is technically possible, and it cannot be guaranteed that they will be identical at all times, but there really is no good reason to ever do that.

@maelvls
Contributor Author

maelvls commented Sep 12, 2021

I realize that the example with pods and replicasets does not do a good job of showing the "race" issue. @wallrj and @irbekrm helped me build a minimal working example that demonstrates it; the code to reproduce the example below is available in https://github.com/maelvls/controller-runtime-cache-race.

In this example, I create a controller that adds the secret-found: "yes" annotation to a Secret when that annotation is not already present (roughly sketched in code after the diagram below):

+-----------------------+
| kind: Secret          |
| metadata:             |
|   name: example-1     |
|   annotations: []     |
+-----------------------+
            |
            |
            | reconciliation
            |
            v
+-------------------------+
| kind: Secret            |
| metadata:               |
|   name: example-1       |
|   annotations:          |
|     secret-found: "yes" |
+-------------------------+
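Roughly, the controller wiring looks like the sketch below (simplified, not the exact code from the linked repository; error handling is trimmed):

ctrl.NewControllerManagedBy(mgr).
	For(&corev1.Secret{}, builder.OnlyMetadata). // watch Secrets in the metadata projection
	Complete(reconcile.Func(func(ctx context.Context, r reconcile.Request) (reconcile.Result, error) {
		// This concrete Get is served by a second, typed cache for Secrets.
		secret := &corev1.Secret{}
		if err := mgr.GetClient().Get(ctx, r.NamespacedName, secret); err != nil {
			// When the race happens, this returns "not found" even though the
			// metadata cache has already seen the Secret.
			return reconcile.Result{}, client.IgnoreNotFound(err)
		}
		if secret.Annotations["secret-found"] != "yes" {
			if secret.Annotations == nil {
				secret.Annotations = map[string]string{}
			}
			secret.Annotations["secret-found"] = "yes"
			return reconcile.Result{}, mgr.GetClient().Update(ctx, secret)
		}
		return reconcile.Result{}, nil
	}))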

In normal circumstances (no race), we can see in the logs that the two reflectors (v1.Secret and v1.PartialObjectMetadata) receive the ADDED event simultaneously, which means the client.Get call is able to fetch the Secret:

    main_test.go:153: Create Secret secret-1 in namespace ns-1 owned
    main_test.go:197: : POST https://127.0.0.1:37589/api/v1/namespaces/ns-1/secrets 201 Created in 3 milliseconds
    main_test.go:197: : reflector: event ADDED (1 -> 159)
    main_test.go:197: : reflector: event ADDED (1 -> 159)
    main_test.go:197: secret-reconciler: start: secret="ns-1/secret-1"
    main_test.go:197: : GET https://127.0.0.1:37589/api/v1/namespaces/ns-1/secrets/secret-1 200 OK in 0 milliseconds
    main_test.go:197: : reflector: event MODIFIED (159 -> 160)
    main_test.go:197: : PUT https://127.0.0.1:37589/api/v1/namespaces/ns-1/secrets/secret-1 200 OK in 1 milliseconds
    main_test.go:197: secret-reconciler: end: secret="ns-1/secret-1"

When a race occurs, the two reflectors do not process (or receive, I'm not sure) the event ADDED at the same time; the client.Get cache may not be ready when the reconciliation happens:

    main_test.go:153: Create Secret secret-1 in namespace ns-1 owned
    main_test.go:197: : POST https://127.0.0.1:37809/api/v1/namespaces/ns-1/secrets 201 Created in 2 milliseconds
    main_test.go:197: : reflector: event ADDED (1 -> 159)
    main_test.go:197: secret-reconciler: start: secret="ns-1/secret-1"
    main_test.go:197: secret-reconciler: secret not found: secret="ns-1/secret-1"
    main_test.go:197: secret-reconciler: end: secret="ns-1/secret-1"
    main_test.go:197: : reflector: event ADDED (1 -> 159)

Watching the same API in different projections is technically possible, and it cannot be guaranteed that they will be identical at all times, but there really is no good reason to ever do that.

I totally agree; I don't think there is a good reason to use two different projections for the same type.

My point is rather that there should be more warnings in the documentation about the issues that you may encounter when mixing projections for the same type, and that doing so is unsupported by controller-runtime. @wallrj and I wrote a controller that was mixing the metadata and "concrete" projections for the Secret resource, and the only reason we know this is that this "race" shows up in our integration tests.

@alvaroaleman
Member

My point is rather that there should be more warnings in the documentation about the issues that you may encounter when mixing up projections for the same type

It is still unclear to me what the motivation for doing that could be. It is wasteful for both the apiserver and your component; you are essentially requesting the same data multiple times (in one case only a subset of it). What is your reason for doing that?

I don't really see a point in warning people about an issue they might encounter after creating duplicate caches in different projections, because that shouldn't be done in the first place. If anything, we should make it harder to create duplicate caches.

@maelvls
Contributor Author

maelvls commented Sep 13, 2021

We had no motivation or purpose in mixing the metadata and concrete projections. We did not know that we were mixing things; we just wanted to use the metadata projection.

We did not realize that the line:

_ = mgr.GetClient().Get(context.Background(), r.NamespacedName, secret)

meant that another cache would get created. The symptom of this secondary cache is a race.

I don't really see a point in warning people about an issue they might encounter after creating duplicate caches in different projections, because that shouldn't be done in the first place. If anything, we should make it harder to create duplicate caches.

I agree. Users should be made aware when they create two caches, and that having two caches of the same object in different projections is not supported by controller-runtime (because of the cache duplication it creates as well as the risk of data races).

How can I help?

@alvaroaleman
Member

We had no motivation or purpose in mixing the metadata and concrete projections. We did not know that we were mixing things; we just wanted to use the metadata projection.

Ah ok, that is the bit I was missing :)

/retitle It is too easy to create duplicate caches in different projections

So at the end of the day this is a programmer error and not something we can detect at compile time. We could add a check for this into pkg/cache and error out if someone requests something that we already have in a different projection?
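Something along these lines, purely as a sketch (the names and shape here are invented for illustration, not actual controller-runtime internals; assumes "fmt", "sync" and k8s.io/apimachinery/pkg/runtime/schema are imported):

// Hypothetical sketch: remember the projection a GVK was first cached in and
// error out when the same GVK is later requested in a different projection.
type projection string

const (
	projectionTyped        projection = "typed"
	projectionMetadata     projection = "metadata"
	projectionUnstructured projection = "unstructured"
)

type projectionGuard struct {
	mu   sync.Mutex
	seen map[schema.GroupVersionKind]projection
}

func (g *projectionGuard) check(gvk schema.GroupVersionKind, p projection) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.seen == nil {
		g.seen = map[schema.GroupVersionKind]projection{}
	}
	if prev, ok := g.seen[gvk]; ok && prev != p {
		return fmt.Errorf("%s is already cached in the %s projection, refusing to create a duplicate %s cache", gvk, prev, p)
	}
	g.seen[gvk] = p
	return nil
}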

@k8s-ci-robot changed the title from 'Reflector "race" when using different clients (unstructured vs. metadata vs. concrete client caches)' to 'It is too easy to create duplicate caches in different projections' on Sep 13, 2021
@FillZpp
Contributor

FillZpp commented Sep 14, 2021

There is another similar condition when someone tries to get/list/watch both typed and unstructured objects with the same GVK.
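For example (a sketch; c, ctx and key are placeholders here, and this assumes the client is configured so that unstructured reads are also served from the cache), the two Gets below target the same GVK but end up backed by separate informers:

// Typed read: served from the typed cache for v1.ConfigMap.
cm := &corev1.ConfigMap{}
_ = c.Get(ctx, key, cm)

// Unstructured read of the same GVK: served from a separate cache.
u := &unstructured.Unstructured{}
u.SetGroupVersionKind(corev1.SchemeGroupVersion.WithKind("ConfigMap"))
_ = c.Get(ctx, key, u)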

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Dec 13, 2021
@maelvls
Contributor Author

maelvls commented Dec 14, 2021

We could add a check for this into pkg/cache and error out if someone requests something that we already have in a different projection

I'd be happy to help add this check; I will open a PR.

@maelvls
Contributor Author

maelvls commented Dec 14, 2021

/remove-lifecycle stale

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Mar 14, 2022
@maelvls
Contributor Author

maelvls commented Mar 14, 2022

Fixed in #1747.

/close

@k8s-ci-robot
Contributor

@maelvls: Closing this issue.

In response to this:

Fixed in #1747.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
