Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CreateOrPatch / CreateOrUpdate do not set status properly when the object is created #2627

Open
jfremy opened this issue Dec 21, 2023 · 18 comments · May be fixed by #2631
Open

CreateOrPatch / CreateOrUpdate do not set status properly when the object is created #2627

jfremy opened this issue Dec 21, 2023 · 18 comments · May be fixed by #2631
Labels
kind/support Categorizes issue or PR as a support question. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@jfremy
Copy link
Contributor

jfremy commented Dec 21, 2023

When using the CreateOrPatch or CreateOrUpdate methods, the status of the object won't be persisted if the object does not exist.

In both functions:

The method returns after the Create call.

However, this does not set the status of the object (see https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#status-subresource for instance PUT/POST/PATCH requests to the custom resource ignore changes to the status stanza.)

The code needs to be updated to actually do the status update part even on creation. I'll have a look at it and make a proposal for the change

@jfremy
Copy link
Contributor Author

jfremy commented Dec 22, 2023

In the case of CreateOrUpdate, it actually never update the status so one could argue, it just needs to be documented.
But CreateOrPatch does update the status if the object already exists so for consistency sake, it should also handle it during a create.

@troy0820
Copy link
Member

/kind support

I’m wondering if this is by design. When creating an object with the client.Client, the status doesn’t get persisted because it doesn’t use a SubResourceClient to creat the object.

Creating an object shouldn’t have the status be carried with the object. The fake client tries to mimic this behavior for unit tests.

@k8s-ci-robot k8s-ci-robot added the kind/support Categorizes issue or PR as a support question. label Dec 28, 2023
@jfremy
Copy link
Contributor Author

jfremy commented Dec 29, 2023

/kind support

I’m wondering if this is by design. When creating an object with the client.Client, the status doesn’t get persisted because it doesn’t use a SubResourceClient to creat the object.

Creating an object shouldn’t have the status be carried with the object. The fake client tries to mimic this behavior for unit tests.

Agreed for the CreateOrUpdate call - it never updates the status - so for that call, I think it might be worth calling it out in the documentation and be done with it.

But the CreateOrPatch call has explicit code to support patching the status in the object update case. It does make 2 patch calls, one to the resource and one to the sub-resource.
So there is an actual intent to support patching the status and code written specifically for this.

It currently just does not run that code path when creating the object

And since the mutate function is not supposed to know whether we’re creating or patching, I would lean toward patching the status in the same way the existing object path does (this is what the patch I proposed does, it uses the status patch code path if needed for both existing and new object.

You can use the operation result to know if an object was created and requeue (this is the “workaround” I used to handle this behavior), but the fact that we patch the status in one case and not in the other seems very counter intuitive.

@jfremy
Copy link
Contributor Author

jfremy commented Dec 29, 2023

/kind support

I’m wondering if this is by design. When creating an object with the client.Client, the status doesn’t get persisted because it doesn’t use a SubResourceClient to creat the object.

Creating an object shouldn’t have the status be carried with the object. The fake client tries to mimic this behavior for unit tests.

And just to make sure there is no confusion, I’m talking about the CreateOrPatch helper function from controllerutil. Not the patch method of the Client object

@troy0820
Copy link
Member

But the CreateOrPatch call has explicit code to support patching the status in the object update case. It does make 2 patch calls, one to the resource and one to the sub-resource.
So there is an actual intent to support patching the status and code written specifically for this.

It currently just does not run that code path when creating the object

I believe that’s intentional from a design perspective. Patching would patch both the object spec and the status if and only if the object already exists .

Create would only create the object and not the status. The same logic is said to be used for create or update, from how it handles what exists and what doesn’t.

I wouldn’t expect createorpatch to act differently than what we agreed create or update to do.

And since the mutate function is not supposed to know whether we’re creating or patching, I would lean toward patching the status in the same way the existing object path does (this is what the patch I proposed does, it uses the status patch code path if needed for both existing and new object.

We would be patching against an object that doesn’t exist to deal with the create as you suggest. Making a patch against an object that doesn’t exist doesn’t settle the creation.

@jfremy
Copy link
Contributor Author

jfremy commented Jan 2, 2024

Quote from the “documentation” from CreateOrPatch:

CreateOrPatch creates or patches the given object in the Kubernetes cluster. The object's desired state must be reconciled with the before state inside the passed in callback MutateFn.

The MutateFn is called regardless of creating or updating an object.

It is essentially saying it will use whatever calls are needed depending on the state of the resource.

Since it does support patching the status in the case the object already exists (and it goes out of its way to do it), I really think that for consistency sake, it makes sense to also support setting the status in the create path.
It works the exact same way as the “updating an object” path with the only difference that the previous state is an empty object instead of being the existing object.

My patch just re-use the existing code with an empty object as the original state. From there we can derive a status patch based on the change made in mutateFn (again existing code), and ultimately patch the status if needed.

If you still don’t think this makes sense, please feel free to close the issue

@troy0820
Copy link
Member

troy0820 commented Jan 2, 2024

The CreateOrUpdate has the same documentation. So this may be a copy/paste?

CreateOrUpdate creates or updates the given object in the Kubernetes cluster. The object's desired state must be reconciled with the existing state inside the passed in callback MutateFn.
The MutateFn is called regardless of creating or updating an object.
It returns the executed operation and an error

In trying to keep the behaviors consistent with what the verbs do, I would assume what you propose would have to be done on create or update as well and change that behavior.

We seem to have agreed in this issue that updating on an object that doesn’t exist wouldn’t settle the status on an object on the create case.

So why would we settle on the status being carried on the create case on createOrPatch?

  1. CreateOrUpdate doesn’t update the status on create
  2. CreateOrPatch doesn’t update the status on create but only when patched. (It uses the clients subresourcesWriter to do this)

If we want to carry the status on creation of an object, I would contend that this behavior should be consistent in both functions like they are today or both changed to their respective functions.

However, I don’t believe we want to conflate the creation of objects as being something that is also setting the status on the object from creation.

@jfremy
Copy link
Contributor Author

jfremy commented Jan 3, 2024

CreateOrUpdate won’t update the Status ever. That was incorrect in my initial statement.
It will only ever update the resource but not the sub resource.

CreateOrPatch behaves differently from CreateOrUpdate in that respect in that it will update the status when the object already exists

@jfremy
Copy link
Contributor Author

jfremy commented Jan 3, 2024

The CreateOrUpdate code is essentially:

- get object
- if does not exist, run mutate and create with object after mutate
- if exists, run mutate, check if there was a change, update with object after mutate

The current CreateOrPatch code is the following:

- get object
- if does not exist, run mutate then create with object after mutate
- if exists
  - create original copy of current object (will be used as source for patch)
  - run mutate
  - if diff on resource, run patch using diff on resource vs original copy
  - if diff on status, run patch using diff on resource status vs original copy

As you can see those are quite different (I’ve slightly simplified the patch case for brevity)

@jfremy
Copy link
Contributor Author

jfremy commented Jan 3, 2024

I definitely agree CreateOrUpdate should be left as is - probably worth adding a note stating it does not touch the status

@troy0820
Copy link
Member

troy0820 commented Jan 3, 2024

CreateOrUpdate won’t update the Status ever. That was incorrect in my initial statement.
It will only ever update the resource but not the sub resource.
CreateOrPatch behaves differently from CreateOrUpdate in that respect in that it will update the status when the object already exists

We agree. I’m still not clear on this:

So why would we settle on the status being carried on the create case on createOrPatch?

In your above example, both functions even after they mutate do not do anything to the status of the object being created in the create case.

Only after patch (still a case for update to do this) if the object exists would we patch the status because the existing object is now different from what was fetched.

My case is to express that we should probably not be setting the status on the create case for several reasons beyond the subresource not being invoked to do so on either case during creation.

If we need to modify the update case in createOrUpdate to mimic the behavior of createOrPatch to satisfy the consistency, and update the status during update, that seems reasonable and not out of bounds for this discussion.

@jfremy
Copy link
Contributor Author

jfremy commented Jan 3, 2024

That’s interesting. You’re considering parity between the two methods being the more important while I am actually considering that parity inside a method is more important and was not utterly worried having a difference of behavior between the two (though the behavior of each should be documented).

What are your concerns regarding patching the status “during” the create? (technically it will be after the create through another patch call).
In the patch case at least, I fail to see anything that would not work at that stage but would work when the object exists.
The mutateFn will modify the parts that it would modify anyway if the object existed and only those will be changed

@troy0820
Copy link
Member

troy0820 commented Jan 3, 2024

That’s interesting. You’re considering parity between the two methods being the more important while I am actually considering that parity inside a method is more important and was not utterly worried having a difference of behavior between the two (though the behavior of each should be documented).

Both functions create or update/patch the object. I would assume that the create case is the same for both functions (just create the object and not do anything to the status). The distinction isn’t in the logic but how the behavior exists today where the gap isn’t in the create case to settle the status on creation.

Create makes an event which then gets reconciled to have the Reconciler update its status dependent on its conditions gathered by the state of the world the controller “reconciles” the object. Providing that status on creation seems counter to that logic. Especially if it’s for one function and not both.

What are your concerns regarding patching the status “during” the create? (technically it will be after the create through another patch call).
In the patch case at least, I fail to see anything that would not work at that stage but would work when the object exists.
The mutateFn will modify the parts that it would modify anyway if the object existed and only those will be changed

Settling status with create can do things unintended for state of the object.

If you were to create a pod and provide the status where the pod is running, that doesn’t reflect the lifecycle of the pod until that event occurs and it’s settled by its own controller.

Making another call to patch the status after you create the object doesn’t sound like the intention of the function.

If you need to patch the status of the created object, you could call it again after it’s been created and go down the existed path.

@jfremy
Copy link
Contributor Author

jfremy commented Jan 3, 2024

This would only happen if the user actually changed anything in the status (you’ll see that the existing code checks if changes were made before issuing any of the patch / update calls). Assuming the user does not change anything, it won’t make any call against the status. So it only happens if there is a clear intent from the user to make a change.

I ran into this issue with a kind of special case controller. It creates CRD instances that mirror the state of an external system for consumption by other controllers.
In this case, the controller that does create the object also “owns” the status of the object (mirrors the "external status") and was setting status at creation time.
I “fixed” it by re-queuing the object on create operation result so it’s definitely not insurmountable.

So since the code only issues calls if there is actually a change made (change made inside MutateFn), I felt it would make sense to support it, even more-so since the MutateFn is not supposed to know if it’s in the existing / new object path (other than looking into object to see if there is a uuid or anything that will be set after creating it - but those kind of indirect guesses don’t feel right)
I understand it is an edge case.

If anything, I think the documentation should be improved to highlight what can / can’t be updated and in what circumstances. I can suggest some changes

Regarding the case of CreateOrUpdate, I lack the context as to why updating the status was not done.

@troy0820
Copy link
Member

troy0820 commented Jan 3, 2024

This would only happen if the user actually changed anything in the status (you’ll see that the existing code checks if changes were made before issuing any of the patch / update calls). Assuming the user does not change anything, it won’t make any call against the status. So it only happens if there is a clear intent from the user to make a change.

When creating an object, status doesn't get carried to the object because two different writers are responsible for dealing with the request. One request (the creation of the object, only deals with the spec and the status is dealt with a status writer which writes to a different endpoint).

We agree that a call to CreateOrUpdate or CreateOrPatch will handle the create case and not do anything to the status. (It shouldn't by the documentation linked and how object creates are settled)

You specified in this issue that the behavior for creating an object ignores the changes to the status stanza.

The method returns after the Create call.
However, this does not set the status of the object (see https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#status-subresource for instance PUT/POST/PATCH requests to the custom resource ignore changes to the status stanza.)

I would also highlight this:

PUT requests to the /status subresource take a custom resource object and ignore changes to anything except the status stanza.
PUT requests to the /status subresource only validate the status stanza of the custom resource.

I ran into this issue with a kind of special case controller. It creates CRD instances that mirror the state of an external system for consumption by other controllers.
In this case, the controller that does create the object also “owns” the status of the object (mirrors the "external status") and was setting status at creation time.
I “fixed” it by re-queuing the object on create operation result so it’s definitely not insurmountable.

This sounds like a design decision outside of the scope of what is intended with the function. If you are utilizing external objects to replicate the objects so that the controller can do things with it, I would ask why not use a structure that will not recreate external objects but utilize the external object information for consumption?

If you need to recreate the objects and have to carry the status, I suggest that you create the object, then patch the object's status in two separate calls, rather than conflating the functions intention around a use case that's not built with what the api or the documentation provide.

So since the code only issues calls if there is actually a change made (change made inside MutateFn), I felt it would make sense to support it, even more-so since the MutateFn is not supposed to know if it’s in the existing / new object path (other than looking into object to see if there is a uuid or anything that will be set after creating it - but those kind of indirect guesses don’t feel right)
I understand it is an edge case.
If anything, I think the documentation should be improved to highlight what can / can’t be updated and in what circumstances. I can suggest some changes

From what you are experiencing, it seems what you are asking for is to have this function provide a way to copy the status of an object during creation when neither function does that for create. Doing so will not only break the behavior of what is described by the api and documentation but also create a case where we assume that every object being created has a subresource needed to copy the status on creation.

Regarding the case of CreateOrUpdate, I lack the context as to why updating the status was not done.

If we need to update the CreateOrUpdate update path to deal with the status there, that seems reasonable.

@jfremy
Copy link
Contributor Author

jfremy commented Jan 4, 2024

I still think there is a disconnect and I’m not explaining myself correctly but I don’t want to waste more of your time.
I’ve create a separate PR that adds notes to each method documentation describing what they do with respect to status (namely nothing for CreateOrUpdate and patch only if the object already exists for CreateOrPatch).
That will at least ensure that people know what to expect.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 3, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/support Categorizes issue or PR as a support question. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants