Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support dry run in admission webhooks #66936

Merged
merged 2 commits into from Aug 23, 2018

Conversation

jennybuckley
Copy link

@jennybuckley jennybuckley commented Aug 2, 2018

What this PR does / why we need it:
Follow up to #66391

  • add DryRun to admission.k8s.io/v1beta1.AdmissionReview
  • add DryRunnable to admissionregistration.k8s.io/v1beta1.(Valid|Mut)atingWebhookConfiguration
  • add dry run support to (Valid|Mut)atingAdmissionWebhook

Includes all the api-changes outlined by kubernetes/community#2387

/sig api-machinery

Release note:

To address the possibility dry-run requests overwhelming admission webhooks that rely on side effects and a reconciliation mechanism, a new field is being added to admissionregistration.k8s.io/v1beta1.ValidatingWebhookConfiguration and admissionregistration.k8s.io/v1beta1.MutatingWebhookConfiguration so that webhooks can explicitly register as having dry-run support. If a dry-run request is made on a resource that triggers a non dry-run supporting webhook, the request will be completely rejected, with "400: Bad Request". Additionally, a new field is being added to the admission.k8s.io/v1beta1.AdmissionReview API object, exposing to webhooks whether or not the request being reviewed is a dry-run.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 2, 2018
@k8s-github-robot k8s-github-robot added the kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API label Aug 2, 2018
@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Aug 2, 2018
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 6, 2018
@jennybuckley
Copy link
Author

Rebased

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 7, 2018
@jennybuckley
Copy link
Author

@kubernetes/sig-api-machinery-api-reviews
@kubernetes/api-reviewers

@jennybuckley
Copy link
Author

@@ -73,6 +73,10 @@ type AdmissionRequest struct {
// OldObject is the existing object. Only populated for UPDATE requests.
// +optional
OldObject runtime.Object
// DryRun indicates that modifications will definitely not be persisted for this request.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And that the call must not have side effects.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, what about something also indicating that a value of DryRun false doesn't guarantee that the object will be persisted, so any side effects should have a reconciliation mechanism, or is that redundant because it's already somewhere in the webhook documentation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be documented, but feel free to state it again here if you think it makes it clearer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, added this

// DryRun indicates that modifications will definitely not be persisted for this request.
// Defaults to false.
// +optional
DryRun *bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to pass on the different sorts of dry run calls?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we'd want to get more granular than including/excluding an entire phase of admission. That would mean a particular webhook call would only need to know if the request was dryRun or not. Does that seem sufficient?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, I'm not the one arguing that dryRun should be an array of strings...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was discussed on the wg-apply slack channel, and the conclusion was that giving the webhooks the array of strings would be exposing a lot of complexity to them, which they don't necessarily need. Because we don't want to allow the level of granularity to dry run with only some of the admission webhooks, anyway.

// DryRun indicates that modifications will definitely not be persisted for this request.
// Defaults to false.
// +optional
DryRun *bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really safe to make this optional? Do we need to make a completely new version?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it has to be optional in v1beta1, because it's a new field, and has to default to false, since we can't assume current webhooks handle dry run correctly.

the current plan (in service of pushing all admission webhooks to support dry run) is described in https://github.com/kubernetes/community/blob/master/keps/sig-api-machinery/0015-dry-run.md#admission-controllers:

If dry-run is requested on a non-supported webhook, the request will be completely rejected, as a 400: Bad Request. This field will be defaulted to true and deprecated in v1, and completely removed in v2. All webhooks registered with v2 will be assumed to support dry run.

Copy link
Member

@liggitt liggitt Aug 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, my comment was actually around the dryRunnable bool on the webhook object. similar reasoning applies to this field though (optional additive field in an existing API version), with the exception of changing defaults and removing it in a future version

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, of course if we add it to the existing object it has to be optional.

The comment I'm working on leaving on the other API object will explain my position.

Copy link
Member

@liggitt liggitt Aug 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you proposing to only add dryRun to a v1 (or v1beta2, etc) AdmissionReview object, require webhooks to use the v1 format to indicate support for dry run, and treat all that use the v1beta1 format as not supporting dry run?

upsides:

  • simpler (though implicit) signal that the webhook knows about dryrun
  • no default false in v1beta1, default true in v1, remove dryRunnable in v2 path required

downsides:

  • harder for webhooks to support pre-1.12 and 1.12+ kube (they'd need distinct webhook registration manifests)
  • would require rebuilding existing webhooks that have no side effects (and so can support dry run mode without changes) to support a new version

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

harder for webhooks to support pre-1.12 and 1.12+ kube (they'd need distinct webhook registration manifests)

Why?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if a cluster admin wants to use dry run but some of their webhooks don't support it, they could manually add dryRunnable: true to the webhooks themself if they are willing to accept the risk in doing so

Yeah, this is a foot-gun :)

Copy link
Member

@liggitt liggitt Aug 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

harder for webhooks to support pre-1.12 and 1.12+ kube (they'd need distinct webhook registration manifests)

Why?

They'd need to request v1beta1 AdmissionReview objects pre-1.12, and v1 AdmissionReview objects in 1.12+. I actually thought we already had the requested AdmissionReview version in the webhook registration object, but it turns out we don't.

Thinking about that a little more, we'll likely want a list of AdmissionReview versions the webhook can accept. Each apiserver, when sending an AdmissionReview object, would find the first version in the webhook's supported list it is able to send. That would let 1.12 and pre-1.12 servers both talk to the same webhook, as long as it supported both v1 and v1beta1 AdmissionReview objects.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a cluster admin wants to use dry run but some of their webhooks don't support it, they could manually add dryRunnable: true to the webhooks themself if they are willing to accept the risk in doing so

Yeah, this is a foot-gun :)

I don't think it's catastrophic... admission already has to tolerate and recover from being called in cases where the object isn't actually persisted to storage (because of a conflict, etc). telling the admission plugin this is a dry-run operation is largely an optimization that lets it skip eventually reconciling away a side-effect from a non-persisted request.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They'd need to request v1beta1 AdmissionReview ...

Webhooks don't request anything, they get what we send them. I am confused about what you mean.

apiserver can try with the new version and retry with an old version if that fails.

// will be completely rejected and the webhook will not be called.
// Defaults to false.
// +optional
DryRunnable *bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I understand that it's much more expedient to put this here, but it's IMO the wrong way to do this.

I think this should be solved by revving the admissionreview API version, and checking at runtime if the webhook understands it.

If we put this here, then:

  • administrators will be tempted to flip the bit without checking whether the webhook actually supports it (how can they even check?)
  • administrators have to flip the bit in every configuration that references the webhook.

Did we decide that the expediency outweighs these considerations?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was done this way more out of consistency with how we add fields to other API types than a concern about expediency.

It doesn't look like we give webhooks a way to indicate what version of AdmissionReview they want today. As soon as we support more than one version, we need that information. I'd be on board with adding that, only adding a dryRun indicator to a new rev of AdmissionReview, and requiring webhooks to request a version of AdmissionReview that supports dryRun.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jennybuckley's point about this requiring rebuilding admission plugins that currently don't have side effects is a good one. All the conversations we had around this are coming back to me now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like we give webhooks a way to indicate what version of AdmissionReview they want today.

We don't need to know this. We can try a call with the new version, if it fails, retry with the old version. We don't need a place for that in the configuration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jennybuckley's point about this requiring rebuilding admission plugins that currently don't have side effects is a good one.

That's still an argument from expediency.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not arguing we shouldn't do the expedient thing, I just want to know that we actually considered doing the correct thing and are making a good trade off. Given that the correct thing is apparently non-obvious, maybe it is better to just do it this way :/

chakri-nelluri pushed a commit to chakri-nelluri/kubernetes that referenced this pull request Aug 7, 2018
…on-2

Automatic merge from submit-queue (batch tested with PRs 67085, 66559, 67089). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Block dry-run if a webhook would be called

**What this PR does / why we need it**:
Follow up to kubernetes#66391
Suggested in kubernetes#66391 (comment)

Makes dry-run safe in case kubernetes#66936 takes a long time to merge

**Release note**:

```release-note
NONE
```

/sig api-machinery
cc @lavalamp
k8s-publishing-bot added a commit to kubernetes/apiserver that referenced this pull request Aug 7, 2018
Automatic merge from submit-queue (batch tested with PRs 67085, 66559, 67089). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Block dry-run if a webhook would be called

**What this PR does / why we need it**:
Follow up to kubernetes/kubernetes#66391
Suggested in kubernetes/kubernetes#66391 (comment)

Makes dry-run safe in case kubernetes/kubernetes#66936 takes a long time to merge

**Release note**:

```release-note
NONE
```

/sig api-machinery
cc @lavalamp

Kubernetes-commit: 47878f2bd1642145e311d2336e3103c6edcd50de
sttts pushed a commit to sttts/apiserver that referenced this pull request Aug 8, 2018
Automatic merge from submit-queue (batch tested with PRs 67085, 66559, 67089). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Block dry-run if a webhook would be called

**What this PR does / why we need it**:
Follow up to kubernetes/kubernetes#66391
Suggested in kubernetes/kubernetes#66391 (comment)

Makes dry-run safe in case kubernetes/kubernetes#66936 takes a long time to merge

**Release note**:

```release-note
NONE
```

/sig api-machinery
cc @lavalamp

Kubernetes-commit: 47878f2bd1642145e311d2336e3103c6edcd50de
@lavalamp
Copy link
Member

lavalamp commented Aug 8, 2018

Is anything blocking this PR other than me?

@jennybuckley
Copy link
Author

@lavalamp

I would also want to see if @deads2k takes issue with anything in the API changes being made here, because on the wg-apply slack channel he said "we still get api review during pulls I presume?" before approving the KEP. I don't want to roll this back because it merged too soon without getting everyone's input

@lavalamp
Copy link
Member

lavalamp commented Aug 8, 2018 via email

@jennybuckley
Copy link
Author

@lavalamp besides the doc change you pointed out, is there anything in this that you think needs to be changed before you would approve this?

@lavalamp
Copy link
Member

lavalamp commented Aug 8, 2018

Let's proceed as if this is fine. I'll get a 2nd opinion in parallel. Yes, if we're going to do it this way, this is good with the doc change.

@@ -98,8 +98,9 @@ func (d *validatingDispatcher) Dispatch(ctx context.Context, attr *generic.Versi

func (d *validatingDispatcher) callHook(ctx context.Context, h *v1beta1.Webhook, attr *generic.VersionedAttributes) error {
if attr.IsDryRun() {
// TODO: support this
return webhookerrors.NewDryRunUnsupportedErr(h.Name)
if h.SideEffects == nil || *h.SideEffects == v1beta1.SideEffectClassUnknown || *h.SideEffects == v1beta1.SideEffectClassSome {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question here... requiring a positive match on an enum value we understand that means dry run is safe seems better

// Requests with the dryRun attribute will be auto-rejected if they match a webhook with
// sideEffects == Unknown or Some. Defaults to Unknown.
// +optional
SideEffects *SideEffectClass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did validation get added for this field?

@@ -35,4 +35,8 @@ func SetDefaults_Webhook(obj *admissionregistrationv1beta1.Webhook) {
selector := metav1.LabelSelector{}
obj.NamespaceSelector = &selector
}
if obj.SideEffects == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worth adding a TODO here (or in a "v1 graduation criteria" list if we have one) to revisit/remove this default and possibly make the field required when promoting to v1, so we don't forget

@jennybuckley
Copy link
Author

Addressed comments and squashed

@jennybuckley
Copy link
Author

/retest

@lavalamp
Copy link
Member

Looks like comments have been addressed.

/lgtm
/approved

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Aug 22, 2018
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 22, 2018
@jennybuckley
Copy link
Author

Just had to rebase, hopefully everything still passes

@liggitt
Copy link
Member

liggitt commented Aug 23, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 23, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jennybuckley, lavalamp, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 67576, 66936). If you want to cherry-pick this change to another branch, please follow the instructions here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants