Skip to content

Skip variable validations during destroy apply operations #34101

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 18, 2023

Conversation

liamcervante
Copy link
Member

This PR makes Terraform skip variable validations during destroy plans and applys. This makes variable validations match the behaviour of other custom conditions which also do not execute during destroy operations.

This has the side effect of fixing the crash in the linked issue, as the variable validations will no longer attempt to sometimes execute during destroy-apply operations after having been skipped during the equivalent plan operation.

Fixes #34052

Target Release

1.6.2

Draft CHANGELOG entry

BUG FIXES

  • Fix occasional crash when destroying configurations with variables containing validations.

@liamcervante liamcervante added the 1.6-backport If you add this label to a PR before merging, backport-assistant will open a new PR once merged label Oct 17, 2023
@liamcervante liamcervante requested a review from a team October 17, 2023 08:29
@apparentlymart
Copy link
Contributor

The case where an input variable is used to configure one of the providers used to destroy makes me a little concerned about this, since anyone using validation rules as a guardrail against dangerous provider configurations would be surprised if those guardrails were silently ignored during destroy.

I see that there is some care for that in the code here but it isn't clear to me exactly what the behavior is. Do the validation rules still prevent a downstream provider configuration from being considered if it refers to invalid input variables? I think ideally I'd like to see it still do the checks and return error diagnostics in the destroy phase, even if we skip reporting the results via the "checks" concept (since writing the check results to state after a destroy would be a strange thing to do then everything else in the state has been deleted anyway).

I suppose it's true that we already made an exception for situations where a provider configuration refers to a resource that has a precondition or postcondition that is failing, but using input variables to for provider configurations is more common and I think it's more likely that someone would use them for something that is explicitly intended to be a provider configuration guardrail, like making sure that an appropriate region has been selected when running in a particular environment, or that endpoint overrides are only allowed in a development environment, etc.

@jbardin
Copy link
Member

jbardin commented Oct 17, 2023

Taking @apparentlymart's comment and mine together, maybe the goal here is to avoid registering "invalid" checks, rather than skipping them entirely? I'm not sure exactly what that means yet, but did notice the failing checks all had that same zero-value state.

@liamcervante
Copy link
Member Author

So, the initial statuses are set here: https://github.com/hashicorp/terraform/blob/main/internal/terraform/context_walk.go#L127-L138. Which internally calls this: https://github.com/hashicorp/terraform/blob/main/internal/checks/state.go#L86.

We could add a bit of logic in there that doesn't do that for destroy only plans but we'd still need all this PR to also tell the input validations to not execute anyway. They don't actually check if a validation has been registered, just assume it has been and then try to report it. We'd just see the same panic we are seeing now.

The ObjectAddrsKnown check (from the first link) means it only reports checks that successfully executed during the plan, and because the plan is pruning variable nodes when the apply isn't we get the panic. Essentially the current behaviour is because the validations aren't being registered properly anyway, so just skipping registering them would have the same effect.

I think I could implement @apparentlymart's suggestion by keeping the logic as is but pushing the if destroying { check around this block instead:

if result.Status == checks.StatusFail {
checkState.ReportCheckFailure(addr, addrs.InputValidation, ix, result.FailureMessage)
} else {
checkState.ReportCheckResult(addr, addrs.InputValidation, ix, result.Status)
}

@liamcervante
Copy link
Member Author

liamcervante commented Oct 17, 2023

FWIW, in a normal destroy operation it actually does a pre-destroy refresh plan first and the input validations do actually still execute during that part of the check. So an invalid input variable would still be caught during that stage. You can skip that if you tell it to skipRefresh, so maybe we can just do the above suggestion?

See here:

refreshPlan, refreshDiags := c.plan(config, prevRunState, &refreshOpts)

@apparentlymart
Copy link
Contributor

In today's Terraform we always require that the root input variables set during plan will propagate exactly to the apply phase, and so we can assume that validations of those will be checked during planning and will thus block applying the plan.

Unfortunately the story isn't as neat for child module input variables, because they can be unknown during planning and given their final definition only during the apply step. That means we'd set the check result to "unknown" during planning and then only decide the final verdict during apply.

Since we're only talking about the destroy phase and provider configurations though, I think we can probably get away with this loophole: destroy ignores most of the configuration anyway -- today it only evaluates provider blocks and provisioner blocks that have when = destroy set. And during destroy, everything a provider block depends on will have a known value because anything that can produce values is resolved only from the state during destroy, and the state can never contain unknown values.

Therefore I think the fact that we should be able to catch it during planning, combined with the fact that the state can only change in very limited ways during a destroy operation (and in particular can't introduce new unknowns), does seem to address what I was concerned about. I don't think we need to tighten this loophole as long as it only happens during apply and only when doing a "full destroy".

Verified

This commit was signed with the committer’s verified signature.
bflad Brian Flad
@liamcervante
Copy link
Member Author

Okay, so the latest changes mean the validations are skipped during the apply stage of a destroy operation only.

@liamcervante liamcervante merged commit e08d296 into main Oct 18, 2023
@liamcervante liamcervante changed the title Skip variable validations during destroy plans Skip variable validations during destroy apply operations Oct 18, 2023
@liamcervante liamcervante deleted the liamcervante/34052 branch October 18, 2023 14:49
@github-actions
Copy link
Contributor

Reminder for the merging maintainer: if this is a user-visible change, please update the changelog on the appropriate release branch.

cam72cam added a commit to opentofu/opentofu that referenced this pull request Nov 6, 2023

Verified

This commit was signed with the committer’s verified signature.
bflad Brian Flad
The original functionality of not validating during "apply" was added
in c9bc7e8.  This unfortunately caused destroy to break as the
validation checks were not pre-populated.

The change adopted by Terraform was to only run the validations if the
action is not destroy: hashicorp/terraform#34101

I decided to take a different approach and instead remove the special
case around validating only during planning (and partially during apply.
Additional calls into validation are cached during the plan -> apply
cycle and don't incur a dramatic performance penalty.  I also believe
the approach is simpler and easier to understand.

Signed-off-by: Christian Mesh <christianmesh1@gmail.com>
Copy link
Contributor

github-actions bot commented Dec 7, 2023

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 7, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
1.6-backport If you add this label to a PR before merging, backport-assistant will open a new PR once merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Terraform Crash: checkable object status report for unexpected checkable object
3 participants