
Clean up handling of check-related graph nodes #32051

Merged
merged 16 commits into from Oct 20, 2022

Conversation

jbardin
Member

@jbardin jbardin commented Oct 20, 2022

This covers a number of various loosely related issues all stemming from the condition checks handling in outputs and NoOp resource nodes:

  • Ideally the apply graph operation should not need extra knowledge about how the plan was created (normal, refresh-only, or destroy). The serialized plan, however, does not carry the information necessary to make this a reality. With the addition of conditions on outputs, output nodes need to know the current phase of operation to decide whether to register, evaluate, or ignore the conditions in the config. Resource evaluation likewise requires knowing whether a full destroy is in effect, to determine how to evaluate resources pending removal when a provider requires them for configuration. The existing workaround of checking for only Delete actions no longer works, so we must again depend on the walkOperation. Pass the plan's walkOperation through the plan to the graph builder, and make use of it as necessary in the various transformers. If we eventually produce a change set which retains the full fidelity of all objects used during the plan, we can revisit removing the walkOperation-based logic.

  • Output conditions should not be checked during a full destroy. The conditions may no longer be satisfied, or even valid, due to the overall destroy. We can use the new walkOperation information to determine when checks can be skipped. One thing which is not clear here is whether condition references should be skipped during destroy as well. It seems they should be fine, since the graph must have been correct initially, but because they don't participate in the data flow, there is a chance they could interfere with a full destroy which involves intermediary providers.

  • Replacing the root outputs with an expansion node was not complete, and singular nodes could have been inserted, skipping the registration of the condition checks. Outputs are the only node type that handles most kinds of actions during all phases with a single node, so nodeExpandOutput must have all the same graph-building information to decide which node types to return and how to evaluate checks.

  • Checks are not needed during console evaluation, but rather than inserting another exception path for their evaluation, we just make sure they are registered to prevent panics.

  • The destroy plan must refresh before creating the destroy graph, but it was using a normal plan to do so. We can switch that to a RefreshOnly plan to prevent the unnecessary evaluation and planning of resources.

  • NoOp resource nodes should not participate in destroy sequences. Now that they are always included in the graph, we need to check the change action to avoid introducing unneeded edges into the sequence. This is again especially important when destroy sequences involve intermediary providers which are not at the graph root.
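
The edge-pruning rule in the last bullet can be sketched as follows. This is a minimal stand-in, not Terraform's actual code: the Action type and destroySequenceEdge helper are hypothetical names used only to illustrate filtering on the planned change action.

```go
package main

import "fmt"

// Action is a simplified stand-in for a planned change action.
type Action int

const (
	NoOp Action = iota
	Update
	Delete
)

// destroySequenceEdge reports whether a node with the given planned
// action should receive ordering edges in a destroy sequence. NoOp
// nodes are now always present in the graph, so we must filter on the
// change action rather than assume every node participates.
func destroySequenceEdge(a Action) bool {
	return a != NoOp
}

func main() {
	fmt.Println(destroySequenceEdge(Delete)) // a real destroy participates
	fmt.Println(destroySequenceEdge(NoOp))   // a NoOp is left out of the ordering
}
```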

Fixes #31975
Fixes #31994
Fixes #32006
Fixes #32023

Module outputs may need to be evaluated during destroy in order to
possibly be used by providers. The final state, however, is that all
objects are destroyed, so preconditions should not be evaluated.

Not all root output instances were going through proper expansion when
destroy operations were involved, leading to cases where they would be
evaluated even though the expected result was only to remove them from
the state.

Normally destroy nodes stand alone in the graph, and do not produce
references to other nodes. Because root output nodes were replaced by
expansion nodes, these were being connected via normal references, even
in the case where we were working with a destroy graph.

IsFullDestroy was a workaround during apply to detect when the change
set was created by a destroy plan. This no longer works correctly, and
we need to fall back to the UIMode set in the plan.

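
The fallback described above can be sketched as follows. The types here (plan, planMode, walkOperation, operationForApply) are simplified hypothetical stand-ins, not Terraform's real plans package: the point is that the serialized changes alone cannot distinguish a destroy from a plan that merely deletes everything, so the plan's UIMode selects the walk.

```go
package main

import "fmt"

// planMode is a simplified stand-in for the mode a plan was created in.
type planMode int

const (
	normalMode planMode = iota
	refreshOnlyMode
	destroyMode
)

// walkOperation is a simplified stand-in for the graph walk mode.
type walkOperation int

const (
	walkApply walkOperation = iota
	walkDestroy
)

// plan records how it was created; the apply graph builder consults
// UIMode because the change set alone no longer reveals a full destroy.
type plan struct {
	UIMode planMode
}

func operationForApply(p *plan) walkOperation {
	if p.UIMode == destroyMode {
		return walkDestroy
	}
	return walkApply
}

func main() {
	fmt.Println(operationForApply(&plan{UIMode: destroyMode}) == walkDestroy)
	fmt.Println(operationForApply(&plan{UIMode: normalMode}) == walkApply)
}
```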
The removeRootOutputs field was not strictly used for that purpose, and
was also copied to another DestroyPlan field.

Refreshing for a destroy should use the refresh-only plan to avoid
planning new objects or evaluating conditions. This should also be
skipped if there is no state, since there would be nothing to refresh.

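
The pre-destroy refresh decision above can be sketched like this. planMode and refreshPlanFor are hypothetical stand-ins for illustration only: refresh with a refresh-only plan so nothing new is planned and no conditions are evaluated, and skip the refresh entirely when there is no prior state.

```go
package main

import "fmt"

// planMode is a simplified stand-in for the mode a plan was created in.
type planMode int

const (
	normalMode planMode = iota
	refreshOnlyMode
)

// refreshPlanFor returns the mode to use for the pre-destroy refresh,
// or false when the refresh should be skipped because no state exists.
func refreshPlanFor(stateEmpty bool) (planMode, bool) {
	if stateEmpty {
		return 0, false // nothing to refresh
	}
	return refreshOnlyMode, true
}

func main() {
	mode, ok := refreshPlanFor(false)
	fmt.Println(ok && mode == refreshOnlyMode)

	_, ok = refreshPlanFor(true)
	fmt.Println(!ok)
}
```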
NoOp changes should not participate in a destroy sequence, but because
they are included as normal update nodes the usual connections were
still being made.
@jbardin jbardin added the 1.3-backport If you add this label to a PR before merging, backport-assistant will open a new PR once merged label Oct 20, 2022
@jbardin jbardin requested a review from a team October 20, 2022 15:48
@jbardin jbardin self-assigned this Oct 20, 2022
Member

@apparentlymart apparentlymart left a comment


Thanks for digging in to all of this!

As you know, I am a non-fan of anything which makes terraform destroy behave differently than a terraform apply with a desired state that contains no resource instances, but I also accept that this is a pragmatic solution to an existing design problem that we're not going to be able to address in the v1.3 series.

If you have some ideas about what additional finer-grain information we'd need to track in individual entries of plan.Changes to avoid the need for this global flag, it'd be nice to capture that information somewhere so we'll have a starting point for subsequent work to address this tech debt. I expect it's not something we'll be able to jump into immediately, but ideally I'd like to capture whatever context you loaded in to your brain while thinking about this so that we have something to start from in subsequent design work.

internal/terraform/context_plan.go (resolved review thread on outdated code)
Comment on lines +116 to +123
// FIXME: Due to differences in how objects must be handled in the
// graph and evaluated during a complete destroy, we must continue to
// use plans.DestroyMode to switch on this behavior. If all objects
// which require special destroy handling can be tracked in the plan,
// then this switch will no longer be needed and we can remove the
// walkDestroy operation mode.
// TODO: Audit that and remove walkDestroy as an operation mode.
operation = walkDestroy
Member


I think it's pragmatic to do this right now as an interim fix to make this stuff all work, but I note that this is in direct contradiction of the doc comment on Plan.UIMode and so a potential pitfall for future maintainers.

Should we add a similar comment to the docs for Plan.UIMode, being explicit that we're currently not actually obeying the rules for that field but that we aspire to in future? My goal would be to make sure someone reading those docs can learn both that they shouldn't use it for any new behavior-affecting stuff and that they need to be careful about changing it right now because there is legacy code violating the rule.

Member Author

@jbardin jbardin Oct 20, 2022


The block also got moved, so the - is not right there. I sort of adopted the prior comment, but tried to make it clearer that it wasn't as simple as just auditing out the existing references to the operation. I'll add something similar to Plan.UIMode to make sure it's better covered.

@github-actions

Reminder for the merging maintainer: if this is a user-visible change, please update the changelog on the appropriate release branch.

@github-actions

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 20, 2022