Extra feedback for a variety of "move" consequences and edge-cases #29637

apparentlymart · 2021-09-23T19:04:51Z

Although in most cases Terraform's handling of the new moved blocks should be straightforward and intuitive, there are inevitably some less common cases where the behavior may be surprising enough to warrant further explanation.

This set of changes aims to add a first round of explanations of that sort, focused on the following situations:

Colliding moves, where two objects try to move to the same address in spite of our basic static validation rules.

This can typically apply only when someone's previously used terraform state mv or terraform apply -target=... to create a situation that wouldn't be reachable through the standard workflow.

However, there is at least one situation where it's reachable even in the standard workflow: if we need to resolve both module.foo moves to module.bar and aws_instance.a moves to module.bar.aws_instance.a then both of those moves would be considered structurally valid, but if our prior state already had an object at module.foo.aws_instance.a then the two aws_instance.a objects would conflict with one another to try to occupy the address module.bar.aws_instance.a.

This is not a situation we expect folks to reach regularly, and so one of the changes here adds an additional warning message to the plan result in that unusual case, which is intended to help explain why Terraform is planning to destroy an object that the user might have expected to be moved to a new address which is still declared in configuration. We don't have enough information to derive a good sense of user intent in that situation, and so the warning just makes a general observation that you can use terraform state subcommands to make things less ambiguous if you're not satisfied with the resolution Terraform proposed.
Moving to an address that isn't declared in the configuration.

The main expected use of moved blocks is to set the to argument to refer to an address which has a corresponding declaration in the configuration. In that case, the happy path is for Terraform to simply pretend that an existing object had originally been created at the new address and not propose any other changes at all.

However, because moved blocks are a historical record that will (in most cases) remain in modules indefinitely, we do need to allow for a later version of the module possibly removing the declaration of that object. Our rules for moves are that we should always generate the same effect for two actions handled together in the same plan as we would for those actions separated into two runs, and so therefore we must tolerate moving into an undeclared address and plan to delete any object that moved into that undeclared address.

An unfortunate consequence of this rule is that if an author makes a mistake when writing the to address then Terraform will typically propose to destroy the object. As a compromise then, the changes here add some additional context to a planned destroy to help draw attention to the mismatch between configuration and state.

Although the motivation to do this now was to give better feedback about incorrect moved statements, adding this additional feedback for planned destroy has been on my wishlist for a while now, and so I implemented it in a generic way which works even in the normal case where there's no moving going on. I hope that this will be useful to help users disambiguate the several different reasons that can cause Terraform to declare a particular resource instance as an "orphan" (no longer declared in configuration) and thus figure out what they need to change in the configuration if they don't want to delete the object.

While I was in the area anyway, I also made some minor adjustments to how we present the extra note about a resource instance having moved, which we'd planned to do in an earlier PR but accidentally overlooked it.

The following example output shows a rather gnarly combination of these situations all together, just for illustrative purposes. In practice I would expect it to be extremely rare to see all of these appear together in the same proposed plan. (I created the conflict between null_resource.a[0] and null_resource.a through some malicious use of terraform state mv prior to requesting this plan.)

Terraform will perform the following actions:

  # null_resource.a[0] will be destroyed
  # (because resource does not use count)
  - resource "null_resource" "a" {
      - id = "6362033012182345903" -> null
    }

  # null_resource.d will be destroyed
  # (because null_resource.d is not in configuration)
  # (moved from null_resource.c)
  - resource "null_resource" "d" {
      - id = "7027155152011416384" -> null
    }

Plan: 0 to add, 0 to change, 2 to destroy.
╷
│ Warning: Unresolved resource instance address changes
│ 
│ Terraform tried to adjust resource instance addresses in the prior
│ state based on change information recorded in the configuration,
│ but some adjustments did not succeed due to existing objects
│ already at the intended addresses:
│   - null_resource.a[0] could not move to null_resource.a
│ 
│ Terraform has planned to destroy these objects. If Terraform's
│ proposed changes aren't appropriate, you must first resolve the
│ conflicts using the "terraform state" subcommands and then create a
│ new plan.
╵

Making the implicit zero-to-none and none-to-zero key moves more visible in the UI here drew attention to a quirk of my earlier work in #29605, in the situation of moving from for_each to count rather than for_each to no repetition. I also fixed that while here, because otherwise the reason given for proposing to delete the old object didn't actually make sense.

Our previous rule for implicitly moving from IntKey(0) to NoKey would apply that move even when the current resource configuration uses for_each, because we were only considering whether "count" were set. Previously this was relatively harmless because the resource instance in question would end up planned for deletion anyway: neither an IntKey nor a NoKey are valid keys for for_each. Now that we're going to be announcing these moves explicitly in the UI, it would be confusing to see Terraform report that IntKey moved to NoKey in a situation where the config changed from count to for_each, so to address that we'll only generate the implied statement if neither repetition argument is set.

There are a few different reasons why a resource instance tracked in the prior state might be considered an "orphan", but previously we reported them all identically in the planned changes. In order to help users understand the reason for a surprising planned delete, we'll now try to specify an additional reason for the planned deletion, covering all of the main reasons why that could happen. This commit only introduces the new detail to the plans.Changes result, though it also incidentally exposes it as part of the JSON plan result in order to keep that working without returning errors in these new cases. We'll expose this information in the human-oriented UI output in a subsequent commit.

The core runtime is now able to specify a reason for some situations when Terraform plans to delete a resource instance. This commit makes that information visible in the human-oriented UI. A previous commit already made the underlying data informing these new hints visible as part of the machine-oriented (JSON) plan output. This also removes the bold formatting from the existing "has moved to" hints, because subjectively it seemed like the result was emphasizing too many parts of the output and thus somewhat defeating the benefit of the emphasis in trying to create additional visual hierarchy for sighted users running Terraform in a terminal. Now only the first line containing the main action statement will be in bold, and all of the parenthesized follow-up notes will be unformatted.

In most cases Terraform will be able to automatically fully resolve all of the pending move statements before creating a plan, but there are some edge cases where we can end up wanting to move one object to a location where another object is already declared. One relatively-obvious example is if someone uses "terraform state mv" in order to create a set of resource instance bindings that could never have arising in normal Terraform use. A less obvious example arises from the interactions between moves at different levels of granularity. If we are both moving a module to a new address and moving a resource into an instance of the new module at the same time, the old module might well have already had a resource of the same name and so the resource move will be unresolvable. In these situations Terraform will move the objects as far as possible, but because it's never valid for a move "from" address to still be declared in the configuration Terraform will inevitably always plan to destroy the objects that didn't find a final home. To give some additional explanation for that result, here we'll add a warning which describes what happened. This is not a particularly actionable warning because we don't really have enough information to guess what the user intended, but we do at least prompt that they might be able to use the "terraform state" family of subcommands to repair the ambiguous situation before planning, if they want a different result than what Terraform proposed.

Because our validation rules depend on some dynamic results produced by actually running the plan, we deal with moves in a "backwards" order where we try to apply them first -- ignoring anything strange we might find -- and then validate the original statements only after planning. An unfortunate consequence of that approach is that when the move statements are invalid it's likely that move execution will not fully complete, and so the generated plan is likely to be incorrect and might well include errors resulting from the unresolved moves. To mitigate that, here we let any move validation errors supersede all other diagnostics that the plan phase might've generated, in the hope that it'll help the user focus on fixing the incorrect move statements without creating confusing by reporting errors that only appeared as a quick of how Terraform worked around the invalid move statements earlier.

alisdair · 2021-09-24T15:06:22Z

internal/plans/changes.go

+	// arguments. This difference in detail is out of pragmatism, because
+	// potentially multiple nested modules could all contribute conflicting
+	// specific reasons for a particular instance to no longer be declared.
+	ResourceInstanceDeleteBecauseNoModule ResourceInstanceChangeActionReason = 'M'


When adding new reasons here, I think we should also update the streaming JSON UI output to match:

terraform/internal/command/views/json/change.go

Lines 70 to 94 in c8cd0b1

const (

ReasonNone ChangeReason = ""

ReasonTainted ChangeReason = "tainted"

ReasonRequested ChangeReason = "requested"

ReasonCannotUpdate ChangeReason = "cannot_update"

ReasonUnknown ChangeReason = "unknown"

)

func changeReason(reason plans.ResourceInstanceChangeActionReason) ChangeReason {

switch reason {

case plans.ResourceInstanceChangeNoReason:

return ReasonNone

case plans.ResourceInstanceReplaceBecauseTainted:

return ReasonTainted

case plans.ResourceInstanceReplaceByRequest:

return ReasonRequested

case plans.ResourceInstanceReplaceBecauseCannotUpdate:

return ReasonCannotUpdate

default:

// This should never happen, but there's no good way to guarantee

// exhaustive handling of the enum, so a generic fall back is better

// than a misleading result or a panic

return ReasonUnknown

}

}

I'll submit a PR for these changes shortly. Can you think of any automated way to keep these two enums in sync for future changes? Maybe adding nishanths/exhaustive as a CI check?

github-actions · 2021-10-25T02:07:52Z

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

apparentlymart added enhancement cli labels Sep 23, 2021

apparentlymart requested a review from a team September 23, 2021 19:04

apparentlymart self-assigned this Sep 23, 2021

apparentlymart added 4 commits September 23, 2021 12:13

apparentlymart force-pushed the f-moved-unusual-feedback branch from 3c6c5c1 to f371f46 Compare September 23, 2021 19:13

jbardin approved these changes Sep 23, 2021

View reviewed changes

apparentlymart merged commit d97ef10 into main Sep 23, 2021

apparentlymart deleted the f-moved-unusual-feedback branch September 23, 2021 21:37

alisdair reviewed Sep 24, 2021

View reviewed changes

github-actions bot locked as resolved and limited conversation to collaborators Oct 25, 2021

laurapacilio added the documentation label Nov 5, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extra feedback for a variety of "move" consequences and edge-cases #29637

Extra feedback for a variety of "move" consequences and edge-cases #29637

apparentlymart commented Sep 23, 2021

alisdair Sep 24, 2021

alisdair Sep 24, 2021

github-actions bot commented Oct 25, 2021

	const (
	ReasonNone ChangeReason = ""
	ReasonTainted ChangeReason = "tainted"
	ReasonRequested ChangeReason = "requested"
	ReasonCannotUpdate ChangeReason = "cannot_update"
	ReasonUnknown ChangeReason = "unknown"
	)

	func changeReason(reason plans.ResourceInstanceChangeActionReason) ChangeReason {
	switch reason {
	case plans.ResourceInstanceChangeNoReason:
	return ReasonNone
	case plans.ResourceInstanceReplaceBecauseTainted:
	return ReasonTainted
	case plans.ResourceInstanceReplaceByRequest:
	return ReasonRequested
	case plans.ResourceInstanceReplaceBecauseCannotUpdate:
	return ReasonCannotUpdate
	default:
	// This should never happen, but there's no good way to guarantee
	// exhaustive handling of the enum, so a generic fall back is better
	// than a misleading result or a panic
	return ReasonUnknown
	}
	}

Extra feedback for a variety of "move" consequences and edge-cases #29637

Extra feedback for a variety of "move" consequences and edge-cases #29637

Conversation

apparentlymart commented Sep 23, 2021

alisdair Sep 24, 2021

Choose a reason for hiding this comment

alisdair Sep 24, 2021

Choose a reason for hiding this comment

github-actions bot commented Oct 25, 2021