
second destroy (after successful first) returns 'The "count" value depends on resource attributes that cannot be determined until apply, so Terraform cannot predict how many instances will be created.' #32126

Closed
jaffel-lc opened this issue Oct 31, 2022 · 16 comments · Fixed by #32208

@jaffel-lc

Terraform Version

Terraform v1.3.3
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v4.34.0

Terraform Configuration Files

variable "external_acm" {
  type        = bool
  default     = false
}

variable "lb_certificate_arn" {
  type        = string
  default     = ""
  description = "If running HTTPS then a valid certificate arn must be provided."
}

variable "route53_zone_id" {
  type        = string
  default     = null
  description = "Route53 Zone ID for the domain served by this load balancer"
}

resource "aws_acm_certificate_validation" "certval" {
  count                   = (var.external_acm || var.lb_certificate_arn != "" || var.route53_zone_id == null) ? 0 : 1
  certificate_arn         = aws_acm_certificate.cert[0].arn
  validation_record_fqdns = [aws_route53_record.cert_validation[0].fqdn]

  lifecycle {
    create_before_destroy = true
  }
}


Debug Output

https://gist.github.com/jaffel-lc/78be590f8fddae0426adbffba844f374

Expected Behavior

The second destroy should complete without error (just like the first) and without destroying anything.

Actual Behavior

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

╷
│ Error: Invalid count argument
│
│   on ../certificate.tf line 30, in resource "aws_acm_certificate_validation" "certval":
│   30:   count                   = (var.external_acm || var.lb_certificate_arn != "" || var.route53_zone_id == null) ? 0 : 1
│
│ The "count" value depends on resource attributes that cannot be determined until apply, so Terraform cannot predict how
│ many instances will be created. To work around this, use the -target argument to
│ first apply only the resources that the count depends on.
╵

Steps to Reproduce

  1. terraform init
  2. terraform apply
  3. terraform destroy
  4. terraform destroy

Additional Context

No response

References

No response

@jaffel-lc jaffel-lc added bug new new issue not yet triaged labels Oct 31, 2022
@jaffel-lc
Author

While this appears to be a harmless error, the erroneously failing terraform destroy command's exit status is 1, which makes our build/test pipeline report failure from its last (global cleanup) step, despite having successfully destroyed all of the resources in the previous step.

@apparentlymart
Member

Hi @jaffel-lc! Thanks for reporting this.

I think what's going on here is that Terraform creates a normal plan as a precursor to creating a destroy plan, because a normal plan refreshes the previous run state and can therefore detect if something has already been destroyed and so doesn't need to be "re-destroyed". However, a normal plan also needs to expand all resource blocks that have repetition arguments, so it can run into this problem if the repetition depends on something that hasn't been created yet, in this case because you've literally just destroyed it.

If I'm right about the cause then the good news is that we changed the approach to that in #32051 for an unrelated reason, and so as of the next release Terraform will internally use a refresh-only plan for that initial refreshing step. Terraform didn't behave this way before just because the destroy feature has been around longer than the possibility of refresh-only plans, and so it was implemented in terms of the primitives that were available at the time.

That change was backported into the v1.3 branch and so will be included in the forthcoming v1.3.4 release. Once that's out (which should be in the next week or so), could you give that a try and see if the problem still occurs? Thanks!

@jaffel-lc
Author

That is great news.
I'll test again once I see that 1.3.4 is out.

@jaffel-lc jaffel-lc reopened this Nov 8, 2022
@jaffel-lc
Author

jaffel-lc commented Nov 8, 2022

Tested.

Now I am getting several "invalid index" errors.

Here is one example:

│ Error: Invalid index                                                                                                                                                                 
│                                                                                                                                                                                       
│   on .terraform/modules/t3_task1/main.tf line 5, in locals:
│    5:   log_group_name = var.log_group_name != "" ? var.log_group_name : aws_cloudwatch_log_group.task-log-group[0].name
│      ├────────────────
│      │ aws_cloudwatch_log_group.task-log-group is empty tuple
│
│ The given key does not identify an element in this collection value: the collection has no elements.                 

[updated error formatting for readability]

@apparentlymart
Member

Thanks @jaffel-lc!

It seems that Terraform is noticing that there are now no instances of aws_cloudwatch_log_group.task-log-group and so is rejecting this access of element zero as invalid.

I would agree that this doesn't seem right, but I'm also not really sure what Terraform ought to do instead here. It is true that there is not an index zero in the state, but there is still presumably an index zero declared in the configuration. It's weird to evaluate something in the configuration against the current state rather than the desired state, but in this context Terraform isn't actually building a desired state and so it can't refer to that.

With that said, it does seem like there's an opportunity to improve this case, but it's not clear to me exactly what change is valid to make here while still allowing the refresh phase prior to destroy to work. We'll need to think more about that before deciding how to proceed here.

In the meantime I think unfortunately the most viable strategy with today's Terraform is to somehow avoid running terraform destroy a second time. One possible answer to that would be to run terraform show -json and count how many resource instances are listed in the resulting JSON representation of the state; if you find none then you know that everything has already been destroyed and can skip running terraform destroy again.
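For example, a minimal shell sketch of that check (not from the thread; it assumes jq is installed, and the exact JSON layout may vary by Terraform version):

#!/usr/bin/env sh
# Count resource instances still tracked in state; only run the second
# destroy when something is actually left.
instances=$(terraform show -json | jq '[.. | objects | .resources? // empty | .[]] | length')

if [ "${instances:-0}" -eq 0 ]; then
  echo "State is already empty; skipping the second destroy."
else
  terraform destroy -auto-approve
fi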

@jaffel-lc
Author

With that said, it does seem like there's an opportunity to improve this case, but it's not clear to me exactly what change is valid to make here while still allowing the refresh phase prior to destroy to work. We'll need to think more about that before deciding how to proceed here.

One thought is that an error caused by outputs could have a different exit value than one caused by e.g. a syntax error, and then our automations could choose to treat that specific exit code as not-an-error.

Another is that Terraform could skip looking up an output value if its originating resource is about to be destroyed, or does not exist, since the output will be cleared anyway.

We have a cleanup task that runs a second destroy as a workaround for failures we have encountered when ASGs, ECS services, and/or ECS clusters do not complete their destroys before Terraform times out waiting for them. The second destroy usually clears the resources, or manages to resync the state file.

@jaffel-lc
Author

But it occurs to me now that I could wrap the resource[0].name reference in a try() call.
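For example, a minimal sketch of that pattern applied to the local from the error above (illustrative only; the empty-string fallback is an assumption, not code from the module):

locals {
  # try() falls back to "" when task-log-group has no instances,
  # e.g. during the pre-destroy refresh.
  log_group_name = var.log_group_name != "" ? var.log_group_name : try(aws_cloudwatch_log_group.task-log-group[0].name, "")
}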

@jbardin
Member

jbardin commented Nov 10, 2022

I'm going to take a look into this because it's very similar to some related destroy time errors. The problem generally arises while attempting to refresh the instances, which during destroy is primarily used to ensure providers have the most recent values for their configuration, and to remove any instances which may have already been deleted. In the meantime, a better workaround may be to use -refresh=false to skip this process, which is probably not very useful to begin with if the destroy operations are run back-to-back.
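For example, a back-to-back cleanup could skip the refresh on the second pass (illustrative sketch, not from the thread):

terraform destroy -auto-approve
terraform destroy -refresh=false -auto-approve   # skip the pre-destroy refresh that triggers the error

Note that without the refresh, instances that were already deleted outside of Terraform are not pruned from the plan first.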

@jbardin jbardin self-assigned this Nov 10, 2022
@MattJeanes

MattJeanes commented Nov 10, 2022

This has hit us too on 1.3.4; downgrading to 1.2.9 appears to work for now. Fingers crossed for a proper fix soon! 😄

Interestingly, I found that downgrading to 1.3.2 initially appeared to fix the issue for one case I was seeing, but it then failed in other cases, whereas 1.2.9 seems to work properly.

@sbocinec

Now I am getting several "invalid index" errors.
The given key does not identify an element in this collection value: the collection has no elements.

Spent half of my day today investigating our integration test suite failing with the following error after upgrading from 1.2.9 to 1.3.4:

Error: Invalid index

  on main.tf line 49, in locals:
  49:     ? module.network[0].vpc_id
    ├────────────────
    │ module.network is empty tuple

The given key does not identify an element in this collection value: the collection has no elements.

I was trying to hunt down the issue, and now I see the culprit is indeed in the destroy fixes in 1.3.x causing these nasty bugs. I hope this is going to be fixed soon 🤞

@RichardGTsl

Hi,
I've just run into an "empty tuple" problem as well. In my case, it prevents a second destroy from cleaning up if the first destroy left some resources behind.

In my case the resource that triggers the fault is https://github.com/terraform-aws-modules/terraform-aws-sqs/blob/cf30bb3498d39969590e4d47bbce56b02f1dc9a5/main.tf#L30. This is the line that fails:
arn = aws_sqs_queue.this[0].arn
I couldn't find a way to work around this[0] being undefined.

This is only broken in 1.3.4 which suggests that there's a fundamental change in behavior between 1.3.3 and 1.3.4. The other versions I tested were 1.3.3, 1.3.2, 1.3.0, 1.2.9 and 1.2.1.

cheers,
Richard

@sbocinec

In my case, even the first destroy fails if the root module uses a child module that uses a data source and terraform destroy is executed with -refresh=false.

In our use case we always destroy without refreshing to avoid data source failures, and with 1.3.4 it fails consistently, even on the first destroy, with this error.

@jbardin
Member

jbardin commented Nov 16, 2022

@sbocinec, Thanks, that would be a different issue, and is unlikely to be affected by any fix to this one. The current problem happens during the pre-destroy refresh which is skipped with -refresh=false. If you have an example you could post in a new issue, it would be helpful.

@timblaktu

timblaktu commented Nov 17, 2022

Happy to join your ranks, fellows. Like @sbocinec I've been troubleshooting this issue for days now. I was first focusing on the changes I had made in forks of the modules I'm using from my root module (eks_blueprints, terraform-aws-eks, eks_blueprints_kubernetes_addons). After I fixed a few unrelated bugs, I started seeing the 'the collection has no elements' error pattern at destroy time, which I fixed by applying this pattern to each case that popped up, whack-a-mole style:

  # Harden all references to variable-length list resources using a splat expression
  # and coalescelist() to prevent "the collection has no elements" errors.
  # Before: node_security_group_id = local.create_node_sg ? aws_security_group.node[0].id : var.node_security_group_id
  node_security_group_id = local.create_node_sg ? coalescelist(aws_security_group.node[*].id, [""])[0] : var.node_security_group_id
  # The splat expression yields an empty list when there are no instances, and
  # coalescelist(any_list, [""]) always returns a list with at least one element,
  # so indexing with [0] can no longer fail.

After "hardening" all the direct references to the first element of variable-length resource lists in my modules, I then started seeing the same pattern in the outer modules I hadn't made any changes to.

Finally convinced "It's you, not me" I was able to find this issue. Happy to be here. :-)

What can I do to help?

I've inferred from the comments of @apparentlymart and @jbardin that this "variable-length list hardening" should be unnecessary, and that this misbehavior is caused by the new "destroy-time refresh-only plan" feature in 1.3.4.

For now I will try @jbardin's recommendation of passing -refresh=false arg to all terraform destroy calls.

Nota Bene

Terraform Destroy Sequence

In my case, and for many other people I've seen hitting this issue who are using the eks_blueprints module, there is not a single terraform destroy call. A sequence of targeted terraform destroy calls, followed by a final non-targeted terraform destroy call, is required to "stage" the top-down destruction of the layers of infrastructure (and applications) that Terraform is managing. This is recommended by the eks_blueprints project, and I can endorse the approach when managing k8s clusters with Terraform: there are numerous race conditions that can occur if you do not stage things like this. Tools like terragrunt, and scripts orchestrated by GNU Make, make this easier.

Same Misbehavior Reported in EKS Blueprints

This failure mode is being tracked here for eks_blueprints.

Same Misbehavior Reported in terraform-aws-eks

Here is a closed three-year-old issue in terraform-aws-eks which I thought the core devs/contributors @apparentlymart and @jbardin may find interesting, since it's the same misbehavior. The changeset that resolved that issue is where I got the idea for my "variable-length list hardening" patch above.

Module Forks/Branches Used to Test my Patches

My root module currently uses this fork/branch of eks_blueprints, which was originally created to add support for the Crossplane helm and terraform providers. I had validated all its functionality and was about to submit a PR when I started noticing "collection has no elements" errors. (I upgraded to 1.3.4 somewhere in that process.) My eks_blueprints fork crossplane-helm-provider branch uses my terraform-aws-eks fork 568-redux branch, which also contains these hardening patches.

@timblaktu

timblaktu commented Nov 17, 2022

@jbardin in my first test, using -refresh=false for every terraform destroy call prevents the "collection has no elements" errors I was seeing just prior with the same code without this arg. :-)

However :-( there remains a problem with not refreshing during destroy: after successfully completing my destroy sequence, data sources are not destroyed and remain in the state. I've tried to work around this by appending YADS (Yet Another Destroy Stage) to the end of my sequence to run a normal, non-targeted, with-refresh, terraform destroy. However, these data sources were still not destroyed:

main  | 2022-11-17T15:54:05.033699600Z Terraform state for workspace pr-tim-usw1 now contains:

    data.aws_availability_zones.available
    data.aws_caller_identity.current
    data.aws_iam_policy_document.managed_ng_assume_role_policy
    data.aws_region.current
    data.aws_secretsmanager_secret_version.cluster_unsealed["cluster-unsealed-argocd2022-gitlab-repo-cred"]
    data.aws_secretsmanager_secrets.cluster_unsealed
    data.kubectl_path_documents.sealed_secrets
    module.eks_blueprints.data.aws_caller_identity.current
    module.eks_blueprints.data.aws_iam_policy_document.eks_key
    module.eks_blueprints.data.aws_iam_session_context.current
    module.eks_blueprints.data.aws_partition.current
    module.eks_blueprints.data.aws_region.current
    module.eks_blueprints_kubernetes_addons.data.aws_caller_identity.current
    module.eks_blueprints_kubernetes_addons.data.aws_partition.current
    module.eks_blueprints_kubernetes_addons.data.aws_region.current
    module.eks_blueprints.module.aws_eks.data.aws_caller_identity.current
    module.eks_blueprints.module.aws_eks.data.aws_default_tags.current
    module.eks_blueprints.module.aws_eks.data.aws_iam_policy_document.assume_role_policy[0]
    module.eks_blueprints.module.aws_eks.data.aws_partition.current
    module.eks_blueprints_kubernetes_addons.module.crossplane[0].data.aws_iam_policy_document.s3_policy
    module.eks_blueprints.module.aws_eks.module.kms.data.aws_caller_identity.current
    module.eks_blueprints.module.aws_eks.module.kms.data.aws_partition.current

The complete destroy sequence I'm using is:

echo "Commencing destroy sequence, using -refresh=false to work around https://github.com/hashicorp/terraform/issues/32126..."
echo "Destroying eks_blueprints_kubernetes_addons..."
time terraform destroy -refresh=false -target="module.eks_blueprints_kubernetes_addons" -input=false -auto-approve -compact-warnings ${TF_EXTRA_ARGS:+"${TF_EXTRA_ARGS}"}
echo "Destroying eks_blueprints..."
time terraform destroy -refresh=false -target="module.eks_blueprints" -input=false -auto-approve -compact-warnings ${TF_EXTRA_ARGS:+"${TF_EXTRA_ARGS}"}
echo "Clean up remaining resources in terraform state with a non-targeted destroy..."
echo "    ${RED}Note: because we're using -refresh=false to work around destroy-time misbehaviors, data sources will not get cleaned up here${RESET}"
time terraform destroy -refresh=false -input=false -auto-approve -compact-warnings ${TF_EXTRA_ARGS:+"${TF_EXTRA_ARGS}"}
echo "Finally, clean up any remaining objects in terraform state (should be only data sources at this point) with a non-targeted destroy WITH -refresh=true..."
time terraform destroy -input=false -auto-approve -compact-warnings ${TF_EXTRA_ARGS:+"${TF_EXTRA_ARGS}"}

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 18, 2022