Using if meta attribute instead of count for single-instance conditional resource #30221

Closed
sandipndev opened this issue Dec 20, 2021 · 10 comments
Labels
config, duplicate, enhancement

Comments

@sandipndev

The Problem

When we search for "Conditional Resource creation in Terraform", we always find the answer:

count = condition ? 1 : 0

While this seems simple enough, I have run into issues with this syntax numerous times. Whenever we need to make an already-deployed resource optional, one that was originally declared without count, we have to run terraform state mv to migrate its state to index 0 of the resource, which now uses count.

Moreover, if we want to deploy ONE single instance of a resource, it doesn't really make sense to have a resource[0] on the state. It might make someone analysing the state think that the resource should or might have multiple copies while in reality, the developer just wanted to deploy the resource conditionally.

Proposal

I think the following syntax would be pretty nice to have:

resource "type" "name" {
  if = condition
}

This should be implemented so that the resource is created only if the if expression evaluates to true, while still respecting count and all other meta-arguments. By default, without count, the resource should not be indexed, and the if clause would serve only as a boolean gate for whether or not the resource is created.

I understand that many resources could have used "if" as an argument to the resource definition itself, but this could be solved by placing the if conditional inside the lifecycle hook, probably?

I have felt that count = condition ? 1 : 0 is being used in many places where a simple if = condition would do. I just wanted to have a discussion on this. Also, is there already a solution to this problem that I'm missing?

What are your opinions?

@sandipndev sandipndev added the enhancement and new labels Dec 20, 2021
@sandipndev sandipndev changed the title Using if meta attribute instead of count Using if meta attribute instead of count for single-instance conditional resource Dec 20, 2021
@apparentlymart
Member

apparentlymart commented Dec 21, 2021

Hi @sandipndev,

I think there are other issues where this was discussed in the past, but it doesn't hurt to revisit and see if all of the previous assumptions/conclusions still hold.

First, I want to note that if Terraform is behaving correctly then you should not need to use terraform state mv to add a count argument to an existing resource, because Terraform has long-standing logic to detect that situation and automatically move the no-key instance to the zero-key instance. Terraform v1.1 has formalized that special case as an "implied moved block", behaving as if you wrote out a moved block to move from the no-key instance to the zero-key instance. If you don't see that happening in Terraform v1.1 then that's a bug which we'd like to investigate in a separate bug report issue. (What I've described here is only for resource count though; adding count to a module block does still require an explicit moved block, but the new v1.1 feature allows doing it within the configuration rather than requiring a separate step.)
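
As a rough sketch of what that looks like in configuration, the block below writes out the move that Terraform v1.1 is described as implying for a resource, followed by the explicit moved block that a module call still needs. The variable names, the ami value, and the ./modules/network path are placeholders invented for illustration, not anything from this issue.

```hcl
variable "create_instance" {
  type    = bool
  default = true
}

variable "create_network" {
  type    = bool
  default = true
}

variable "ami_id" {
  type = string # hypothetical input, only here to make the example complete
}

# Previously declared without count; count was added later.
resource "aws_instance" "example" {
  count = var.create_instance ? 1 : 0

  ami           = var.ami_id
  instance_type = "t3.micro"
}

# Terraform v1.1 behaves as if this block were present for resources,
# so writing it out is optional. It is shown only to make the move visible.
moved {
  from = aws_instance.example
  to   = aws_instance.example[0]
}

# For a module call the move is not implied and must be written explicitly.
module "network" {
  source = "./modules/network" # hypothetical module path
  count  = var.create_network ? 1 : 0
}

moved {
  from = module.network
  to   = module.network[0]
}
```

With either moved block in place, the plan should report the object as moved rather than as destroyed and recreated.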


However, it seems like your stronger concern here is the need to refer to index zero when using the resource elsewhere in the module.

An important part of declaring a conditional number of instances of a resource is explaining to Terraform how to handle different numbers of instances elsewhere in your module. For example, consider:

resource "aws_instance" "example" {
  count = condition ? 1 : 0
}

It's only valid to refer to aws_instance.example[0] if count = 1, so any expression you write elsewhere in the module must somehow handle the case where count = 0, and therefore aws_instance.example is an empty list. The Terraform language already has various features for dealing with possibly-empty lists, such as the splat operator [*] for concisely accessing an attribute of each element, with aws_instance.example[*].id producing an empty list if there are no instances of aws_instance.example.
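
To make that concrete, here is a minimal sketch of references that stay valid for both count = 0 and count = 1. The variable names and the aws_eip consumer are assumptions added for illustration.

```hcl
variable "create_instance" {
  type    = bool
  default = true
}

variable "ami_id" {
  type = string # hypothetical input
}

resource "aws_instance" "example" {
  count = var.create_instance ? 1 : 0

  ami           = var.ami_id
  instance_type = "t3.micro"
}

# The splat expression produces an empty list when count = 0 and a
# one-element list when count = 1, so this output is valid either way.
output "instance_ids" {
  value = aws_instance.example[*].id
}

# A dependent resource can follow the same zero-or-one shape, which means
# the index expression is only evaluated when an instance actually exists.
resource "aws_eip" "example" {
  count = length(aws_instance.example)

  instance = aws_instance.example[count.index].id
}
```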

resource "aws_instance" "example" {
  if = condition
}

With all of that said then, it's been a common theme in proposals of this sort to propose a syntax for declaring the condition (an if argument, in your case) without also explaining what expressions elsewhere in the module referring to such a resource would look like, while properly handling both the if = true and if = false cases. That is actually the more complicated part of designing this feature, because references to resources can appear in various different contexts that must all have some answer for how to deal with that situation.

In previous discussions some have suggested making aws_instance.example be null if the condition is false. That means that an expression like aws_instance.example.id would fail in that case, because a null value has no attributes. The splat operator [*] is defined in such a way that it could potentially help here, because applying it to a non-list value activates the Single Values as Lists mode: aws_instance.example[*].id would be valid in both the true and false cases, but now we're back to a zero-or-one element list again, and so we haven't really changed anything about how that would be used elsewhere in the module.
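
The Single Values as Lists behavior can be tried today with an ordinary nullable value. In the sketch below, a hypothetical object variable stands in for the "resource that might be null" idea, since no such resource syntax exists.

```hcl
# Stand-in for a hypothetical conditionally-created resource:
# either an object with an id, or null when "disabled".
variable "instance" {
  type = object({
    id = string
  })
  default = null
}

# Applying [*] to a non-list value wraps it in a single-element list,
# and applying it to null yields an empty list. The result is therefore
# a zero-or-one element list, the same shape count already produces.
output "instance_ids" {
  value = var.instance[*].id
}
```

By contrast, writing var.instance.id directly would fail whenever the variable is null, which is the problem described above.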

If you have any other thoughts about what aws_instance.example.id might return in the disabled case (other than null) and can show how that result would be used elsewhere in the module, I'd love to consider some alternatives here.


By the way, in the meantime if you want to make it clearer to future maintainers of your module that you intend there to be only zero or one of something, you can use the one function to represent that. It serves as a dynamic assertion that a list has only zero or one elements, generating an error if it has two or more, and returns the value of element zero if there is one.

You do still need to figure out how to contend with the resulting null result in the empty list case, which presents similar challenges to what I described above, but if you are writing an expression in a context where a null would be valid then that can be relatively concise, such as:

  instance_id = one(aws_instance.example[*].id)
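
A slightly fuller sketch of the same idea, assuming a resource declared with count as in the earlier example. The variable names and the "none" fallback are illustrative choices, not recommendations.

```hcl
variable "create_instance" {
  type    = bool
  default = true
}

variable "ami_id" {
  type = string # hypothetical input
}

resource "aws_instance" "example" {
  count = var.create_instance ? 1 : 0

  ami           = var.ami_id
  instance_type = "t3.micro"
}

# one() returns the single element, or null for an empty list, and
# raises an error if the list ever has two or more elements.
output "instance_id" {
  value = one(aws_instance.example[*].id)
}

# Where null is not acceptable, a fallback has to be chosen explicitly.
# coalesce() returns its first argument that is neither null nor "".
output "instance_private_ip" {
  value = coalesce(one(aws_instance.example[*].private_ip), "none")
}
```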

@apparentlymart apparentlymart added the config label and removed the new label Dec 21, 2021
@tomharrisonjr

I wrote a broader blog post on this topic, and have gotten a lot of positive feedback. I hope you'll consider how those of us who are daily TF developers and module writers have to interact with the software.

@kalinon

kalinon commented Jan 26, 2022

much needed feature

@twbecker

I'm glad to find this; I was considering filing this exact enhancement request. While I understand and agree with the desire to have a better way to refer to these types of "optional" resources, I think we shouldn't let the lack of a good way to improve that stop us from improving how these resources are declared. Using count for this purpose feels like abusing the system; if, enabled, etc. would not only express the author's intent better but also remove the need for the ? 1 : 0 ternary. That is a real improvement in my view, even if the resources still had to be referred to with the list syntax from other places.

@jdelforno

> Hi @sandipndev,
>
> I think there are other issues where this was discussed in the past, but it doesn't hurt to revisit and see if all of the previous assumptions/conclusions still hold.
>
> With all of that said then, it's been a common theme in proposals of this sort to propose a syntax for declaring the condition (an if argument, in your case) without also explaining what expressions elsewhere in the module referring to such a resource would look like, while properly handling both the if = true and if = false cases. That is actually the more complicated part of designing this feature, because references to resources can appear in various different contexts that must all have some answer for how to deal with that situation.
>
> In previous discussions some have suggested making aws_instance.example be null if the condition is false. That means that an expression like aws_instance.example.id would fail in that case, because a null value has no attributes. The splat operator [*] is defined in such a way that it could potentially help here, because applying it to a non-list value activates the Single Values as Lists mode: aws_instance.example[*].id would be valid in both the true and false cases, but now we're back to a zero-or-one element list again, and so we haven't really changed anything about how that would be used elsewhere in the module.
>
> If you have any other thoughts about what aws_instance.example.id might return in the disabled case (other than null) and can show how that result would be used elsewhere in the module, I'd love to consider some alternatives here.

Would a possible way forward simply be:
enable = var.resourceEnabled

And then make use of depends_on for resources that have no direct dependencies? If the child resource tries to create, it checks the parent, sees the flag set, issues a warning that it will not be created because the parent is not enabled, and then does nothing.

@chrs-myrs

Maybe I'm pushing TF in a direction it's not designed for, but it's hard to communicate how much frustration the lack of this feature causes when trying to create consistency across development stages whilst also having some infra modules not deployed to some environments. This is especially true if you try to retrospectively introduce a count meta-argument, since Terraform then often wants to recreate all of those resources (I realise it tries to move them, but this doesn't always work). It's my top WTF item since delving into devops. I understand where the language developed from and why this doesn't exist, but it's now impossible to create consistent but differing environments with anything close to DRY code.

Personally, I only care about pre-determined feature flags (e.g. set in a .tfvars file); the if condition would never be calculated based on the output of other resources. Perhaps that constraint would simplify the implementation?
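
For reference, this is roughly how such a static feature flag is expressed with count today. The flag name enable_bastion, the resource, and the tfvars file name are made up for the example.

```hcl
# Feature flag whose value is known at plan time because it comes
# straight from input variables, never from another resource's output.
variable "enable_bastion" {
  type        = bool
  default     = false
  description = "Whether this environment gets a bastion host."
}

variable "ami_id" {
  type = string # hypothetical input
}

resource "aws_instance" "bastion" {
  count = var.enable_bastion ? 1 : 0

  ami           = var.ami_id
  instance_type = "t3.micro"
}

# In a per-environment file such as staging.tfvars (hypothetical name):
#
#   enable_bastion = true
#   ami_id         = "ami-..."
```

Selecting the environment is then a matter of passing the matching file with -var-file. The resource is still addressed as aws_instance.bastion[0], which is the part this issue would like to avoid.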

@chrs-myrs

Probably duplicate of #21953

@rkennedy-ki

> Maybe I'm pushing TF in a direction it's not designed for, but it's hard to communicate how much frustration the lack of this feature causes when trying to create consistency across development stages whilst also having some infra modules not deployed to some environments. This is especially true if you try to retrospectively introduce a count meta-argument, since Terraform then often wants to recreate all of those resources (I realise it tries to move them, but this doesn't always work). It's my top WTF item since delving into devops. I understand where the language developed from and why this doesn't exist, but it's now impossible to create consistent but differing environments with anything close to DRY code.

In theory, everything works all the time. In practice, the theory never works out quite right.

The world is a messy place, with changing requirements and a multitude of technical and non-technical reasons. I've been using terraform for a long time and find myself in a new shop, completely retro-fitting all their services and teams with infrastructure-as-code and modern devops practices.

Having used count = 0 in the past and having had a terrible time as a result (all the things @chrs-myrs mentioned), I was hoping the language had maybe finally evolved some kind of easy-to-use conditional, but alas, I find the debate still rages on.

Conditional logic is always going to be the fastest and simplest way to "gate" something out of an environment where you really can't have it running, in a code-base you really can't be dramatically modifying for one small workaround. We all want perfectly factored, beautiful codebases managing clean, immutable infrastructure... but a lot of the world is not there yet and needs some additional time and help getting there. Help like "basic if conditionals" to implement crude feature-flags.

My lazy workaround is to mv .tf files in and out of the directory based on the environment I'm targeting. It's exceptionally crude and not a real source-code-based solution, but for the standalone sysadmin, it might get you by.

@apparentlymart
Member

Hi again all!

Lately I've noticed that this request is captured in a few different issues, which is causing the design discussion to get fragmented. I'm going to close this one in favor of #21953 because that one has the most existing discussion and the most existing 👍 upvotes.

If you currently have an upvote attached to this issue then I suggest transferring it over to #21953 so we can track all of that in a central place. Thanks!

@apparentlymart apparentlymart closed this as not planned Aug 29, 2022
@crw crw added the duplicate label Aug 31, 2022
@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 30, 2022

9 participants