Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lock file handling strategy is not best practice? #1653

Open
james-skinner-deltatre opened this issue Apr 23, 2021 · 6 comments · May be fixed by #2889
Open

Lock file handling strategy is not best practice? #1653

james-skinner-deltatre opened this issue Apr 23, 2021 · 6 comments · May be fixed by #2889
Labels
enhancement New feature or request

Comments

@james-skinner-deltatre
Copy link

I am looking at adopting terragrunt into our workflow. Everything looks good to me so far except the advice and behaviour around handling .terraform.lock.hcl files does not seem right.

I have read over the related issue #1498 and I think this comment hits the nail on the head.

The lock file exists so that I can pull down dependencies for a specific version of my codebase in a repeatable way. So that means when I have my_module@v1.0.0 and run terraform init I get the same provider versions, whether I am running tests, deploying to dev or stage or prod etc.

This means the lock file belongs with the module code - i.e. versioned alongside the provider constraints and the rest of the code.

The terragrunt docs on the other hand, imply the lock file should live with the environment configuration. If you do this then:

  • yes, each time you apply the same release version to the same environment, you run with the same provider versions as last time. I can see a benefit there.
  • However, when you update or add a new provider into the module, as you role that new module version out, each environment you deploy to could get a different version of that provider. This is exactly what the lock file is there to help avoid.

From what I can see, in the scenario where the lock file lives in the module code, terragrunt does respect the .terraform.lock.hcl file rather than overwriting it, which is great. However I think the following changes would be positive:

  1. The docs should advise to keep the lock file with the module code as best practice and only use the functionality to copy the lock file between cache and working directory when that is not an option.
  2. If a referenced module already contains a .terraform.lock.hcl file alongside the code, the copying functionality should be disabled

...or maybe I am missing something here?

@brikis98
Copy link
Member

brikis98 commented May 3, 2021

Thanks for raising this discussion!

I suspect that different users will want to handle lock files differently:

  1. Some will want it to be handled with the module, so that everywhere the module is used, you get the exact same provider versions. This has the advantage of using the same provider versions you tested the module with everywhere you deploy the module.
  2. Others will want it to be handled with the live environments, so that each environment is pinned to specific versions. This has the advantage of allowing different environments/teams/use cases to use different provider versions.

I suspect (1) is the best practice for most, so I'd be in favor of this update:

  1. The docs should advise to keep the lock file with the module code as best practice and only use the functionality to copy the lock file between cache and working directory when that is not an option.

On the other hand, this change might interfere with (2):

  1. If a referenced module already contains a .terraform.lock.hcl file alongside the code, the copying functionality should be disabled

Perhaps this is one place where we add a flag that skips the lock file copying functionality. Or, vice versa, enables the copying functionality, with the default being off.

@brikis98 brikis98 added enhancement New feature or request help wanted labels May 3, 2021
@james-skinner-deltatre
Copy link
Author

Perhaps this is one place where we add a flag that skips the lock file copying functionality. Or, vice versa, enables the copying functionality, with the default being off.

That sounds like the right solution. My concern would be what the default is - at the moment I suspect somebody who isn't giving much thought to this will read the docs (or not) and end up with scenario 2 when they almost certainly would be better off with scenario 1.

To be honest I don't see much of a use case for scenario 2 but maybe I can't see past my own experience here. Where I have used lock files in the past, it is analogous with scenario 1. For example:

  • A node.js application, which you write and deploy to various environments, will have a lock file checked into version control
  • A module (as in npm module) for that application will specifically not have a lock file bundled with it.

Drawing parallels with this in the context of terraform:

  • a terraform codebase referenced by terragrunt config (lets say it provisions a microservice, a database, some other related resources) is akin to the node.js application. I would call this a "component".
  • a terraform module used by this component and potentially others (e.g. provisioning just the database) is akin to the npm module.

...but maybe not every use case is as simplistic as this. Or for some others the terragrunt config functions as the "component"

@brikis98
Copy link
Member

brikis98 commented May 4, 2021

I think the key use case you're missing is that people often use Terragrunt to generate code. This includes generating provider blocks, backend blocks, and sometimes resources, data sources, and module blocks. In these scenario, a checked-in lock file doesn't help, as it doesn't have all the info. So there, you'd have to keep the lock file with the terragrunt.hcl.

It's worth mentioning that in most programming environments, including Node.js, there are actually multiple lock files. E.g., The open source project express.js has its own lock file; and my project, which depends on express.js has its own lock file too. I don't know if Terraform properly handles this multi-layer lock file approach...

@james-skinner-deltatre
Copy link
Author

I think the key use case you're missing is that people often use Terragrunt to generate code

yep, you're right. I guess a rule of thumb is the lock file lives with the provider version constraints definition - i.e. the required_providers block

Not sure its worth debating here but not sure you are right about the multi-layer lock files, at least in the npm case. In fact it is not possible to publish the standard lock file in an npm module. There is a different lock file which you can publish but this is only for special cases.

Both modules and applications will have package.json files but this only covering version contraints, so equivalent to the required_providers block

@mrparkers
Copy link

I agree with @james-skinner-deltatre here, it really seems appropriate to version terraform lockfiles within the underlying modules instead of within the live repository that contains the terragrunt.hcl file. It seems odd that both the module and the "live" environments need to be concerned with terraform provider versions:

  • modules need to specify terraform.required_providers
  • live environment needs to specify a .terraform.lock.hcl alongside the terragrunt.hcl configuration.

It would be much easier if the live environment only had to be concerned with the version of the terraform module that's being ran, that's it. Then, the module could handle all of the required terraform providers and their exact versions.

I like @brikis98's suggestion of adding a flag to skip the lockfile copying functionality, or perhaps adding it as a configuration attribute within the terragrunt.hcl, similar to the iam_role attribute.

I'd be happy to open a PR with this functionality if it would be welcomed.

@lqc
Copy link

lqc commented Feb 18, 2024

Drawing parallels with this in the context of terraform:

  • a terraform codebase referenced by terragrunt config (lets say it provisions a microservice, a database, some other related resources) is akin to the node.js application. I would call this a "component".
  • a terraform module used by this component and potentially others (e.g. provisioning just the database) is akin to the npm module.

...but maybe not every use case is as simplistic as this. Or for some others the terragrunt config functions as the "component"

This analogy breaks when you consider what @brikis98 said - the referenced module that provisions a microservice is not akin to a Node.js App, because in most cases you can't even apply it without all the config generated by Terragrunt.

It would be great if Terragrunt had better support for lock files, rather than more options to disable them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants