New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prototype: Which resources "contributed to" the proposed plan? #28944
Conversation
We've ended up implementing something approximately like this in a few places now, so this is a centralized version that we can consolidate on moving forward, gradually removing that duplication.
Previously the "providers" package contained only a type for representing the schema of a particular object within a provider, and the terraform package had the responsibility of aggregating many of those together to describe the entire surface area of a provider. Here we move what was previously terraform.ProviderSchema to instead be providers.Schemas, retaining its existing API otherwise, and leave behind a type alias to allow us to gradually update other references over time. We've gradually been shrinking down the responsibilities of the "terraform" package to just representing the graph components and behaviors anyway, but the specific motivation for doing this _now_ is to allow for other packages to both be called by the terraform package _and_ work with provider schemas at the same time, without creating a package dependency cycle: instead, these other packages can just import the "providers" package and not need to import the "terraform" package at all. For now this does still leave the responsibility for _building_ a providers.Schemas object over in the "terraform" package, because it's currently doing that as part of some larger work that isn't easily separable, and so reorganizing that would be a more involved and riskier change than just moving the existing type elsewhere.
Our existing functionality for dealing with references generally only has to concern itself with one level of references at a time, and only within one module, because we use it to draw a dependency graph which then ends up reflecting the broader context. However, there are some situations where it's handy to be able to ask questions about the indirect contributions to a particular expression in the configuration, particularly for additional hints in the user interface where we're just providing some extra context rather than changing behavior. This new "globalref" package therefore aims to be the home for algorithms for use-cases like this. It introduces its own special "Reference" type that wraps addrs.Reference to annotate it also with the usually-implied context about where the references would be evaluated. With that building block we can therefore ask questions whose answers might involve discussing references in multiple packages at once, such as "which resources directly or indirectly contribute to this expression?", including indirect hops through input variables or output values which would therefore change the evaluation context. The current implementations of this are around mapping references onto the static configuration expressions that they refer to, which is a pretty broad and conservative approach that unfortunately therefore loses accuracy when confronted with complex expressions that might take dynamic actions on the contents of an object. My hunch is that this'll be good enough to get some initial small use-cases solved, though there's plenty room for improvement in accuracy. It's somewhat ironic that this sort of "what is this value built from?" question is the use-case I had in mind when I designed the "marks" feature in cty, yet we've ended up putting it to an unexpected but still valid use in Terraform for sensitivity analysis and our currently handling of that isn't really tight enough to permit other concurrent uses of marks for other use-cases. I expect we can address that later and so maybe we'll try for a more accurate version of these analyses at a later date, but my hunch is that this'll be good enough for us to still get some good use out of it in the near future, particular related to helping understand where unknown values came from and in tailoring our refresh results in plan output to deemphasize detected changes that couldn't possibly have contributed to the proposed plan.
This was an attempt to try to report more context in an unknown for_each error by dragging unknown value origin information through the normal expression evaluation process using cty marks. However, it doesn't actually work because Terraform evaluates each expression separately in its own evaluation context, and so this trick of sneakily re-evaluating an expression with marks only when we hit an error can only ever find shallow references, which we can already find statically anyway. I'm just pushing this up to keep it for a possible second attempt later. I think a second attempt would need something more like the analyzer I built in #28944, but using marks and expression evaluation to deal with its analyses instead of the custom static technique I tried there.
This ultimately evolved into #30486, so we don't need this draft anymore. |
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions. |
This is a prototype of some internal analysis algorithms to allow us to ask more "global" questions about what refers to what within a configuration, considering also indirect references.
This is not a final, shippable thing: the underlying algorithms are not complete for all possible expression types, the unit tests are lacking, and it's probably quite slow (though I've not measured). However, it was an experiment in ways to deal with some problems whose solutions seem to involve asking "what contributed to this expression?":
for_each
expression is invalid because it's unknown, it would help to know which resource references elsewhere in the configuration contributed to that result, because they'll probably be a good clue as to why the value ended up being unknown.I used the second of the above as the vehicle to test this, and so also in this branch is a prototype of a possible adjustment to the "changed outside of Terraform" part of the plan UI where it'd only show a one-line summary if it detected a change to a resource that doesn't seem to have contributed to any of the proposed change actions, and thus in turn emphasize the ones that could have contributed.
Some examples of that (although this illustrated functionality isn't complete, decided, nor really the main point of this prototype)...
Detected a change that seems relevant:
Detected a change that doesn't seem relevant:
A mixture of relevant and not-so-relevant changes:
A more interesting case where the second resource's configuration is derived from the first, and so the first is "relevant" even though it doesn't have any changes planned directly itself:
A refresh-only plan always overrides all of this and just shows everything detected in full though, because detecting such changes is the whole point of a refresh-only plan.