Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

configs: Refined error messages for mismatched provider passing #30639

Merged
merged 1 commit into from Mar 10, 2022

Conversation

apparentlymart
Copy link
Member

@apparentlymart apparentlymart commented Mar 10, 2022

This set of diagnostic messages is under a number of unusual constraints that make them tough to get right:

  • They are discussing a few finicky concepts which authors are likely to be encountering for the first time in these error messages: the idea of "local names" for providers, the relationship between those and provider source addresses, and additional ("aliased") provider configurations.
  • They are reporting concerns that span across a module call boundary, and so need to take care to be clear about whether they are talking about a problem in the caller or a problem in the callee.
  • Some of them are effectively deprecation warnings for features that might be in use by a third-party module that the user doesn't control, in which case they have no recourse to address them aside from opening a feature request with the upstream module maintainer.
  • Terraform has, for backward-compatibility reasons, a lot of implied default behaviors regarding providers and provider configurations, and these errors can arise in situations where Terraform's assumptions don't match the author's intent, and so we need to be careful to explain what Terraform assumed in order to make the messages understandable.

After seeing some confusion with these messages in the community, and being somewhat confused by some of them today myself, I decided to try to edit them a bit for consistency of terminology (both between the messages and with terminology in our docs), being explicit about caller vs. callee by naming them in the messages, and making explicit what would otherwise be implicit with regard to the correspondences between provider source addresses and local names.

My assumed audience for all of these messages is the author of the caller module, because it's the caller who is responsible for creating the relationship between caller and callee. As much as possible I tried to make the messages include specific actions for that author to take to quiet the warning or fix the error, but some of the warnings are only fixable by the callee's maintainer and so those messages are, in effect, a suggestion to send a request to the author to stop using a deprecated feature.

I think these new messages are also not ideal by any means, because it's just tough to pack so much information into concise messages while being clear and consistent, but I hope at least this will give users seeing these messages enough context to infer what's going on, possibly with the help of our documentation.

I intentionally didn't change which cases Terraform will return warnings or errors -- only the message texts -- although I did highlight in a comment in one of the tests that what it is a asserting seems a bit suspicious to me. I don't intend to address that here; instead, I intend that note to be something to refer to if we later see a bug report that calls that behavior into question.

This does actually silence some unrelated warnings and errors in cases where a provider block has an invalid provider local name as its label, because our other functions for dealing with provider addresses are written to panic if given invalid addresses under the assumption that earlier code will have guarded against that. Doing this allowed for the provider configuration validation logic to safely include more information about the configuration as helpful context, without risking tripping over known-invalid configuration and panicking in the process. Effectively, an invalid label on a provider block will mask any reports of problems with the contents of that provider block as of this change.


Although most of these are just rewordings of existing messages, possibly with some additional substitution values to give more information, I did end up making a more involved change to the message about the deprecation of proxy provider blocks, in the hope of making it a little more directly-actionable by collecting together all of the proxies for a particular provider and reporting them all as a single message containing an example:

╷
│ Warning: Redundant empty provider block
│ 
│   on child/providers-pass-explicit-child.tf line 13:
│   13: provider "google-beta" {
│ 
│ Earlier versions of Terraform used empty provider blocks ("proxy provider
│ configurations") for child modules to declare their need to be passed a
│ provider configuration by their callers. That approach was ambiguous and
│ is now deprecated.
│ 
│ If you control this module, you can migrate to the new declaration syntax
│ by removing all of the empty provider "google-beta" blocks and then
│ adding or updating an entry like the following to the required_providers
│ block of module.child:
│     google-beta = {
│       source = "hashicorp/google-beta"
│       configuration_aliases = [
│         google-beta.baz,
│         google-beta.bleep,
│         google-beta.boop,
│       ]
│     }
╵

I like this new approach because I think it helps clarify that the various separate provider "google-beta" blocks should all be merged into a single entry in required_providers, whereas before we'd return a separate message for each one that were all similar to one another and tried to explain all of the individual parts of a required_providers entry that might be involved, without showing the big picture of what it all looks like assembled.

However, one downside of this approach is that this single diagnostic can have only one source location per distinct provider, and so it arbitrarily chooses the block whose location is lexically least. I think this is a valid compromise because it seems likely that all of the configurations for a particular provider will be colocated in most reasonable Terraform configurations, and in practice it's rare for there to be more than two or three and so in the worst case the user can remove the first one to get an error about the second, and continue in that way until they are all gone.

@@ -5,3 +5,10 @@ provider "test" {
module "mod" {
source = "./mod"
}

# FIXME: This test case seems a little suspicious, because this module is
Copy link
Member

@jbardin jbardin Mar 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is representative of a case that never worked correctly. Some older versions would use the wrong provider non-deterministically, and later they would return a "provider not found" error. As part of the overall provider work we decided that in order to reference a provider by name (rather than implicitly), that provider must be declared within the configuration. This also allows for static validation tooling to know what provider configuration types are allowed, in the same way we can determine what input variable types are allowed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trouble I was having with this particular requirement is that a seemingly-internal-only change to the child module can cause a warning to appear for an existing caller who was previously validly relying on the normal inheritance behavior, which makes this requirement hard to explain and justify in the warning message text: the audience of the warning is the author of the caller, but it was the author of the child module who caused the problem, probably unknowingly because this requirement is subtle and we don't have any way to proactively warn the child module author about it before publication.

I understand that it's a compatibility constraint and so I'm not proposing to make any changes to this behavior for the foreseeable future (perhaps in a future language edition we will require explicit provider requirements/ passing in all cases to avoid this) but I'd like to change this comment to describe the constraint you are describing for the benefit of future maintainers who might see a bug report for this inconsistency and want to determine why Terraform behaves this way.

I'm going to update this comment to capture something like what you have explained here, but otherwise leave this unchanged.

This set of diagnostic messages is under a number of unusual constraints
that make them tough to get right:
 - They are discussing a couple finicky concepts which authors are
   likely to be encountering for the first time in these error messages:
   the idea of "local names" for providers, the relationship between those
   and provider source addresses, and additional ("aliased") provider
   configurations.
 - They are reporting concerns that span across a module call boundary,
   and so need to take care to be clear about whether they are talking
   about a problem in the caller or a problem in the callee.
 - Some of them are effectively deprecation warnings for features that
   might be in use by a third-party module that the user doesn't control,
   in which case they have no recourse to address them aside from opening
   a feature request with the upstream module maintainer.
 - Terraform has, for backward-compatibility reasons, a lot of implied
   default behaviors regarding providers and provider configurations,
   and these errors can arise in situations where Terraform's assumptions
   don't match the author's intent, and so we need to be careful to
   explain what Terraform assumed in order to make the messages
   understandable.

After seeing some confusion with these messages in the community, and
being somewhat confused by some of them myself, I decided to try to edit
them a bit for consistency of terminology (both between the messages and
with terminology in our docs), being explicit about caller vs. callee
by naming them in the messages, and making explicit what would otherwise
be implicit with regard to the correspondences between provider source
addresses and local names.

My assumed audience for all of these messages is the author of the caller
module, because it's the caller who is responsible for creating the
relationship between caller and callee. As much as possible I tried to
make the messages include specific actions for that author to take to
quiet the warning or fix the error, but some of the warnings are only
fixable by the callee's maintainer and so those messages are, in effect,
a suggestion to send a request to the author to stop using a deprecated
feature.

I think these new messages are also not ideal by any means, because it's
just tough to pack so much information into concise messages while being
clear and consistent, but I hope at least this will give users seeing
these messages enough context to infer what's going on, possibly with the
help of our documentation.

I intentionally didn't change which cases Terraform will return warnings
or errors -- only the message texts -- although I did highlight in a
comment in one of the tests that what it is a asserting seems a bit
suspicious to me. I don't intend to address that here; instead, I intend
that note to be something to refer to if we later see a bug report that
calls that behavior into question.

This does actually silence some _unrelated_ warnings and errors in cases
where a provider block has an invalid provider local name as its label,
because our other functions for dealing with provider addresses are
written to panic if given invalid addresses under the assumption that
earlier code will have guarded against that. Doing this allowed for the
provider configuration validation logic to safely include more information
about the configuration as helpful context, without risking tripping over
known-invalid configuration and panicking in the process.
@apparentlymart apparentlymart merged commit 1879a39 into main Mar 10, 2022
@apparentlymart apparentlymart deleted the f-provider-passing-consistency branch March 10, 2022 18:06
@github-actions
Copy link

Reminder for the merging maintainer: if this is a user-visible change, please update the changelog on the appropriate release branch.

@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 10, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants