-
Notifications
You must be signed in to change notification settings - Fork 9.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate duplicate provider local names in required_providers
#31218
Conversation
f6ddbc8
to
ce5758f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a nice improvement to me that should give some better feedback about a situation that was already invalid but checked and reported in a confusing way.
I do have a few questions about the overall impact of this change:
-
Were you able to verify that there weren't any situations before where someone might have been able to "get away with" declaring multiple provider local names in this way without hitting some sort of error?
I cannot imagine any case that should've worked, because declaring two providers with different local names and then configuring against them both means exactly the same thing as declaring two provider configurations with the same local name and alias. But given that this was unambiguous, I do wonder if there's a possibility that something might have crept through that could be arguably covered by the v1.0 compatibility promises. 😬
-
How does this interact with the various implied provider requirements Terraform infers for backward compatibility? For example:
terraform { required_providers { notaws = { source = "hashicorp/aws" } } } provider "notaws" { # ... } provider "aws" { # ... }
I think your
TestProviderConfigTransformer_duplicateLocalName
test is asserting that this is allowed, but that seems counter-intuitive to me since it seems like it's still effectively declaring thehashicorp/aws
provider under two local names, with one of them just being implied by our back-compat rules rather than written out explicitly by the user.Might we need an additional rule in
ProviderConfigTransformer
to treat the above as invalid, or does that already get taken care of by some logic I can't see clearly in this diff?
Hmm, actually that test example came from my first iteration where this was warning, but I found that because there were multiple points where providers were located only by their full name, you couldn't configure them without possibly running into errors. But now I see the problem here, it's technically possible to have a completely working configuration if you're not configuring the provider at all (or maybe not even using the provider!). There would be no reason to write such a config in the first place, but maybe a module could end up there through some refactoring? 🤔 I think I'm going to loosen this back up to a warning to be on the safe side. |
Adding multiple local names for the same provider type in required_providers was not prevented, which can lead to ambiguous behavior in Terraform. Providers are always indexed by the providers fully qualified name, so duplicate local names cannot be differentiated.
We can skip providers which already have a node in the graph for their type.
ce5758f
to
d7238d5
Compare
// was required via a different name. | ||
impliedLocalName := req.Type.Type | ||
// We have to search through the configs for a match, since the keys contains any aliases. | ||
for _, pc := range mod.ProviderConfigs { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's also the related situation of a provider requirement being implied by a resource
or data
block which doesn't set the provider
argument and whose resource type name starts with a prefix that doesn't match any explicitly-declared identifier. Should we catch that one too, or does it end up not mattering because the resource ends up belonging to the correct provider configuration anyway?
(Specifically: I don't recall if that backward-compatibility behavior causes us to implicitly declare a new provider configuration with the local name matching the prefix, or if it instead just hooks it up to whatever provider configuration it finds that has the assumed provider source address. If it's the latter then I guess it probably doesn't really matter, although it's still potentially confusing to have e.g. aws_instance
automatically associated with provider "notaws"
just because required_providers
declared notaws
as being hashicorp/aws
. 🤔)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it does work out that the implied provider based on the resource prefix will just get hooked up to the correct provider, even if it has a different local name, perhaps it's best to warn about that case too? I didn't delve into that here because it doesn't cause any known issues, and it's not as straightforward as dealing with the provider configs alone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it seems a little unclear what we should do here... I do like the consistency of treating a provider dependency implied by a resource the same way as a provider dependency implied by a provider configuration, since it seems like both of those things are a common source of confusion for folks anyway. But if it is particularly gnarly to do it then I would agree probably not worth it.
I have some memory of a function somewhere which takes something describing a provider and encapsulates the logic for deciding which provider configuration it ought to belong to, which if we could find it might allow all of these situations together as one rule (a provider = aws.foo
can also imply a dependency on hashicorp/aws
if there isn't already an aws
in scope), but if we'd end up having to duplicate all of the logic for detecting implicit provider requirements in here then I'd say that's probably too high a cost for just a little consistency. (Unfortunately I don't remember where this function is, assuming I'm not misremembering it existing in the first place, but maybe you have seen it on your travels while working on this... 🤔)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking this final case wasn't that bad, we've already collected the required_providers
local names, so we just need to iterate over the resources looking for implied providers which aren't in that set.
The newest commit adds a diagnostic like:
│ Warning: Duplicate required provider
...
│ Provider "null" was implicitly required via resource "null_resource.a", but listed in
│ required_providers as "nope". Either the local name in required_providers must match the resource
│ name, or the "nope" provider must be assigned within the resource block.
This diagnostic output only uses the local names, because this situation can only happen with providers from the default namespace with a default provider name inferred from the resource name.
23eb9bd
to
35f082a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry... more noodling here as I dug into the new logic you wrote. Each time I read new code in this PR it reminds me of things I'd forgotten about the provider resolution logic, and so I'm sorry I'm kinda just dribbling context slowly over the course of many round-trips. 😖
"Provider %q was implicitly required via resource %q, but listed in required_providers as %q. Either the local name in required_providers must match the resource name, or the %q provider must be assigned within the resource block.", | ||
localName, r.Addr(), prevLocalName, prevLocalName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subjectively, I think it'd help a little to make that first %q
be defAddr
instead of just localName
, even though we know that it'll always just be hashicorp/
followed by the localName
anyway, just to add a little nudge here for those who aren't yet aware of this implied dependency mechanism and might benefit from us linking it with what they can see in the terraform init
and terraform providers
output.
Probably we'll refine this message some more once we see what sorts of questions people ask about it, since it's hard to guess right now what a typical recipient of this message might know or not know, so otherwise I think this is a fine place to start!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually had that at first, but changed it as I cut the message down for brevity ;) THat's a fair point though, and will put it back.
// We're looking for resources with no specific provider reference | ||
if r.ProviderConfigRef != nil { | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't we also looking for resources that explicitly specify a provider local name that wasn't declared?
provider = aws.foo
I believe (though I haven't tested) that if we encounter the above without there being an aws
in required_providers
then we treat it as an implied requirement for hashicorp/aws
, just as we would if it were not set at all but the type name begins with "aws".
Resource.ProviderConfigAddr
seems to be the definition of which local name a particular resource is associated with, and so maybe we can use that here and then take the LocalName
field of the result as the local name to use below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that is validated already which is why I didn't check it here, but not until core tries to connect the providers in the graph.
This additional check was only intended to get the case of an implied local name not matching the required_providers
local name, in which which case ProviderConfigAddr
hides the fact that provider
wasn't set. I guess we could also add more validation to resource provider
arguments during config loading, but that was outside of the goal of this PR.
continue | ||
} | ||
|
||
defAddr := addrs.NewDefaultProvider(localName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could instead be a call to Module.ImpliedProviderForUnqualifiedType
, which I believe aims to encapsulate the logic for "what is the fallback to use if this doesn't match an explicit provider requirement?".
That does also consult the required providers for the module, and so calling it directly wouldn't allow distinguishing between a reference to an explicit declaration vs. a backward-compatibility fallback. So maybe then the better answer would be to skip over that layer and go directly to addrs.ImpliedProviderForUnqualifiedType
, which at least deals with the special case that terraform
means terraform.io/builtin/terraform
rather than hashicorp/terraform
. Or maybe we could change the signature of Module.ImpliedProviderForUnqualifiedType
to return an additional result that signals explicitly whether this was taken from an explicit required_providers
declaration or just a compatibility fallback, in the interests of keeping all of the logic using mostly the same codepaths. 🤔
Generally I'd encourage leaning on these existing methods as much as they make sense, because they are there to encapsulate all of the little oddities that arose in the process of developing the provider source scheme, and so if we skip over them and try to reimplement their logic out here it's likely that we'll miss some edge case that we've forgotten about in the meantime and make this not behave consistently with the logic elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we could add another return parameter to ImpliedProviderForUnqualifiedType
to show how it was resolved, but I purposely did avoid that here. The validation is only looking for undeclared implied provider names which are named something different in required_providers
. This means we have to skip the automatic resolution mechanisms like ImpliedProviderForUnqualifiedType
, since those are what would connect us to the provider declaration that we are trying to avoid.
I was kind of view this from the perspective that the methods we present like Module.ImpliedProviderForUnqualifiedType
rely on the underly structured data being valid, but here we are doing the actual validation of that data.
Though here I did mean to use addrs.ImpliedProviderForUnqualifiedType
, but ended grabbing the older NewDefaultProvider
function. That shouldn't make any difference in this case since you can't add the internal terraform provider into required_providers
, but yes, we should continue to use them more consistently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember us making any special effort to treat source = "terraform.io/builtin/terraform"
as invalid, so as far as I know you can include it there if you really want to give it a local name other than terraform
, though of course that would be a pretty bizarre thing to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yeah, you can, not sure why I thought we disallowed that 🤔
35f082a
to
69f0c7d
Compare
69f0c7d
to
57c0deb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me! Thanks for indulging me on my gradual trip through reloading the context about how all of this works in order to actually review it. 🙄
Reminder for the merging maintainer: if this is a user-visible change, please update the changelog on the appropriate release branch. |
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions. |
Adding multiple local names for the same provider type in
required_providers
was not prevented, which can lead to ambiguous behavior in Terraform. Providers are always located via the provider's fully qualified name (even using the full name as a map key in many places), so duplicate local names cannot be differentiated. This can lead terraform to non-deterministically configure the incorrect provider in some cases.Because it is currently possible that a configuration with providers which themselves do not need explicit configuration may have been working with duplicates, we cannot turn these into errors. Adding multiple entries for the same provider now results in a warning like so:
We can also check for the situation where a user required a provider by a name different from the default, but attempted to configure that provider via the default local name. This can help prevent the case where a provider configuration may apparently not always be taking effect within a module. The warning here is worded slightly differently:
fixes #31196