Configurable installation strategy for external modules, similar to what we have for providers #31134
Comments
Thanks for the report. Although it isn't documented (cc @laurapacilio), you can put a host block like this in the CLI configuration:

    host "registry.terraform.io" {
      services = {
        "providers.v1" = "https://host.example.com/v1/providers/",
      }
    }

To confirm, is this what you meant by "the registry URL is configurable" - did you try this and see it not working for submodules?
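To make the effect of that override concrete, here is a minimal Python sketch of how a client might resolve a service URL from a discovery document. It is not Terraform's implementation: `resolve_service`, `UPSTREAM_DISCOVERY` and `OVERRIDE_SERVICES` are hypothetical names, and the upstream document is assumed to match what the public registry serves at `/.well-known/terraform.json`.

```python
import json
from urllib.parse import urljoin

# Assumed contents of the public registry's discovery document, limited to
# the two services discussed in this thread.
UPSTREAM_DISCOVERY = '{"modules.v1": "/v1/modules/", "providers.v1": "/v1/providers/"}'

# Services declared by the host block above. Any service that is not listed
# is treated as absent for that hostname.
OVERRIDE_SERVICES = {"providers.v1": "https://host.example.com/v1/providers/"}


def resolve_service(hostname, services, service_id):
    """Return the absolute base URL for service_id, or None if undeclared."""
    target = services.get(service_id)
    if target is None:
        return None
    # Relative targets are resolved against the discovery document's URL;
    # absolute targets are returned unchanged by urljoin.
    discovery_url = f"https://{hostname}/.well-known/terraform.json"
    return urljoin(discovery_url, target)


upstream = json.loads(UPSTREAM_DISCOVERY)
print(resolve_service("registry.terraform.io", upstream, "modules.v1"))
print(resolve_service("registry.terraform.io", OVERRIDE_SERVICES, "modules.v1"))
```

Under these assumptions the second lookup yields `None`: an override that declares only `providers.v1` leaves the host with no module registry at all.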
FWIW that Entirely overriding a particular host with your own services could work, but you'd need to keep referring to the upstream discovery document to see if the host has started supporting any new protocols or protocol versions in case you want to update your local copy; it might make later versions of Terraform fail in strange ways if they are relying on new host services not present today. The example above will already prevent installation of modules from the public registry, for example, because it would make Terraform think that there is no module registry on that host. I think perhaps a more appropriate solution for this particular use-case would be to tell Terraform to install providers using the See Provider Network Mirror Protocol for more information. I'll leave this open so we can debate whether we want to prominently document the |
I just want to make it clear that since the |
Yes, that is what I meant by "configurable", but as pointed out, it is not documented.
I'm not opposed to adding it to the documentation if:
We would need to include a warning though explaining any potential side effects of use that folks could run into. @apparentlymart and @kmoe If you think this is the best way forward, let me know and I can help open a PR to add this to our docs. Thanks @jamengual! |
I know I will sound like a broken record and I'm sorry but documenting this does not fulfill the use-case. the |
Sorry for not reading clearly and assuming you were talking about providers. As some context for others reading, there was a parallel discussion about this with @jamengual in the HangOps Slack which had started with a question about I'm going to try to elaborate here on some of the comments I made in the HangOps thread, both to try to make the points more clearly and also to record them here for posterity since this will be a location easier to find than a random thread in a Slack workspace. I think it's important here to notice that the string
For providers in particular, we designed a number of mechanisms for customizing the installation strategy to use different installation methods, including installation from a local directory in the filesystem or installation from a separate network service that Terraform treats as a "mirror" of providers from an origin registry. That then allows separating the identity use-case of the hostnames from the installation source use-case.

Terraform assumes, but cannot completely enforce, that anyone using these strategies will ensure that the alternative installation methods return packages identical to those the origin registry would have returned for the same address. (There is some enforcement of this if you let Terraform install from the origin registry at first and let it record checksums in the dependency lock file, but if you exclusively use a mirror at all times then Terraform will essentially treat that mirror as authoritative.)

Unfortunately, due to some long-standing technical debt there is no corresponding mechanism for customizing the installation methods for modules. We did originally intend to support similar mechanisms -- the dependency lock file, and custom installation methods/strategies -- for modules too. However, whereas Terraform's model for provider sources is a very strictly-specified address syntax with explicit meaning, Terraform's model for module sources is: write some sort of string into this argument and we'll use a bunch of heuristics to guess what you meant and always try to install something. Modules therefore don't have a reliable canonical identity for us to use in dependency lock files or in custom installation methods specified in the CLI configuration.

Simply allowing Terraform's module address parser to assume a different default hostname when one isn't specified is not a sufficient solution to this impasse, because:
I suggest that we treat this issue as representing the well-known use-case of Terraform not supporting customizable installation methods for modules as we do for providers. I have a feeling we do already have some issue open for this somewhere, but I wasn't readily able to find it right now. Perhaps we'll find it later and can close this one as a duplicate once we do. I agree with @jamengual that overriding the service discovery for As with the provider address design, my proposed initial technical design requirements (subject to negotiation, of course) would be:
It will take some research and design work to get there, and we will probably need to allow ourselves some exceptions/oddities for the various bizarre non-registry source syntaxes Terraform has allowed since very early versions, but I believe it is solvable and that we should plan to solve it. |
Hello! First thank you, Martin (as always) for your very thorough and thoughtful explanation. Based on everything I'm seeing here, it does not seem like a docs quick-fix is the right way to go. I'm going to leave this issue open (of course!) so folks can find it and we can have a record of this conversation and the proposed work. But I'm going to remove the documentation label, as I think we've seen that this issue goes far beyond just being a documentation gap. Thank you all for the discussion! Please let me know if anyone disagrees. Thank you! |
I created a new issue because I didn't think it fit this one directly, but one of the requirements I have is simpler, the environment I am using terraform in does not have access to the internet and can not download content from online sources, so I don't want to replace the URL with another one where it has to hit a web server of some sort, I would like to point at a folder on disk. I want to be able to use modules that are developed by the community and have an easy way to mirror them + sub-modules they refer to, and have them in place on disk much like you can use The only solution I've got so far is to pull them, rewrite any |
I'm trying to find a solution to this exact same issue for more or less the same use case as @archoversight. I've spent the last couple of days going through the source code and built a custom version that lets me override the I think this issue is part of a broader one that is we would like to see better support in Terraform for air-gapped environments. |
I wrote a long comment above with various different concerns in it but I just want to reiterate the main tension in designing this: In situations where the module author and the Terraform operator (the person running However, the design here must also accommodate the situation where those two are different. For example, we need to consider what happens for a publicly-shared module that refers to a hostname-free address with the assumption that (as documented) it's a shorthand for A successful design to address this issue must, I think, allow both the module author to unambiguously express what they intend their module to depend on, and allow the operator to configure how to fetch those dependencies. If a module author writes However, an operator should be able to tell Terraform that they've mirrored
The provider installation method settings in CLI configuration offer a clear pattern for us to follow here if the mechanism is focused only on registry-based source addresses. The CLI configuration could include a block like this:

    module_installation {
      network_mirror {
        url = "https://example.com/terraform-modules/"
      }
    }

...which would then use that mirror for all registry modules, regardless of hostname. Or, to specify it more finely, it could instead specify:

    module_installation {
      network_mirror {
        url     = "https://example.com/terraform-modules/"
        include = ["registry.terraform.io/*/*/*"]
      }
      direct {
        exclude = ["registry.terraform.io/*/*/*"]
      }
    }

This would solve the problem for all module registries, rather than just However, there are two significant missing pieces here that also need to be solved:
If we can convince ourselves that it's acceptable to limit both dependency lock file tracking and custom installation methods only to registry-shaped module addresses then I think we'd have a pretty clear path forward here. I've not yet done any research to see if that compromise is plausible. I'd be interested in feedback either way from those who are interested in this issue. |
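As a rough illustration of how the include/exclude patterns in the proposed `module_installation` block might select an installation method, here is a minimal Python sketch. The block does not exist in Terraform today, so everything here is hypothetical: the `METHODS` structure, the `matches` and `select_methods` helpers, and the assumption that exclude takes priority over include, as it is documented to do for `provider_installation`.

```python
# Hypothetical in-memory form of the second module_installation example above.
METHODS = [
    {
        "type": "network_mirror",
        "url": "https://example.com/terraform-modules/",
        "include": ["registry.terraform.io/*/*/*"],
    },
    {"type": "direct", "exclude": ["registry.terraform.io/*/*/*"]},
]


def matches(pattern, address):
    """Match segment by segment; '*' stands for exactly one path segment."""
    p_parts, a_parts = pattern.split("/"), address.split("/")
    return len(p_parts) == len(a_parts) and all(
        p == "*" or p == a for p, a in zip(p_parts, a_parts)
    )


def select_methods(address, methods):
    """Return the types of all methods allowed to install this address."""
    selected = []
    for method in methods:
        include = method.get("include")  # None means "include everything"
        exclude = method.get("exclude", [])
        if include is not None and not any(matches(p, address) for p in include):
            continue
        if any(matches(p, address) for p in exclude):  # exclude wins (assumed)
            continue
        selected.append(method["type"])
    return selected


print(select_methods("registry.terraform.io/cloudposse/alb/aws", METHODS))
print(select_methods("pepe.myrepo.com/cloudposse/alb/aws", METHODS))
```

The sketch only works because every address is fully qualified with a hostname, which is the "registry-shaped addresses only" compromise described above.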
I'm totally okay with being forced to use a registry-style address to be able to support this as long as the short version points to the long version address which is configurable (via one of your samples above). |
Coming over from #29362 (comment), it sounds like this proposal will not address rep'ping the source with a local path override. Are there any creative alternatives that come out of the current thinking? File URLs? The use-case I have is that published example modules, or modules included in provider examples/ directory, have to include user directions to change the source address depending on how the module is being consumed (clone, registry, e2e testing). |
this issue is more related to the fact that you can't have modules dependencies being pulled from a hosted registry, if we think of it from that point of view the fact that you have an If you think of it from the software development point of view, the test of an app usually lives on the same repo as the app and sometimes integration tests will live in another repo and the pipeline will trigger those steps independently and continue with the SDLC of the app. If you think about your integration test module in the |
I appreciate your take, @jamengual. I can see benefits to publishing the modules that are consumed by provider E2E tests. (terraform-provider-foo/examples run by terragrunt). For E2E tests of standalone modules I suppose the recommendation would be to track separate examples/ and tests/, where the module when referenced in examples/ (terraform-foo-moda/examples/moda-ex1/) would depend on the registry and tests/ use local source paths. |
I showed I don't see any reason why we couldn't also support There is a separate question of what we might call "development overrides", which we support for providers today in a special way that just tells commands like A nice thing about only supporting module-registry-style addresses is that all of these design ideas for providers can in theory be copied over relatively unchanged, aside from the simplification that modules are treated by Terraform as platform-agnostic and so we don't have to worry about multiple "builds" of the same module as we do for providers. However, I'd prefer to focus only on the "mirroring" use-case for this issue, and then we can think about a story for "development overrides" separately later, which could just copy what we did for providers or we could use that opportunity to design something a little more holistic, like Rust's Cargo Workspaces or |
+1 Would this restriction affect sourcing upstream modules from a local "network mirror" that embed within them relative pathing to source nested submodules? e.g. https://github.com/apparentlymart/terraform-aws-tf-registry/blob/v0.0.1/store.tf#L2 |
One detail that makes this a little tricky is the existing distinction between "module packages" and "module sources", which is something that is largely hidden in the details today but would probably end up more exposed if we implemented support for mirrors. The easiest way to see the difference between a module source and a module package is to consider a source address like Unfortunately this package vs. source distinction has an extra wrinkle for module registry addresses. A module registry is really just an extra indirection over physical source package addresses: if I ask the public Terraform Registry about The result of the registry protocol is another source address, and so although the above example doesn't do this it's valid in principle for a module registry to indicate that the underlying source is With all of that in mind, part of what we'll need to design here is what exactly a network mirror is returning. If we design the network mirror protocol by the same principles as the main registry protocol then the mirror will really just be an index of physical source addresses, in which case Terraform can treat them just the same way as the ones returned by the registry itself. I expect that's the most likely design for network mirrors. We will also need to design the structure of a filesystem mirror, which makes things a little more tricky because I expect most would want a filesystem mirror to contain literally the source code of the module, rather than just a source address for Terraform to retrieve from elsewhere. For any module registry that would return a sub-path of a package as the location of a module, we'd need some way for the filesystem mirror to contain that same metadata. I expect it's doable, but still requires some consideration. 
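The package vs. source distinction can be sketched in Python using the documented `//` separator between a package address and a sub-path inside it. This is a simplified illustration, not Terraform's parser: `split_package_and_subdir` is a hypothetical helper, the example addresses are made up, and query strings such as `?ref=` are deliberately not handled.

```python
def split_package_and_subdir(source):
    """Split a module source into (package address, sub-path inside package).

    The first '//' after any URL scheme separator marks the boundary. A
    source without that marker refers to the root of the package.
    """
    scheme_end = source.find("://")
    search_from = scheme_end + 3 if scheme_end != -1 else 0
    boundary = source.find("//", search_from)
    if boundary == -1:
        return source, ""
    return source[:boundary], source[boundary + 2:]


# Hypothetical addresses, for illustration only.
print(split_package_and_subdir("git::https://example.com/network.git//modules/vpc"))
print(split_package_and_subdir("git::https://example.com/network.git"))
```

A filesystem mirror that stores literal module source code would need to record the second element of that pair somewhere, since the package is what gets fetched but the sub-path is what the caller actually asked for.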
A filesystem mirror for a registry module might require a small amount of additional metadata that isn't needed for a provider mirror where we can assume that "provider package" is an indivisible unit always referred to as a whole. My point in mentioning all of this is that this source vs. package deal is also how Terraform deals with relative sources like |
So, there's a lot of discussion on this, but I'm just curious if any progress has been made here? Our organization would also benefit greatly from being able to manage modules and module mirrors more like providers. |
Current Terraform Version
All
Use-cases
Make registry.terraform.io a configurable parameter instead of a constant, so that modules and submodules can be served from an internally hosted registry.
When using a module like so:
the source URL basically translates to:
If the constant at line 24 of provider.go (see References) was configurable, it would be possible to serve the .well-known/terraform.json with the URL of the module registry and an index pointing to an internal repo.
Right now the registry URL is configurable, BUT the problem appears when modules in the registry use the short notation, i.e. source = "cloudposse/alb/aws". If a root module calls other submodules using the short notation, the root module will be pulled from the internally configured registry URL by doing something like source = "pepe.myrepo.com/cloudposse/alb/aws", but the submodule will still use the short notation pointing to the public registry, and then the internally hosted index will not be used.
This is a very well used pattern in many languages, where the repository for package dependencies can be configured and pointed to a hosted version on products like JFrog Artifactory, Nexus IQ, S3 and so on.
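The short-notation behaviour described above can be sketched in Python as follows. This is a simplified illustration, not Terraform's actual address parser: `expand_module_address` is a hypothetical helper, it handles only registry-style addresses, and it ignores sub-paths and the heuristics Terraform applies to other source types (for example, a three-part address starting with `github.com` is treated differently by Terraform).

```python
DEFAULT_REGISTRY_HOST = "registry.terraform.io"


def expand_module_address(source, default_host=DEFAULT_REGISTRY_HOST):
    """Expand namespace/name/system to host/namespace/name/system."""
    parts = source.split("/")
    if len(parts) == 3:
        # Short notation: the hostname is implied, not written down.
        return "/".join([default_host, *parts])
    if len(parts) == 4:
        # Already fully qualified with an explicit hostname.
        return source
    raise ValueError(f"not a registry-style module address: {source!r}")


# The root module is addressed explicitly through the internal registry...
print(expand_module_address("pepe.myrepo.com/cloudposse/alb/aws"))
# ...but a submodule written in short notation still resolves to the
# public registry, which is the problem this issue describes.
print(expand_module_address("cloudposse/alb/aws"))
```

Making `default_host` configurable would change the second result, but it would also change what every third-party module's short addresses mean, which is the core of the design tension in this thread.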
Attempted Solutions
It is not possible to configure this at the moment; the only way to do it is to hack SSL CAs and hosts tables, which is definitely not a good solution.
Proposal
Make the default registry URL https://registry.terraform.io configurable via the CLI configuration file (.terraformrc or terraform.rc) or an environment variable.
References
https://github.com/hashicorp/terraform/blob/main/internal/addrs/provider.go#L24
https://github.com/apparentlymart/terraform-aws-tf-registry