Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Module versions #16466

Merged
merged 13 commits into from
Oct 27, 2017
Merged

Module versions #16466

merged 13 commits into from
Oct 27, 2017

Conversation

jbardin
Copy link
Member

@jbardin jbardin commented Oct 26, 2017

This PR contains a bunch of necessary refactoring to get the module versioning enabled. There were a number of latent bugs in module handling, that really only arise if you change the source of intermediate level modules. This would have been a very rare event in with module previously, but now that versioning can cause module to change without changing their name, path, or source, we needed to overhaul the storage layer. The refactoring is not complete here, but this get's us into a more manageable state to move forward.

The primary changes are:

  • Metadata required for tracking the provenance of modules is added to the module.Tree. Every layer needs to have it's version and source recorded, so that submodules are always updated accordingly.
  • regsrc and response packages are copied from the registry, and updated to work with the svchost package. The difference can be reconciled later, and re-imported to the registry.
  • The handling of registry modules doesn't really fit with the go-getter interfaces since versioning needs to be taken into account. The getter.Detector for the registry is removed, and the multi-step process of locating the registry module is handled before delegating it to go-getter. The go-getter package is still responsible for fetching the final location returned from the registry.

TODO: More registry test coverage, UI cleanup, storage refactoring...

Submodules were located by using their module path as the storage key.
Now that modules may have versions, a submodule needs to know how to
locate the corect source depending on the versions of its ancestors in
the tree.

Add a version field to each Tree, and a pointer back to the parent Tree
to step back through the ancestors. The new versionedPathKey method uses
this information to build a unique key for each module, dependent on the
ancestor versions.

Not only do stored modules need to know their version if it exists, but
any relative source needs to know all the ancestor versions in order to
resolve correctly.
Adds basic detector for registry module source strings. While this isn't
a thorough validation, this will eliminate anything that is definitely
not a registry module, and split out our host and module id strings.

lookupModuleVersions interrogates the registry for the available
versions of a particular module and the tree of dependencies.
This test highlights how changing an intermediate source path prevents
reloading of submodules. While this is somewhat of an edge case now, it
becomes quite common in the cacse where module versions are updated.
The reason versionedPathKey didn't solve the previous test case was a
missing source string for the current module.
wire up HTTP so we can test the mock discovery service

test lookupModuleVersions

Add a versions endpoint to the mock registry, and use that to verify the
lookupModuleVersions behavior.

lookupModuleVersions takes a Disco as the argument so a custom Transport
can be injected, and uses that transport for its own client if it set.

test looking up modules with default registry

Add registry.terrform.io to the hostname that the mock registry resolves
to localhost.

ACC test looking up module versions

Lookup a basic module for the available version in the default registry.
The detection of registry modules will have to happen in mutliple
phases. The go-getter interface requires that the detector return the
final URL, while we won't know that until we verify which version we
need. This leaves the regisry sources broken, to be re-integrated in a
following commit.
Registry modules can't be handled directly by the getter.Storage
implementation, which doesn't know how to handle versions. First see if
we have a matching module stored that satisfies our constraints. If
not, and we're getting or updating, we can look it up in the registry.

This essentially takes the place of a "registry detector" for go-getter,
but required the intermediate step of resolving the version dependency.

This also starts breaking up the huge Tree.Load method into more
manageable parts. It was sorely needed, as indicated by the difficulty
encountered in this refactor. There's still a lot that can be done to
improve this, but at least there are now a few easier to read methods
when we come back to it.
If a provider configuration is inherited from another module, any
interpolations in that config won't have variables declared locally. Let
the config only be validated in it's original location.
Copy link
Contributor

@apparentlymart apparentlymart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a handful of inline comments but none of them are merge-blockers... some I think it best that we figure out before we make a release, but if we do decide to change things then we can do it in follow-up changes since this one is already pretty big.

One thing that wasn't entirely clear to me here is why the module storage key needs to include the source and version... is that just so that terraform init can understand whether there's a newer version available when asked to upgrade?

If we're going to be changing the filesystem layout of .terraform/modules anyway, I'd hoped we'd be able to change it to cleartext module paths to address the old GitHub issue (which I couldn't quickly find) that pointed out that it's hard to figure out which dir corresponds to which module when a path is included in an error message, or when you need to dive in there for some debugging:

.terraform/modules/foo/main.tf
.terraform/modules/bar/main.tf
.terraform/modules/bar.baz/main.tf

...but I guess in order to do that we'd need to still retain the source and version metadata out-of-band so that terraform init can understand what's already installed?

},
{ // too many parts
source: "registry.com/namespace/id/provider/extra",
notRegistry: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make this style return an error? I can't think of what useful behavior this could have if we don't treat it as a registry URL, and so I think probably better to give the user the feedback that this isn't a valid source string. (Notwithstanding the special exceptions for github.com/etc handled elsewhere, of course.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, it actually does now, but notRegistry is checked first. I'll update the test.

Scheme: "https",
Host: module.RawHost.String(),
Path: defaultApiPath,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be better here to return an error like "host does not provide a module registry" since that way we can catch this early using the discovery doc rather than downstream when we try to hit missing endpoints on a non-registry host (or a registry host that requires a different version, later) and thus give a better error message.

Since there's only one existing registry implementation, we should be able to get it updated to support the discovery protocol before this code ships.

I want to require all Terraform-native-service implementers to support the discovery protocol so that we can get the version-detection benefits from it in the future. (We do, of course, need to retain the idea that registry.terraform.io is the default host if none is specified, but the default registry should still support discovery.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's reasonable, I put this in here so that it would work immediately.

}
}

service := regURL.String()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that the rest of this code uses this to build relative URLs as strings. Perhaps it'd be more robust to retain the url.URL object here and then use ResolveReference and path.Join to build the concrete paths, rather than URL-naive string concatenation... should be functionally equivalent as long as we're doing all the necessary input validation on the things we insert into the URL, but if we aren't then it should help us avoid generating weird broken paths that may cause confusing behavior or vulnerabilities.

}

// the download location is in the X-Terraform-Get header
location := resp.Header.Get(xTerraformGet)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of the protocol defined in the RFC was that X-Terraform-Get here is just one possible response and that the server is also free to return a redirect or to just return the archive directly. I think the expectation was that this "download" URL would be passed directly to go-getter and then it would do all of the things it usually does to try to sniff out the appropriate behavior.

That does throw a bit of a wrench in the works for auth though, since I don't think we can add the necessary Authorization header to go-getter's request with today's interface, unless we were to do something interesting with a custom getter.

Maybe this is okay for now, but I'm nervous that this is more constrained than the protocol intended and we may grow to regret this if we need to change our implementation in future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I must have missed that. I thought the header was the only method for locating the resource at the moment.

}
}

func TestACCLookupModuleVersions(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the meaning of the ACC here? Normally I understand that to mean "acceptance test" (albeit capitalized as Acc) but this seems to be using the mock server and lacks a check for TF_ACC so I don't think that's what it means in this case...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, the mockRegistry was just some copy&paste leftovers. This does contact the terraform registry to lookup a module.

s := newModuleStorage(storage)
s := newModuleStorage(storage, mode)

children, err := t.getChildren(s)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit confusing here that t.getChildren and t.children both exist but refer to objects at a different stage of the lifecycle. I'm not sure what's better, though... maybe calling this t.findChildren helps slightly, since it makes it sound less like just an accessor method for t.children. 🤷‍♂️

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about these sounding like accessors. I'll keep it in mind as I continue to clean up this file.

@jbardin
Copy link
Member Author

jbardin commented Oct 27, 2017

Thanks, I'll do a little more cleanup before merging here.

There's at least one test in there now that requires the source in the key, an the version is required for the same reason. For example, take module A with a module in a subdirectory called B. If a user were to update the source of A which also contained changes to its B directory, the version of B which is loaded would not change because the B's path and source would still be the same. The composition of many versioned modules makes this a much more common situation which I hit fairly quickly in my testing.

@jbardin jbardin merged commit d174b36 into 0.11-dev Oct 27, 2017
@jbardin jbardin deleted the jbardin/module-versions branch October 27, 2017 13:07
@tehmaspc
Copy link

Is there documentation around the version setting of a registry module?

This appears to work for me but I'd like to have some doc reference

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "1.3.0"
...
}

@ghost
Copy link

ghost commented Apr 6, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 6, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants