This repository has been archived by the owner on Jun 4, 2024. It is now read-only.

Adds RFD-0002 - Custom Terraform Registry #444

Merged
merged 29 commits into from
May 3, 2022

Conversation

tcsc
Contributor

@tcsc tcsc commented Mar 7, 2022

The task of supporting the creation of a custom Terraform registry has snowballed somewhat, and so (at @r0mant's suggestion) I've put together a small RFD to make sure that everyone is on the same page.

This is very definitely a first pass, and the discussion part of the RFD process is highly welcomed.

This contributes to #235.

@tcsc tcsc requested a review from fheinecke March 7, 2022 06:15
Contributor

@wadells wadells left a comment

Thanks for putting this together!

Overall, I'd lean towards "let's not run an extra service or code in next". Can we just generate the metadata at promotion time? However, I also recognize this path has been paved with tears for apt. It really comes down to: what does the Hashicorp TF plugin index file look like, and how do clients (terraform) consume it? If the file is a dumb index (e.g. no "latest", just an unordered list) and clients do the version selection, we're probably OK. If clients implicitly rely on e.g. the first item in a list or some sort of "latest" metadata, we may have more trouble.
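For context, the "list versions" response in Hashicorp's published Provider Registry Protocol does appear to be a dumb index: an unordered list of versions and platforms with no "latest" field, and the terraform client itself picks the newest version satisfying the constraint. A minimal sketch of that response shape (the version numbers and platforms here are hypothetical examples, not real releases):

```python
import json

# Sketch of the registry protocol's "list available versions" response.
# The field names follow the public Provider Registry Protocol docs;
# the concrete versions/platforms below are made up for illustration.
versions_index = {
    "versions": [
        {
            "version": "8.3.4",
            "protocols": ["5.0"],
            "platforms": [
                {"os": "linux", "arch": "amd64"},
                {"os": "darwin", "arch": "arm64"},
            ],
        },
        {
            "version": "9.0.0",
            "protocols": ["5.0"],
            "platforms": [{"os": "linux", "arch": "amd64"}],
        },
    ]
}

# The list is unordered: terraform selects the newest entry matching the
# caller's constraint (e.g. "~> 8.3"), so a static file should suffice.
print(json.dumps(versions_index, indent=2))
```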

@tcsc
Contributor Author

tcsc commented Mar 7, 2022

Can we just generate the metadata at promotion time?

Initially I thought the registry needed to be smart, but on reflection there is no actual reason why this should be so. It should be possible to generate an index and all of the version entries at promotion time, arrange them in an appropriate directory tree and serve them statically.

This is nice, as it means deployment reverts to being an `s3 sync` at promotion time.

Will work this idea up and see if it comes to anything.
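A batch-generation pass along those lines could be sketched as below. This is a minimal sketch assuming the protocol's path layout; the function name and the `https://example.test` URLs are placeholders, and the real download metadata would also carry the `shasum` and `signing_keys` fields, omitted here for brevity:

```python
import json
import tempfile
from pathlib import Path

def write_registry_tree(root, namespace, name, releases):
    """Write a static provider registry index under `root`.

    `releases` maps version -> list of (os, arch) pairs. The layout
    mirrors the paths terraform requests: <ns>/<name>/versions and
    <ns>/<name>/<version>/download/<os>/<arch>.
    """
    base = Path(root) / namespace / name
    base.mkdir(parents=True, exist_ok=True)
    index = {"versions": [
        {"version": ver,
         "protocols": ["5.0"],
         "platforms": [{"os": o, "arch": a} for o, a in plats]}
        for ver, plats in releases.items()]}
    (base / "versions").write_text(json.dumps(index))
    for ver, plats in releases.items():
        for o, a in plats:
            zipname = f"terraform-provider-{name}-v{ver}-{o}-{a}-bin.zip"
            meta = {
                "protocols": ["5.0"], "os": o, "arch": a,
                "filename": zipname,
                "download_url": f"https://example.test/store/{zipname}",
                "shasums_url": f"https://example.test/store/{zipname}.sums",
                "shasums_signature_url": f"https://example.test/store/{zipname}.sums.sig",
            }
            target = base / ver / "download" / o
            target.mkdir(parents=True, exist_ok=True)
            (target / a).write_text(json.dumps(meta))

root = tempfile.mkdtemp()
write_registry_tree(root, "gravitational", "teleport",
                    {"8.3.4": [("linux", "amd64")]})
print(sorted(p.relative_to(root).as_posix()
             for p in Path(root).rglob("*") if p.is_file()))
# -> ['gravitational/teleport/8.3.4/download/linux/amd64',
#     'gravitational/teleport/versions']
```

The whole tree can then be pushed to the bucket with `aws s3 sync` at promotion time.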

@tcsc tcsc requested a review from reedloden March 8, 2022 00:43
@benarent
Contributor

benarent commented Mar 8, 2022

As part of the registry can I request that we also provide beta builds? I automate a lot of my cluster creation, and it would be awesome to have the latest teleport terraform provider for setup.

@wadells
Contributor

wadells commented Mar 8, 2022

As part of the registry can I request that we also provide beta builds? I automate a lot of my cluster creation, and it would be awesome to have the latest teleport terraform provider for setup.

My 2c:

The general approach I'm pushing for promotion is:

  1. every merge to main or branches/* or tag event publishes artifacts to an internal bucket/registry that isn't open to the public. For employees, we can make sure you have the right IAM access to read from this (after Okta 2FA).
  2. during the promotion step, we copy bits from the internal bucket to the customer facing bucket, where no auth is needed

This isn't exactly intended for the "beta" users workflow. In my mind it is to allow us to do build validation without actually publishing, and generally "shift left" on build failures. That said, it would allow internal folks a not-that-friendly-but-better-than-building-it-yourself way to get the latest dev builds with `aws s3 cp ...`

If we wanted to make pre-releases (alpha/beta/rc) available to customers or in a friendly way, we should promote them. But we need to make sure a TF registry does sane things with pre-releases before we do that.

@tcsc
Contributor Author

tcsc commented Mar 9, 2022

Can we just generate the metadata at promotion time?

@wadells - I'm now 98% sure we can do this. I have a proof-of-concept where I can generate the metadata in a batch and serve it statically from an S3 bucket via CloudFront.

That last 2% uncertainty is only because I haven't got terraform to fully, 100% install the provider from the registry yet, but that looks like a signing key problem and not a file hosting problem:

Example terraform

terraform {
  required_providers {
    teleport = {
      source  = "terraform.clarke.mobi/gravitational/teleport"
      version = "~> 8.3"
    }
  }
}

Example terraform output:

> TF_LOG=DEBUG terraform init
 # ... skipped irrelevant debug logging... 
Initializing provider plugins...
- Finding terraform.clarke.mobi/gravitational/teleport versions matching "~> 8.3"...
2022-03-09T17:02:30.852+1100 [DEBUG] Service discovery for terraform.clarke.mobi at https://terraform.clarke.mobi/.well-known/terraform.json
2022-03-09T17:02:31.446+1100 [DEBUG] GET https://terraform.clarke.mobi/index/gravitational/teleport/versions
2022-03-09T17:02:32.708+1100 [DEBUG] GET https://terraform.clarke.mobi/index/gravitational/teleport/8.3.4/download/linux/amd64
2022-03-09T17:02:33.918+1100 [DEBUG] GET https://terraform.clarke.mobi/store/terraform-provider-teleport-v8.3.4-linux-amd64-bin.zip.sums
2022-03-09T17:02:34.773+1100 [DEBUG] GET https://terraform.clarke.mobi/store/terraform-provider-teleport-v8.3.4-linux-amd64-bin.zip.sums.sig
- Installing terraform.clarke.mobi/gravitational/teleport v8.3.4...
╷
│ Error: Failed to install provider
│ 
│ Error while installing terraform.clarke.mobi/gravitational/teleport v8.3.4: error decoding signing key: openpgp: invalid data: entity without any identities
╵
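For anyone following along, the first `DEBUG` line above is the protocol's service discovery step: the server only has to serve a tiny JSON document at `/.well-known/terraform.json` whose `providers.v1` key names the base path for provider lookups; `/index/` matches the subsequent GETs in the log. (The `openpgp: invalid data: entity without any identities` error further down looks unrelated to hosting; it likely means the GPG key published in the download metadata was exported without a user ID.) A sketch of the discovery document:

```python
import json

# Service discovery document served at /.well-known/terraform.json.
# "providers.v1" is the base path terraform prepends to the
# <namespace>/<name>/versions and .../download/<os>/<arch> requests.
discovery = {"providers.v1": "/index/"}

doc = json.dumps(discovery)
print(doc)  # -> {"providers.v1": "/index/"}
```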

@reedloden
Contributor

Are we sure that private terraform registries will work for our customers?

I see this on https://www.terraform.io/cloud-docs/registry/using

Note: Terraform Enterprise does not support private providers.

I would hate for us to do a bunch of work and then find out that our customers can't even use it.

@tcsc
Contributor Author

tcsc commented Mar 9, 2022

Note: Terraform Enterprise does not support private providers.

It's unfortunate that "private provider" is an overloaded term in the Terraform docs. It could mean:

  1. A provider that is distributed anywhere other than the public Hashicorp registry
  2. A provider that is distributed from a Terraform Cloud Private Registry
  3. A provider that is only available on a private network via a Custom Registry

I think, from reading the Private Registry Docs, that the note above refers to option (2), as a Terraform Cloud Private Registry requires client authentication that the Terraform Enterprise service will not have access to.

That said; this is probably cheap to confirm and expensive to get wrong, so it definitely needs checking out.

EDIT: I've uncovered a further complication - what does Terraform Enterprise mean in this context?

Before mid-2019, all distributions of Terraform Cloud used to be called Terraform Enterprise; the self-hosted distribution was called Private Terraform Enterprise (PTFE). These previous names sometimes still appear in supporting tools (like the tfe Terraform provider, which is also intended for use with Terraform Cloud).

Maybe it's just best to try it and see.

@wadells wadells requested a review from kbence March 9, 2022 16:48
@tcsc
Contributor Author

tcsc commented Mar 10, 2022

Ha!

Running locally:

❯ terraform init

Initializing the backend...

Initializing provider plugins...
- Finding terraform.clarke.mobi/gravitational/teleport versions matching "~> 8.3"...
- Installing terraform.clarke.mobi/gravitational/teleport v8.3.4...
- Installed terraform.clarke.mobi/gravitational/teleport v8.3.4 (self-signed, key ID 7DA6C64E1701F9E4)

Running on Terraform Cloud:

Error: all connection methods failed identity file could not be decoded: open terraform-identity: no such file or directory
with provider["terraform.clarke.mobi/gravitational/teleport"]
on main.tf line 10, in provider "teleport":

provider "teleport" {

... which means that, even though the actual apply errored out, the TF script got far enough to actually start planning things - meaning that Terraform Cloud was able to locate and install the Teleport provider from my PoC custom registry.

Will update RFD to match.

@tcsc tcsc requested a review from wadells March 28, 2022 11:17
Comment on lines +275 to +278
In order to correctly index the providers, the registry needs to know what
version of the Terraform Plugin API a given provider supports. This is not
something we can deduce from the provider itself, and must be "just known"
in advance.
Contributor

I'd add that those API version upgrades are really rare and always backwards-compatible, as far as I understand. We might even want to stick with an older version for a while when an upgrade happens, to keep things working with older Terraform versions.
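Since the plugin protocol version can't be derived from the provider binary, one way to "just know" it is a small hand-maintained table consulted when the index is generated. A sketch, where the mapping values are hypothetical and would need confirming per release series:

```python
# Hand-maintained mapping from provider release series to the Terraform
# plugin protocol versions they speak. The values below are placeholders,
# not confirmed protocol versions for any real Teleport release.
PLUGIN_PROTOCOLS = {
    "8": ["5.0"],
    "9": ["5.0"],
}

def protocols_for(version: str) -> list:
    """Look up plugin protocol versions by provider major version."""
    major = version.split(".", 1)[0]
    return PLUGIN_PROTOCOLS.get(major, ["5.0"])

print(protocols_for("8.3.4"))  # -> ['5.0']
```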

The Download-Update-Publish cycle for updating the registry index is vulnerable
to corruption if multiple independent updates occur simultaneously.

Unfortunately, AWS S3 does not provide any built-in methods for preventing this
Contributor

I might be wrong, but as far as I remember, S3 has simple object locks with/without timeouts. For sure, they can't replace CAS, but they might help us fail fast and retry if another conflicting build is in progress.

Contributor Author

I looked into AWS locks, and I don't think they're suitable for our purposes. Will take a deeper look and see.

Contributor Author

Yep, S3 object locking is definitely not suitable for this. The best we can do is hand-roll a manual locking mechanism using a DynamoDB table or similar. Will that be better than Drone's synchronization? I have no idea.
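A hand-rolled lock of that sort might look like the sketch below. It models DynamoDB's conditional write with a plain dict purely for illustration; in the real thing the check-and-put would be a single `put_item` guarded by a `ConditionExpression` such as `attribute_not_exists(LockID)`, which makes it atomic (the in-memory version here is not):

```python
import time

class LockHeld(Exception):
    """Raised when another publisher already holds the lock."""

def acquire_lock(table, lock_id, owner, ttl_s=300):
    # With DynamoDB this get+put pair collapses into one conditional
    # put_item; the TTL lets a crashed publisher's lock expire.
    now = time.time()
    existing = table.get(lock_id)
    if existing and existing["expires"] > now:
        raise LockHeld(lock_id)
    table[lock_id] = {"owner": owner, "expires": now + ttl_s}

def release_lock(table, lock_id, owner):
    # Only the owner may release, so a late publisher can't drop
    # someone else's lock.
    if table.get(lock_id, {}).get("owner") == owner:
        del table[lock_id]

# Fail-fast-and-retry: the second publisher sees the lock and backs off.
table = {}
acquire_lock(table, "registry-index", "build-1")
try:
    acquire_lock(table, "registry-index", "build-2")
except LockHeld:
    print("index locked; retry later")  # prints: index locked; retry later
release_lock(table, "registry-index", "build-1")
```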

* ... something else entirely?
2) the provider namespace; options are `gravitational` or `teleport`,
3) the provider name; the obvious choice is `teleport`, but mentioned for
completeness
Contributor

I vote for the shorter domain (easier to remember), and "gravitational" namespace in case we decide to add a provider for something other than Teleport itself.

Contributor

Our helm charts and other assets are already on teleport.dev (.dev is operated by Google):
https://charts.releases.teleport.dev/

Why not teleport.dev/terraform?
Either way, it could be goteleport.com, but in that case we should move the helm charts and other assets to goteleport.com. I think all our assets should be distributed from one domain.
Do you agree?

Contributor Author

Unfortunately teleport.dev/terraform wouldn't match the requirements of the Terraform discovery protocol, but we could certainly do something like terraform.teleport.dev/gravitational/teleport

Contributor

I'd encourage us to stick with the same terraform.releases.teleport.dev style as we have for the current repos for the sake of consistency. The full name terraform.releases.teleport.dev/gravitational/teleport is long and a little redundant, but I think that for something that users are going to copy/paste once and use for a longer period, it's not going to be a huge issue.

tcsc and others added 4 commits April 7, 2022 11:17
Co-authored-by: Russell Jones <russjones@users.noreply.github.com>
Co-authored-by: Gus Luxton <gus@gravitational.com>
Co-authored-by: Gus Luxton <gus@gravitational.com>

In short: which one of these do we want people to use?

1. `terraform.releases.teleport.dev/gravitational/teleport`
Contributor

This one.

@r0mant
Copy link
Contributor

r0mant commented Apr 27, 2022

@tcsc Can you bring this up to date and merge please?

@tcsc tcsc requested a review from zmb3 as a code owner May 2, 2022 11:17
@tcsc tcsc merged commit 683c045 into master May 3, 2022
@tcsc tcsc deleted the tcsc/rfd-2 branch May 3, 2022 07:02