
[create-doks-with-terraform-flux] Avoid querying DOKS cluster metadata in the main TF module, via the digitalocean_kubernetes_cluster data source #25

Open
v-ctiutiu opened this issue Jan 28, 2022 · 5 comments
Labels: bug

@v-ctiutiu
Contributor

Overview

This combination seems to behave like a poison pill:

data "digitalocean_kubernetes_cluster" "primary" {
  name = var.doks_cluster_name
  depends_on = [
    digitalocean_kubernetes_cluster.primary
  ]
}

When used with the following provider:

provider "kubernetes" {
  host  = data.digitalocean_kubernetes_cluster.primary.endpoint
  token = data.digitalocean_kubernetes_cluster.primary.kube_config[0].token
  cluster_ca_certificate = base64decode(
    data.digitalocean_kubernetes_cluster.primary.kube_config[0].cluster_ca_certificate
  )
}

When you spin up a cluster for the first time, the above combination works. But subsequent runs of terraform plan fail with:

Error: Get "http://localhost/api/v1/namespaces/flux-system": dial tcp [::1]:80: connect: connection refused
│ 
│   with module.doks_flux_cd.kubernetes_namespace.flux_system,
│   on .terraform/modules/doks_flux_cd/create-doks-with-terraform-flux/main.tf line 52, in resource "kubernetes_namespace" "flux_system":
│   52: resource "kubernetes_namespace" "flux_system" {

My assumption is that it has to do with how Terraform evaluates resources, providers, data sources, etc. It seems that on subsequent runs, after the DOKS cluster is created, the depends_on condition causes the digitalocean_kubernetes_cluster data source to not re-evaluate, or to not return valid data. If the kubernetes provider doesn't receive a valid Kubernetes cluster configuration from the remote, it defaults to localhost instead, which explains the connection-refused error above.

On the other hand, we don't need to look up data using the digitalocean_kubernetes_cluster data source at all. The digitalocean_kubernetes_cluster resource already exposes everything we need after successful creation.

Proposed Solution

Avoid the lookup via the digitalocean_kubernetes_cluster data source, and rely on the digitalocean_kubernetes_cluster resource instead.
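
A minimal sketch of what that could look like, reading the connection details directly from the resource (the resource label primary is taken from the snippet above; endpoint and kube_config are documented attributes of the digitalocean_kubernetes_cluster resource):

provider "kubernetes" {
  # Read connection details from the resource itself, so they always
  # reflect the cluster that was just created (or recreated).
  host  = digitalocean_kubernetes_cluster.primary.endpoint
  token = digitalocean_kubernetes_cluster.primary.kube_config[0].token
  cluster_ca_certificate = base64decode(
    digitalocean_kubernetes_cluster.primary.kube_config[0].cluster_ca_certificate
  )
}

This also removes the need for the explicit depends_on, since the provider configuration now depends on the resource attributes directly.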

@v-ctiutiu
Contributor Author

We'll keep this open. Changing the node count works as expected now.

On the other hand, I'm able to reproduce the issue again after the fix. This time, when I change the cluster region or the node pool size, the same thing happens.

More than that, the digitalocean provider and Terraform should detect that the DOKS cluster must be recreated, but they don't. I tried every possible thing, like splitting the main configuration code into submodules, keeping the providers in separate modules, or inheriting them from the root module - still nothing!

I also followed the official kubernetes example from the digitalocean TF provider repo - the issue still reproduces.

Interestingly though, if I use a random name for the cluster, it behaves as it should (see the sketch below). But this seems like a workaround to me. Some users are complaining about the same issue on the official repo as well.
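
For reference, a minimal sketch of that workaround using the hashicorp/random provider (the variable names and node pool settings here are assumptions for illustration, not the module's actual configuration):

resource "random_pet" "cluster_name" {
  # Generates a name like "<prefix>-amusing-walrus"; a new name is produced
  # whenever this resource is replaced, forcing a fresh cluster name.
  prefix = var.doks_cluster_name
}

resource "digitalocean_kubernetes_cluster" "primary" {
  name    = random_pet.cluster_name.id
  region  = var.doks_cluster_region  # assumed variable
  version = var.doks_cluster_version # assumed variable

  node_pool {
    name       = "default"
    size       = "s-2vcpu-4gb" # assumed droplet size
    node_count = 3
  }
}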

@ramwolken9

ramwolken9 commented Apr 7, 2022

Hi @v-ctiutiu
I'm facing the same kind of issue. Can you please help me fix this?

module.doks_flux_cd.github_repository_file.install: Refreshing state... /clusters/do/development/flux-system/gotk-components.yaml
╷
│ Error: serializer for text/html; charset=utf-8 doesn't exist
│
│   with module.doks_flux_cd.kubernetes_namespace.flux_system,
│   on .terraform/modules/doks_flux_cd/create-doks-with-terraform-flux/main.tf line 52, in resource "kubernetes_namespace" "flux_system":
│   52: resource "kubernetes_namespace" "flux_system" {
│
╵

@ramwolken9

I got this issue while running terraform plan -out starter_kit_flux_cluster.out to update doks_cluster_pool_size.

@v-ctiutiu
Contributor Author

Hi @ramwolken9,

I'm not sure if it's the same issue, but can you share some more details please? Like the Kubernetes version and Terraform version you're using, and maybe some other relevant information or steps to help me reproduce the issue first.

Are there any other moving parts in your setup? What I want to know here is whether you changed anything else in the TF module itself (like the Flux CD provider version), or whether you changed anything by hand in the Flux CD system configuration on the Kubernetes cluster.

Thanks.

@ramwolken9

@v-ctiutiu Thanks! You are correct, the issue was due to a direct modification of the cluster resource from the DOKS dashboard.
