Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds support for waiting for default service account in ns resource #2119

Conversation

kschoche
Copy link
Contributor

@kschoche kschoche commented May 25, 2023

We are experiencing some flaky tests over in hashicorp/vault-secrets-operator in our project when running on low resource CI systems where we experience a race between the creation of a namespace resource and the ability to reference the default serviceaccount that Kubernetes creates for every namespace. The result is that the namespace is created by tf and dependent resources on the namespace (which also depend on the default serviceaccount) will fail to be created.

Some example code which we've seen hit this issue:

resource "kubernetes_namespace" "tenant-1" {
  metadata {
    name = var.k8s_test_namespace
  }
}

resource "kubernetes_secret" "default-sa-secret" {
  metadata {
    namespace = kubernetes_namespace.tenant-1.metadata[0].name
    name      = "default-sa-secret"
    annotations = {
      "kubernetes.io/service-account.name" = "default"
    }
  }
  type                           = "kubernetes.io/service-account-token"
  wait_for_service_account_token = true
}

This results in tests that fail like:

estVaultAuthMethods 2023-05-17T14:25:47Z retry.go:99: Returning due to fatal error: FatalError{Underlying: error while running command: exit status 1; ╷
│ Error: secrets "default-sa-secret" not found
│ 
│   with kubernetes_secret.default-sa,
│   on main.tf line 39, in resource "kubernetes_secret" "default-sa":
│   39: resource "kubernetes_secret" "default-sa" {
│ 
╵}
    apply.go:15: 
        	Error Trace:	/home/runner/go/pkg/mod/github.com/gruntwork-io/terratest@v0.41.19/modules/terraform/apply.go:15
        	            				/home/runner/work/vault-secrets-operator/vault-secrets-operator/test/integration/vaultauthmethods_integration_test.go:91
        	Error:      	Received unexpected error:
        	            	FatalError{Underlying: error while running command: exit status 1; ╷
        	            	│ Error: secrets "default-sa-secret" not found
        	            	│ 
        	            	│   with kubernetes_secret.default-sa,
        	            	│   on main.tf line 39, in resource "kubernetes_secret" "default-sa":
        	            	│   39: resource "kubernetes_secret" "default-sa" {
        	            	│ 
        	            	╵}
        	Test:       	TestVaultAuthMethods

The reason that it fails is that kubernetes will refuse to create a Secret which references a serviceaccount via annotation when the serviceaccount does not exist.

This is similar to #1943

Some workarounds which do work

  • Deploy your own namespace+serviceaccount+secret and not use the default serviceaccount.
  • Successfully used the kubernetes_default_service_account resource to provide the expected behaviour.

We've also attempted to use data to reference the default service account but it also hits the same issue:

https://github.com/hashicorp/vault-secrets-operator/actions/runs/5082362220/jobs/9132077465
│ Error: Unable to fetch service account from Kubernetes: serviceaccounts "default" not found
│ 
│   with data.kubernetes_service_account.default,
│   on variables.tf line 4, in data "kubernetes_service_account" "default":
│    4: data "kubernetes_service_account" "default" {
│ 
╵}
    apply.go:15: 
        	Error Trace:	/home/runner/go/pkg/mod/github.com/gruntwork-io/terratest@v0.41.25/modules/terraform/apply.go:15
        	            				/home/runner/work/vault-secrets-operator/vault-secrets-operator/test/integration/vaultauthmethods_integration_test.go:93
        	Error:      	Received unexpected error:
        	            	FatalError{Underlying: error while running command: exit status 1; ╷
        	            	│ Error: Unable to fetch service account from Kubernetes: serviceaccounts "default" not found
        	            	│ 
        	            	│   with data.kubernetes_service_account.default,
        	            	│   on variables.tf line 4, in data "kubernetes_service_account" "default":
        	            	│    4: data "kubernetes_service_account" "default" {
        	            	│ 
        	            	╵}

This PR adds the optional ability to wait for the default service account to be created by Kubernetes when creating a namespace resource:

resource "kubernetes_namespace" "test" {
  metadata {
    name = "%s"
  }
  wait_for_default_service_account = true
}

Description

Acceptance tests

  • Have you added an acceptance test for the functionality being added?
  • Have you run the acceptance tests on this branch?

Output from acceptance testing:

demo $ KUBE_CONFIG_PATH=~/.kube/config make testacc TESTARGS='-run=TestAccKubernetesNamespace'
==> Checking that code complies with gofmt requirements...
go vet .
TF_ACC=1 go test "/Users/kyle/go/src/github.com/kschoche/terraform-provider-kubernetes/kubernetes" -v -run=TestAccKubernetesNamespace -timeout 3h
=== RUN   TestAccKubernetesNamespace_basic
--- PASS: TestAccKubernetesNamespace_basic (13.60s)
=== RUN   TestAccKubernetesNamespace_default_service_account
--- PASS: TestAccKubernetesNamespace_default_service_account (8.78s)
=== RUN   TestAccKubernetesNamespace_generatedName
--- PASS: TestAccKubernetesNamespace_generatedName (8.18s)
=== RUN   TestAccKubernetesNamespace_withSpecialCharacters
--- PASS: TestAccKubernetesNamespace_withSpecialCharacters (7.91s)
=== RUN   TestAccKubernetesNamespace_deleteTimeout
--- PASS: TestAccKubernetesNamespace_deleteTimeout (7.91s)
PASS
ok      github.com/hashicorp/terraform-provider-kubernetes/kubernetes   47.057s
...

Release Note

Release note for CHANGELOG:

Adds `wait_for_default_service_account` to namespace resource which will wait for the `default` kubernetes serviceaccount to be created by Kubernetes.

References

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Copy link
Collaborator

@jrhouston jrhouston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left you 1 comment but overall this makes sense – you'll need to add a line to the documentation page for kubernetes_namespace resource with your new attribute.

Comment on lines 67 to 71
// FIXME: Since the creation of the Default ServiceAccount is asynchronous to the Namespace
// We need to wait some period of time, using something like d.Timeout(schema.TimeoutCreate)
// is not long enough for a Kind cluster to pass without timing out (about 10 seconds).
// Using an undefined word here forces the timeout to use the defaultTimeout of 20mins.
// I do not consider this production ready.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the logic here. The schema for the kubernetes_namespace resource doesn't specify a create timeout, so if you use schema.TimeoutCreate you'll get the default of 20 minutes anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I re-tested it and it does work as you mentioned, so this must've been a mistake on my end.
Thanks!

@kschoche kschoche marked this pull request as ready for review June 6, 2023 20:56
@kschoche kschoche requested a review from a team as a code owner June 6, 2023 20:56
@jrhouston jrhouston merged commit 5bc27bc into hashicorp:main Jun 7, 2023
@kschoche kschoche deleted the VAULT-16125/add_wait_for_default_service_account_to_namespace_resource branch June 7, 2023 18:15
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 7, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants