[BUG] - AWS kubernetes resources not fully deleting properly (security group created by eks) #1110

Open
costrouc opened this issue Feb 23, 2022 · 5 comments
Labels
area: tech-debt ⛓️ Items related to paying down tech debt help wanted Extra attention is needed impact: medium 🟨 This item affects some users, not critical provider: AWS type: bug 🐛 Something isn't working

Comments

@costrouc
Member

costrouc commented Feb 23, 2022

OS system and architecture in which you are running QHub

Linux

Expected behavior

All qhub resources should cleanly delete.

[terraform]: │ Error: Plugin error
[terraform]: │ 
[terraform]: │   with module.kubernetes-jupyterhub-ssh.kubernetes_manifest.jupyterhub-sftp-ingress,
[terraform]: │   on modules/kubernetes/services/jupyterhub-ssh/main.tf line 27, in resource "kubernetes_manifest" "jupyterhub-sftp-ingress":
[terraform]: │   27: resource "kubernetes_manifest" "jupyterhub-sftp-ingress" {
[terraform]: │ 
[terraform]: │ The plugin returned an unexpected error from
[terraform]: │ plugin.(*GRPCProvider).ReadResource: rpc error: code = Unknown desc =
[terraform]: │ Unauthorized
[terraform]: ╵
[terraform]: ╷
[terraform]: │ Error: Plugin error
[terraform]: │ 
[terraform]: │   with module.jupyterhub.kubernetes_manifest.jupyterhub,
[terraform]: │   on modules/kubernetes/services/jupyterhub/main.tf line 126, in resource "kubernetes_manifest" "jupyterhub":
[terraform]: │  126: resource "kubernetes_manifest" "jupyterhub" {
[terraform]: │ 
[terraform]: │ The plugin returned an unexpected error from
[terraform]: │ plugin.(*GRPCProvider).ReadResource: rpc error: code = Unknown desc =
[terraform]: │ Unauthorized
[terraform]: ╵
[terraform]: ╷
[terraform]: │ Error: Plugin error
[terraform]: │ 
[terraform]: │   with module.kubernetes-conda-store-server.module.minio.kubernetes_manifest.minio-api,
[terraform]: │   on modules/kubernetes/services/minio/ingress.tf line 1, in resource "kubernetes_manifest" "minio-api":
[terraform]: │    1: resource "kubernetes_manifest" "minio-api" {
[terraform]: │ 
[terraform]: │ The plugin returned an unexpected error from
[terraform]: │ plugin.(*GRPCProvider).ReadResource: rpc error: code = Unknown desc =
[terraform]: │ Unauthorized
[terraform]: ╵
[terraform]: ╷
[terraform]: │ Error: Plugin error
[terraform]: │ 
[terraform]: │   with module.monitoring[0].kubernetes_manifest.grafana-ingress-route,
[terraform]: │   on modules/kubernetes/services/monitoring/main.tf line 122, in resource "kubernetes_manifest" "grafana-ingress-route":
[terraform]: │  122: resource "kubernetes_manifest" "grafana-ingress-route" {
[terraform]: │ 
[terraform]: │ The plugin returned an unexpected error from
[terraform]: │ plugin.(*GRPCProvider).ReadResource: rpc error: code = Unknown desc =
[terraform]: │ Unauthorized
[terraform]: ╵
INFO:qhub.provider.terraform:terraform init directory=stages/06-kubernetes-keycloak-configuration
INFO:qhub.provider.terraform: terraform at /tmp/terraform/1.0.5/terraform
[terraform]: 

See https://github.com/Quansight/qhub-integration-test/runs/5311056863?check_suite_focus=true#step:6:1250 for an example. This is not needed for 0.4.0, but should be resolved in 0.4.1.

[terraform]: module.network.aws_vpc.main: Still destroying... [id=vpc-0d4af13fa907ed7bf, 4m20s elapsed]
[terraform]: module.network.aws_vpc.main: Still destroying... [id=vpc-0d4af13fa907ed7bf, 4m30s elapsed]
[terraform]: module.network.aws_vpc.main: Still destroying... [id=vpc-0d4af13fa907ed7bf, 4m40s elapsed]
[terraform]: module.network.aws_vpc.main: Still destroying... [id=vpc-0d4af13fa907ed7bf, 4m50s elapsed]
[terraform]: ╷
[terraform]: │ Error: error deleting EC2 VPC (vpc-0d4af13fa907ed7bf): DependencyViolation: The vpc 'vpc-0d4af13fa907ed7bf' has dependencies and cannot be deleted.
[terraform]: │ 	status code: 400, request id: 467e3035-bbc1-400e-8880-a766392f1a9e
[terraform]: │ 
[terraform]: │ 
[terraform]: ╵
INFO:qhub.provider.terraform:terraform init directory=stages/01-terraform-state/aws

Actual behavior

Resources do not all properly delete

How to Reproduce the problem?

Run qhub-integration-tests

Command output

No response

Versions and dependencies used.

No response

Compute environment

No response

Integrations

No response

Anything else?

No response

@costrouc costrouc added the type: bug 🐛 Something isn't working label Feb 23, 2022
@costrouc costrouc added this to the Future Release v0.4.x milestone Feb 23, 2022
@costrouc
Member Author

I'm going to push this issue into 0.4.1 or later. I'll explain the rationale. Currently the AWS VPC does not cleanly delete with qhub destroy. There are two reasons for this.

So this issue is a pain, with no great way to clean up properly unless AWS fixes it on their side. Realistically, it should not cause any problems aside from a stray VPC existing (no additional cost). If you want to delete the VPC, go to the AWS console and delete it there; it should delete, with a warning that a security group is still attached.
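
For anyone who would rather script that console cleanup, a minimal sketch with boto3 might look like the following. It assumes the only thing blocking the VPC is one or more leftover non-default security groups (such as the one created by EKS) and that no network interfaces still reference them. The function name `delete_stray_vpc` is hypothetical and not part of qhub, and pagination and error handling are left out.

```python
def delete_stray_vpc(ec2, vpc_id):
    """Delete non-default security groups left in a VPC, then the VPC itself.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    Returns the IDs of the security groups that were deleted.
    """
    groups = ec2.describe_security_groups(
        Filters=[{"Name": "vpc-id", "Values": [vpc_id]}]
    )["SecurityGroups"]

    deleted = []
    for group in groups:
        # The VPC's default group cannot be deleted directly;
        # it is removed together with the VPC.
        if group["GroupName"] == "default":
            continue
        ec2.delete_security_group(GroupId=group["GroupId"])
        deleted.append(group["GroupId"])

    ec2.delete_vpc(VpcId=vpc_id)
    return deleted
```

If a group is still referenced by a network interface or by another group's rules, `delete_security_group` will raise a `DependencyViolation` error, so this only helps in the simple case described above.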

@costrouc costrouc changed the title [BUG] - AWS kubernetes resources not fully deleting properly [BUG] - AWS kubernetes resources not fully deleting properly (security group created by eks) Feb 24, 2022
@magsol
Contributor

magsol commented Feb 24, 2022 via email

@viniciusdc
Contributor

viniciusdc commented May 11, 2022

Hi @costrouc, I haven't run into this recently, but how odd would it be to add an extra removal step during destroy that uses boto to check whether the most painful resources were deleted?

  • EKS load balancers
  • Elastic File System (EFS)
  • S3 buckets (which can be deleted very cleanly using the python cli)
  • EKS clusters and VPC (deleting the VPC seems to also remove the security groups)
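
As a rough illustration of what such a post-destroy check could look like, here is a minimal sketch. It assumes boto3 clients are passed in, that EKS clusters, S3 buckets and EFS file systems carry the project name somewhere in their name (how qhub actually names resources is an assumption here), and that load balancers are matched by VPC ID since Kubernetes-generated names are not predictable. `find_leftovers` is a hypothetical helper, and pagination is ignored.

```python
def find_leftovers(project, vpc_id, eks, s3, efs, elbv2):
    """Report resources that still exist after a destroy.

    `project` is a substring expected in resource names (an assumption);
    `vpc_id` is the VPC the cluster lived in. The remaining arguments are
    boto3 clients for the "eks", "s3", "efs" and "elbv2" services.
    """
    def matches(name):
        return project in (name or "")

    return {
        "eks_clusters": [
            name for name in eks.list_clusters()["clusters"] if matches(name)
        ],
        "s3_buckets": [
            b["Name"] for b in s3.list_buckets()["Buckets"] if matches(b["Name"])
        ],
        "efs_file_systems": [
            fs["FileSystemId"]
            for fs in efs.describe_file_systems()["FileSystems"]
            if matches(fs.get("Name"))  # "Name" is only present when tagged
        ],
        "load_balancers": [
            lb["LoadBalancerName"]
            for lb in elbv2.describe_load_balancers()["LoadBalancers"]
            if lb.get("VpcId") == vpc_id
        ],
    }
```

With real credentials the clients would come from `boto3.client("eks")`, `boto3.client("s3")`, and so on. Classic load balancers are served by the separate `elb` client and would need their own check.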

@webdog

webdog commented May 13, 2022

@costrouc @viniciusdc 👋 I found this thread by your link-back to the terraform-aws module issue I opened.

You might want to have a look at this terraform mini module I released a while back and have been using internally for a couple of months. During the terraform destroy, the module removes the load balancers that are stuck because of stray ENIs (which is what blocks the deletion of subnets and security groups): https://github.com/webdog/terraform-kubernetes-delete-eni

At minimum, the shell script can be taken from the module, if the terraform module doesn't make sense to use. Cheers!
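
To see whether stray ENIs are what is holding a VPC open, a quick inspection sketch in Python with boto3 could look like this. It is not taken from the module linked above; `list_vpc_enis` is a hypothetical helper that only lists interfaces and deletes nothing, and it ignores pagination.

```python
def list_vpc_enis(ec2, vpc_id):
    """Return (eni_id, status, description) for each network interface in a VPC.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    response = ec2.describe_network_interfaces(
        Filters=[{"Name": "vpc-id", "Values": [vpc_id]}]
    )
    return [
        (eni["NetworkInterfaceId"], eni["Status"], eni.get("Description", ""))
        for eni in response["NetworkInterfaces"]
    ]
```

The description field usually hints at which service created an interface, which helps in tracing it back to a leftover load balancer.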

@aktech
Member

aktech commented Feb 8, 2024

@dcmcand dcmcand added the area: tech-debt ⛓️ Items related to paying down tech debt label Feb 8, 2024
Projects
Status: New 🚦
Development

No branches or pull requests

8 participants