[BUG] - Terraform errors out with connection to localhost. Connection refused (possibly when changing general node-group size) #1124

Closed
MaxTechniche opened this issue Feb 26, 2022 · 2 comments · Fixed by #1246
Assignees: iameskild
Labels: needs: follow-up 📫 · needs: investigation 🔍 · provider: Digital Ocean · type: bug 🐛

Comments

MaxTechniche commented Feb 26, 2022

OS system and architecture in which you are running QHub

Linux-ubuntu (Digitalocean)

Expected behavior

New node-group size should be deployed successfully.

Actual behavior

Terraform returns `Error: Get "http://localhost/api/v1/namespaces/dev": dial tcp [::1]:80: connect: connection refused`, then exits the CI/CD redeployment.
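
The request in that error goes to `localhost` rather than to the cluster's API endpoint. One common cause of this pattern is a Terraform Kubernetes provider that ran without resolved cluster credentials, but that is an assumption here and is not confirmed anywhere in this thread. As a minimal sketch for spotting the same signature in other CI logs, the snippet below scans log text for refused connections aimed at a loopback host. The function name `find_loopback_refusals` and the regular expression are illustrative and are not part of QHub or Terraform.

```python
import re

# Matches the Go-style dial error that Terraform printed in this report.
# The pattern is an assumption based on that single log line.
DIAL_ERROR = re.compile(
    r'Get "(?P<url>https?://(?P<host>[^/"]+)[^"]*)": '
    r'dial tcp (?P<addr>\S+): connect: connection refused'
)

# Host names that mean "this machine" rather than a remote cluster.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "[::1]"}


def find_loopback_refusals(log_text):
    """Return (url, dialed_address) pairs for refused loopback connections."""
    hits = []
    for match in DIAL_ERROR.finditer(log_text):
        # Drop an optional trailing ":port" before comparing the host.
        host = re.sub(r":\d+$", "", match.group("host"))
        if host in LOOPBACK_HOSTS:
            hits.append((match.group("url"), match.group("addr")))
    return hits


if __name__ == "__main__":
    line = (
        '[terraform]: │ Error: Get "http://localhost/api/v1/namespaces/dev": '
        "dial tcp [::1]:80: connect: connection refused"
    )
    for url, addr in find_loopback_refusals(line):
        print(f"loopback request to {url} (dialed {addr})")
```

A non-empty result only says that something tried to reach the Kubernetes API on the local machine. It does not identify which resource or provider block is responsible.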

How to Reproduce the problem?

It happened when we attempted to update the Kubernetes droplet/node-group size in qhub-config.yaml.

Command output

<detail>
Running with gitlab-runner 14.8.0~beta.44.g57df0d52 (57df0d52)
  on blue-5.shared.runners-manager.gitlab.com/default -AzERasQ
Resolving secrets 00:00
Preparing the "docker+machine" executor 00:32
Using Docker executor with image python:3.9 ...
Pulling docker image python:3.9 ...
Using docker image sha256:4819be0df94257e1e31cd64dda12d46ff8b2180f9576ad9eaf98dcac9d70d9f9 for python:3.9 with digest python@sha256:e8f55f9674b1e0a6eb7fba009e66169ffeaea9918fb8ecf635b158d5ed382ac6 ...
Preparing environment 00:03
Running on runner--azerasq-project-33526645-concurrent-0 via runner-azerasq-shared-1645818727-2b4b4bd9...
Getting source from Git repository 00:02
$ eval "$CI_PRE_CLONE_SCRIPT"
Fetching changes with git depth set to 20...
teams/automation-and-integration/openteams_qhub/.git/
Created fresh repository.
Checking out 65a877f2 as main...
Skipping Git submodules setup
Executing "step_script" stage of the job script 00:38
Using docker image sha256:4819be0df94257e1e31cd64dda12d46ff8b2180f9576ad9eaf98dcac9d70d9f9 for python:3.9 with digest python@sha256:e8f55f9674b1e0a6eb7fba009e66169ffeaea9918fb8ecf635b158d5ed382ac6 ...
$ git remote set-url origin https://gitlab-ci-token:${CI_PUSH_TOKEN}@gitlab.com/openteams/automation-and-integration/openteams_qhub.git
$ git config --global user.email 'qhub@quansight.com'
$ git config --global user.name 'github action'
$ git checkout "main"
Switched to a new branch 'main'
Branch 'main' set up to track remote branch 'main' from 'origin'.
$ pip install qhub==0.3.14
Collecting qhub==0.3.14
  Downloading qhub-0.3.14.tar.gz (243 kB)
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Collecting azure-identity==1.6.1
  Downloading azure_identity-1.6.1-py2.py3-none-any.whl (109 kB)
Collecting bcrypt
  Downloading bcrypt-3.2.0-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (61 kB)
Collecting azure-mgmt-containerservice==16.2.0
  Downloading azure_mgmt_containerservice-16.2.0-py2.py3-none-any.whl (1.8 MB)
Collecting pydantic
  Downloading pydantic-1.9.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.2 MB)
Collecting cookiecutter==1.7.2
  Downloading cookiecutter-1.7.2-py2.py3-none-any.whl (34 kB)
Collecting ruamel.yaml
  Downloading ruamel.yaml-0.17.21-py3-none-any.whl (109 kB)
Collecting pynacl
  Downloading PyNaCl-1.5.0-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (856 kB)
Collecting cloudflare
  Downloading cloudflare-2.8.15.tar.gz (70 kB)
Collecting auth0-python
  Downloading auth0_python-3.20.0-py2.py3-none-any.whl (111 kB)
Collecting gitignore-parser==0.0.8
  Downloading gitignore_parser-0.0.8.tar.gz (4.0 kB)
Collecting azure-core<2.0.0,>=1.0.0
  Downloading azure_core-1.22.1-py3-none-any.whl (178 kB)
Collecting msal-extensions~=0.3.0
  Downloading msal_extensions-0.3.1-py2.py3-none-any.whl (18 kB)
Collecting msal<2.0.0,>=1.7.0
  Downloading msal-1.17.0-py2.py3-none-any.whl (79 kB)
Collecting six>=1.12.0
  Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting cryptography>=2.1.4
  Downloading cryptography-36.0.1-cp36-abi3-manylinux_2_24_x86_64.whl (3.6 MB)
Collecting azure-mgmt-core<2.0.0,>=1.2.0
  Downloading azure_mgmt_core-1.3.0-py2.py3-none-any.whl (25 kB)
Collecting msrest>=0.6.21
  Downloading msrest-0.6.21-py2.py3-none-any.whl (85 kB)
Collecting azure-common~=1.1
  Downloading azure_common-1.1.28-py2.py3-none-any.whl (14 kB)
Collecting jinja2-time>=0.2.0
  Downloading jinja2_time-0.2.0-py2.py3-none-any.whl (6.4 kB)
Collecting requests>=2.23.0
  Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB)
Collecting click>=7.0
  Downloading click-8.0.4-py3-none-any.whl (97 kB)
Collecting poyo>=0.5.0
  Downloading poyo-0.5.0-py2.py3-none-any.whl (10 kB)
Collecting MarkupSafe<2.0.0
  Downloading MarkupSafe-1.1.1-cp39-cp39-manylinux2010_x86_64.whl (32 kB)
Collecting Jinja2<3.0.0
  Downloading Jinja2-2.11.3-py2.py3-none-any.whl (125 kB)
Collecting binaryornot>=0.4.4
  Downloading binaryornot-0.4.4-py2.py3-none-any.whl (9.0 kB)
Collecting python-slugify>=4.0.0
  Downloading python_slugify-6.1.0-py2.py3-none-any.whl (9.2 kB)
Collecting chardet>=3.0.2
  Downloading chardet-4.0.0-py2.py3-none-any.whl (178 kB)
Collecting cffi>=1.12
  Downloading cffi-1.15.0-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (444 kB)
Collecting pycparser
  Downloading pycparser-2.21-py2.py3-none-any.whl (118 kB)
Collecting arrow
  Downloading arrow-1.2.2-py3-none-any.whl (64 kB)
Collecting PyJWT[crypto]<3,>=1.0.0
  Downloading PyJWT-2.3.0-py3-none-any.whl (16 kB)
Collecting portalocker<3,>=1.0
  Downloading portalocker-2.4.0-py2.py3-none-any.whl (16 kB)
Collecting certifi>=2017.4.17
  Downloading certifi-2021.10.8-py2.py3-none-any.whl (149 kB)
Collecting requests-oauthlib>=0.5.0
  Downloading requests_oauthlib-1.3.1-py2.py3-none-any.whl (23 kB)
Collecting isodate>=0.6.0
  Downloading isodate-0.6.1-py2.py3-none-any.whl (41 kB)
Collecting text-unidecode>=1.3
  Downloading text_unidecode-1.3-py2.py3-none-any.whl (78 kB)
Collecting urllib3<1.27,>=1.21.1
  Downloading urllib3-1.26.8-py2.py3-none-any.whl (138 kB)
Collecting charset-normalizer~=2.0.0
  Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB)
Collecting idna<4,>=2.5
  Downloading idna-3.3-py3-none-any.whl (61 kB)
Collecting oauthlib>=3.0.0
  Downloading oauthlib-3.2.0-py3-none-any.whl (151 kB)
Collecting python-dateutil>=2.7.0
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting pyyaml
  Downloading PyYAML-6.0-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (661 kB)
Collecting jsonlines
  Downloading jsonlines-3.0.0-py3-none-any.whl (8.5 kB)
Collecting beautifulsoup4
  Downloading beautifulsoup4-4.10.0-py3-none-any.whl (97 kB)
Collecting soupsieve>1.2
  Downloading soupsieve-2.3.1-py3-none-any.whl (37 kB)
Collecting attrs>=19.2.0
  Downloading attrs-21.4.0-py2.py3-none-any.whl (60 kB)
Collecting typing-extensions>=3.7.4.3
  Downloading typing_extensions-4.1.1-py3-none-any.whl (26 kB)
Collecting ruamel.yaml.clib>=0.2.6
  Downloading ruamel.yaml.clib-0.2.6-cp39-cp39-manylinux1_x86_64.whl (539 kB)
Building wheels for collected packages: qhub, gitignore-parser, cloudflare
  Building wheel for qhub (PEP 517): started
  Building wheel for qhub (PEP 517): finished with status 'done'
  Created wheel for qhub: filename=qhub-0.3.14-py3-none-any.whl size=322903 sha256=02684007f54ccf9f48a2162d477a8e35eaf8b33b084cf35389ff1a7ead7abce2
  Stored in directory: /root/.cache/pip/wheels/0f/ea/ae/ba9f95452715d7ba88409429ef6e10728638cc1d58d12856da
  Building wheel for gitignore-parser (setup.py): started
  Building wheel for gitignore-parser (setup.py): finished with status 'done'
  Created wheel for gitignore-parser: filename=gitignore_parser-0.0.8-py3-none-any.whl size=3857 sha256=1ab297a2a77c2e53d8ff4d87e9d3ef349a5ab9f1f7a614eae05682b1a311ff02
  Stored in directory: /root/.cache/pip/wheels/d8/85/8a/e24e8605bc7b09937a840dd4556fe7591894b9214483ab1fd7
  Building wheel for cloudflare (setup.py): started
  Building wheel for cloudflare (setup.py): finished with status 'done'
  Created wheel for cloudflare: filename=cloudflare-2.8.15-py3-none-any.whl size=59313 sha256=9e7ad58b63266c2997615d66c5f03f2b6c67dd304cb09d9f5460a85d248a102b
  Stored in directory: /root/.cache/pip/wheels/fb/58/1f/d7aee9644b6b695afd58f30ecad68110d7e9315b64578cd3f2
Successfully built qhub gitignore-parser cloudflare
Installing collected packages: pycparser, cffi, urllib3, six, PyJWT, idna, cryptography, charset-normalizer, certifi, requests, python-dateutil, oauthlib, MarkupSafe, text-unidecode, soupsieve, requests-oauthlib, portalocker, msal, Jinja2, isodate, chardet, azure-core, attrs, arrow, typing-extensions, ruamel.yaml.clib, pyyaml, python-slugify, poyo, msrest, msal-extensions, jsonlines, jinja2-time, click, binaryornot, beautifulsoup4, azure-mgmt-core, azure-common, ruamel.yaml, pynacl, pydantic, gitignore-parser, cookiecutter, cloudflare, bcrypt, azure-mgmt-containerservice, azure-identity, auth0-python, qhub
Successfully installed Jinja2-2.11.3 MarkupSafe-1.1.1 PyJWT-2.3.0 arrow-1.2.2 attrs-21.4.0 auth0-python-3.20.0 azure-common-1.1.28 azure-core-1.22.1 azure-identity-1.6.1 azure-mgmt-containerservice-16.2.0 azure-mgmt-core-1.3.0 bcrypt-3.2.0 beautifulsoup4-4.10.0 binaryornot-0.4.4 certifi-2021.10.8 cffi-1.15.0 chardet-4.0.0 charset-normalizer-2.0.12 click-8.0.4 cloudflare-2.8.15 cookiecutter-1.7.2 cryptography-36.0.1 gitignore-parser-0.0.8 idna-3.3 isodate-0.6.1 jinja2-time-0.2.0 jsonlines-3.0.0 msal-1.17.0 msal-extensions-0.3.1 msrest-0.6.21 oauthlib-3.2.0 portalocker-2.4.0 poyo-0.5.0 pycparser-2.21 pydantic-1.9.0 pynacl-1.5.0 python-dateutil-2.8.2 python-slugify-6.1.0 pyyaml-6.0 qhub-0.3.14 requests-2.27.1 requests-oauthlib-1.3.1 ruamel.yaml-0.17.21 ruamel.yaml.clib-0.2.6 six-1.16.0 soupsieve-2.3.1 text-unidecode-1.3 typing-extensions-4.1.1 urllib3-1.26.8
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
WARNING: You are using pip version 21.2.4; however, version 22.0.3 is available.
You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.
$ qhub deploy --config qhub-config.yaml --disable-prompt --skip-remote-state-provision
INFO:qhub.deploy:All qhub endpoints will be under https://qhub.openteams.com
INFO:qhub.provider.terraform:terraform init directory=infrastructure
INFO:qhub.provider.terraform:downloading and extracting terraform binary from url=https://releases.hashicorp.com/terraform/1.0.5/terraform_1.0.5_linux_amd64.zip to path=/tmp/terraform/1.0.5/terraform
INFO:qhub.provider.terraform: terraform at /tmp/terraform/1.0.5/terraform
[terraform]: Initializing modules...
[terraform]: - forwardauth in modules/kubernetes/forwardauth
[terraform]: - kubernetes in modules/digitalocean/kubernetes
[terraform]: - kubernetes-conda-store-mount in modules/kubernetes/nfs-mount
[terraform]: - kubernetes-conda-store-server in modules/kubernetes/services/conda-store
[terraform]: - kubernetes-ingress in modules/kubernetes/ingress
[terraform]: - kubernetes-initialization in modules/kubernetes/initialization
[terraform]: - kubernetes-nfs-mount in modules/kubernetes/nfs-mount
[terraform]: - kubernetes-nfs-server in modules/kubernetes/nfs-server
[terraform]: - monitoring in modules/kubernetes/services/monitoring
[terraform]: - qhub in modules/kubernetes/services/meta/qhub
[terraform]: - qhub.external-container-reg in modules/kubernetes/services/extcr
[terraform]: - qhub.kubernetes-dask-gateway in modules/kubernetes/services/dask-gateway
[terraform]: - qhub.kubernetes-jupyterhub in modules/kubernetes/services/jupyterhub
[terraform]: - qhub.kubernetes-jupyterhub-ssh in modules/kubernetes/services/jupyterhub-ssh
[terraform]: 
[terraform]: Initializing the backend...
[terraform]: 
[terraform]: Successfully configured the backend "s3"! Terraform will automatically
[terraform]: use this backend unless the backend configuration changes.
[terraform]: 
[terraform]: Initializing provider plugins...
[terraform]: - Finding hashicorp/kubernetes versions matching "2.3.2"...
[terraform]: - Finding hashicorp/kubernetes-alpha versions matching "0.3.2"...
[terraform]: - Finding digitalocean/digitalocean versions matching "2.14.0"...
[terraform]: - Finding latest version of hashicorp/random...
[terraform]: - Finding latest version of hashicorp/tls...
[terraform]: - Finding hashicorp/helm versions matching "2.1.2"...
[terraform]: - Installing hashicorp/helm v2.1.2...
[terraform]: - Installed hashicorp/helm v2.1.2 (signed by HashiCorp)
[terraform]: - Installing hashicorp/kubernetes v2.3.2...
[terraform]: - Installed hashicorp/kubernetes v2.3.2 (signed by HashiCorp)
[terraform]: - Installing hashicorp/kubernetes-alpha v0.3.2...
[terraform]: - Installed hashicorp/kubernetes-alpha v0.3.2 (signed by HashiCorp)
[terraform]: - Installing digitalocean/digitalocean v2.14.0...
[terraform]: - Installed digitalocean/digitalocean v2.14.0 (signed by a HashiCorp partner, key ID F82037E524B9C0E8)
[terraform]: - Installing hashicorp/random v3.1.0...
[terraform]: - Installed hashicorp/random v3.1.0 (signed by HashiCorp)
[terraform]: - Installing hashicorp/tls v3.1.0...
[terraform]: - Installed hashicorp/tls v3.1.0 (signed by HashiCorp)
[terraform]: 
[terraform]: Partner and community providers are signed by their developers.
[terraform]: If you'd like to know more about provider signing, you can read about it here:
[terraform]: https://www.terraform.io/docs/cli/plugins/signing.html
[terraform]: 
[terraform]: Terraform has created a lock file .terraform.lock.hcl to record the provider
[terraform]: selections it made above. Include this file in your version control repository
[terraform]: so that Terraform can guarantee to make the same selections by default when
[terraform]: you run "terraform init" in the future.
[terraform]: 
[terraform]: ╷
[terraform]: │ Warning: Additional provider information from registry
[terraform]: │ 
[terraform]: │ The remote registry returned warnings for
[terraform]: │ registry.terraform.io/hashicorp/kubernetes-alpha:
[terraform]: │ - Please do not rely on this provider for production use while we strive
[terraform]: │ towards project maturity.
[terraform]: │ https://github.com/hashicorp/terraform-provider-kubernetes-alpha#experimental-status
[terraform]: ╵
[terraform]: 
[terraform]: Terraform has been successfully initialized!
[terraform]: 
[terraform]: You may now begin working with Terraform. Try running "terraform plan" to see
[terraform]: any changes that are required for your infrastructure. All Terraform commands
[terraform]: should now work.
[terraform]: 
[terraform]: If you ever set or change modules or backend configuration for Terraform,
[terraform]: rerun this command to reinitialize your working directory. If you forget, other
[terraform]: commands will detect it and remind you to do so if necessary.
INFO:qhub.provider.terraform:terraform init took 6.765 [s]
INFO:qhub.provider.terraform:terraform apply directory=infrastructure targets=['module.kubernetes', 'module.kubernetes-initialization']
INFO:qhub.provider.terraform: terraform at /tmp/terraform/1.0.5/terraform
[terraform]: module.kubernetes.digitalocean_kubernetes_cluster.main: Refreshing state... [id=1bd99a83-c497-490e-8900-a0f713aa1452]
[terraform]: module.kubernetes.digitalocean_kubernetes_node_pool.main[0]: Refreshing state... [id=b97cd9ab-89f3-4dc0-b25e-90e6e7e63da1]
[terraform]: module.kubernetes.digitalocean_kubernetes_node_pool.main[1]: Refreshing state... [id=340cea47-1e1d-4785-9687-a9a4f6a63bb0]
[terraform]: module.kubernetes-initialization.kubernetes_namespace.main: Refreshing state... [id=dev]
[terraform]: ╷
[terraform]: │ Warning: Resource targeting is in effect
[terraform]: │ 
[terraform]: │ You are creating a plan with the -target option, which means that the
[terraform]: │ result of this plan may not represent all of the changes requested by the
[terraform]: │ current configuration.
[terraform]: │ 
[terraform]: │ The -target option is not for routine use, and is provided only for
[terraform]: │ exceptional situations such as recovering from errors or mistakes, or when
[terraform]: │ Terraform specifically suggests to use it as part of an error message.
[terraform]: ╵
[terraform]: ╷
[terraform]: │ Error: Get "http://localhost/api/v1/namespaces/dev": dial tcp [::1]:80: connect: connection refused
[terraform]: │ 
[terraform]: │   with module.kubernetes-initialization.kubernetes_namespace.main,
[terraform]: │   on modules/kubernetes/initialization/main.tf line 1, in resource "kubernetes_namespace" "main":
[terraform]: │    1: resource "kubernetes_namespace" "main" {
[terraform]: │ 
[terraform]: ╵
Problem encountered: Terraform error
Cleaning up project directory and file based variables 00:01
ERROR: Job failed: exit code 1
</detail>

Versions and dependencies used.

conda==4.10.3 configured via qhub
kubernetes==1.21.9-do.0 configured via qhub
qhub==0.3.14

Compute environment

Digital Ocean

Integrations

No response

Anything else?

@iameskild

@MaxTechniche MaxTechniche added the type: bug 🐛 Something isn't working label Feb 26, 2022
@iameskild iameskild self-assigned this Feb 26, 2022
@trallard trallard added needs: follow-up 📫 Someone needs to get back to this issue or PR needs: investigation 🔍 Someone in the team needs to find the root cause and replicate this bug provider: Digital Ocean labels Feb 28, 2022
@trallard trallard moved this to Needs Triage 🔍 in QHub Project Mangement 🚀 Feb 28, 2022
@trallard trallard moved this from Needs Triage 🔍 to Follow-up 📣 in QHub Project Mangement 🚀 Feb 28, 2022
@iameskild
Member

Hi @MaxTechniche, sorry that I haven't been able to get to this yet. I will try to recreate this issue tomorrow and we can go from there! Thank you for your patience.

@iameskild
Member

Hey @MaxTechniche, so the good news is that I was able to reproduce the error. The bad news is that after many attempts to get it to work using purely QHub commands, I was not able to.

I did find a few "hacky" workarounds, but all of them get reverted by the following deployment.

I'll keep investigating.

Repository owner moved this from Follow-up 📣 to Done 💪🏾 in QHub Project Mangement 🚀 Apr 29, 2022