This repository has been archived by the owner on Feb 5, 2020. It is now read-only.

Tectonic console is not accessible when deployed with calico networking #3124

Open
yuanlinios opened this issue Mar 17, 2018 · 1 comment

Comments

@yuanlinios

What keywords did you search in tectonic-installer issues before filing this one?

calico, console

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

Tectonic version (release or commit hash):

1.8.7-tectonic.2

Terraform version (terraform version):

0.10.7

Platform (aws|azure|openstack|metal|vmware):

metal

What happened?

I installed Tectonic on bare metal with Calico networking. The deployment succeeded (I can access the cluster with the kubectl CLI without any issue), but the Tectonic console is not accessible: nginx returns a 504 Gateway Timeout.

What you expected to happen?

The tectonic console should be accessible.

How to reproduce it (as minimally and precisely as possible)?

My terraform.tfvars:

tectonic_base_domain = "lab.local"
tectonic_ca_cert = "..."
tectonic_ca_key = "..."
tectonic_cluster_cidr = "10.2.0.0/16"
tectonic_cluster_name = "tectonic"
tectonic_container_linux_channel = "stable"
tectonic_container_linux_version = "1632.3.0"
tectonic_custom_ca_pem_list = ["...", "...", "..."]
tectonic_license_path = "/path/to/tectonic-license.txt"
tectonic_metal_controller_domain = "controller.lab.local"
tectonic_metal_controller_domains = ["core01.lab.local", "core06.lab.local"]
tectonic_metal_controller_macs = ["xx:xx:xx:xx:xx:xx", "xx:xx:xx:xx:xx:xx"]
tectonic_metal_controller_names = ["core01", "core06"]
tectonic_metal_ingress_domain = "tectonic.lab.local"
tectonic_metal_matchbox_ca = "..."
tectonic_metal_matchbox_client_cert = "..."
tectonic_metal_matchbox_client_key = "..."
tectonic_metal_matchbox_http_url = "http://matchbox.lab.local:8080"
tectonic_metal_matchbox_rpc_endpoint = "matchbox.lab.local:8081"
tectonic_metal_worker_domains = ["core02.lab.local", "core07.lab.local", "core03.lab.local", "core08.lab.local", "core04.lab.local", "core09.lab.local"]
tectonic_metal_worker_macs = ["xx:xx:xx:xx:xx:xx", "xx:xx:xx:xx:xx:xx", "xx:xx:xx:xx:xx:xx", "xx:xx:xx:xx:xx:xx", "xx:xx:xx:xx:xx:xx", "xx:xx:xx:xx:xx:xx"]
tectonic_metal_worker_names = ["core02", "core07", "core03", "core08", "core04", "core09"]
tectonic_metal_calico_mtu = 1500
tectonic_networking = "calico"
tectonic_ntp_servers = ["ntp.lab.local"]
tectonic_pull_secret_path = "/path/to/config.json.txt"
tectonic_service_cidr = "10.3.0.0/16"
tectonic_ssh_authorized_key = "..."
tectonic_vanilla_k8s = false

After the cluster is deployed, I cannot open the Tectonic console (nginx returns a 504 Gateway Timeout), although I can access the cluster with the kubectl CLI. Output of some simple checks:

kubectl get node
NAME                   STATUS    ROLES     AGE       VERSION
core01a.lab.local   Ready     master    6h        v1.8.7+coreos.0
core02a.lab.local   Ready     node      6h        v1.8.7+coreos.0
core03a.lab.local   Ready     node      6h        v1.8.7+coreos.0
core04a.lab.local   Ready     node      6h        v1.8.7+coreos.0
core06a.lab.local   Ready     master    6h        v1.8.7+coreos.0
core07a.lab.local   Ready     node      6h        v1.8.7+coreos.0
core08a.lab.local   Ready     node      6h        v1.8.7+coreos.0
core09a.lab.local   Ready     node      6h        v1.8.7+coreos.0

kubectl -n tectonic-system get deployment
NAME                                    DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
alm-operator                            1         1         1            1           6h
catalog-operator                        1         1         1            1           6h
container-linux-update-operator         1         1         1            1           6h
default-http-backend                    1         1         1            1           6h
etcd-operator                           1         1         1            1           6h
grafana                                 1         1         1            1           6h
kube-state-metrics                      1         1         1            1           6h
kube-version-operator                   1         1         1            1           6h
prometheus-operator                     1         1         1            1           6h
tectonic-alm-operator                   1         1         1            1           6h
tectonic-channel-operator               1         1         1            1           6h
tectonic-cluo-operator                  1         1         1            1           6h
tectonic-console                        2         2         2            2           49m
tectonic-identity                       2         2         2            2           6h
tectonic-monitoring-auth-alertmanager   1         1         1            1           6h
tectonic-monitoring-auth-grafana        1         1         1            1           6h
tectonic-monitoring-auth-prometheus     1         1         1            1           6h
tectonic-prometheus-operator            1         1         1            1           6h
tectonic-stats-emitter                  1         1         1            1           6h

Try from outside with curl -I:

curl -k -I https://tectonic.lab.local:443
HTTP/1.1 200 OK
Server: nginx/1.13.6
Date: Sat, 17 Mar 2018 12:57:10 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
Vary: Accept-Encoding
Set-Cookie: tectonic-affinity=779c2141783d79cac81caca05f637ffd447abbe8; Path=/; HttpOnly
Set-Cookie: csrf-token=7dhTLo46saEvQxw5OUQGc5skJY87KCfA+aXl6tsPI2GW6Qnr2fAa1t/e/fc641RZUD6hVS34gXKt71bZHimksA==; Path=/; Secure
Strict-Transport-Security: max-age=15724800; includeSubDomains;

Logs from the ingress controller show that the upstream console service is not reachable:

kubectl -n tectonic-system logs tectonic-ingress-controller-XXXX 

2018/03/17 06:39:35 [error] 532#532: *10 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.65.150.30, server: tectonic.lab.local, request: "GET / HTTP/1.1", upstream: "http://10.2.0.6:8080/", host: "tectonic.lab.local"
2018/03/17 06:40:35 [error] 532#532: *10 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.65.150.30, server: tectonic.lab.local, request: "GET / HTTP/1.1", upstream: "http://10.2.4.7:8080/", host: "tectonic.lab.local"
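
For reference, here is a minimal Python sketch of pulling the timed-out upstream addresses out of log lines like the two above and checking them against tectonic_cluster_cidr. The helper names (timed_out_upstreams, in_cidr) are hypothetical and only for illustration; the sample text is the two log entries copied from above.

```python
import ipaddress
import re

# Matches the upstream field of an nginx error-log line,
# e.g. upstream: "http://10.2.0.6:8080/"
UPSTREAM_RE = re.compile(r'upstream: "https?://(?P<host>[^:/"]+):(?P<port>\d+)')


def timed_out_upstreams(log_text):
    """Return unique (host, port) pairs from 'upstream timed out' lines, in order of appearance."""
    endpoints = []
    for line in log_text.splitlines():
        if "upstream timed out" not in line:
            continue
        match = UPSTREAM_RE.search(line)
        if match:
            endpoint = (match.group("host"), int(match.group("port")))
            if endpoint not in endpoints:
                endpoints.append(endpoint)
    return endpoints


def in_cidr(host, cidr):
    """True if host is an IP address inside the given CIDR block."""
    return ipaddress.ip_address(host) in ipaddress.ip_network(cidr)


# The two ingress controller log entries quoted in this issue.
SAMPLE_LOG = '''
2018/03/17 06:39:35 [error] 532#532: *10 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.65.150.30, server: tectonic.lab.local, request: "GET / HTTP/1.1", upstream: "http://10.2.0.6:8080/", host: "tectonic.lab.local"
2018/03/17 06:40:35 [error] 532#532: *10 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.65.150.30, server: tectonic.lab.local, request: "GET / HTTP/1.1", upstream: "http://10.2.4.7:8080/", host: "tectonic.lab.local"
'''

# 10.2.0.0/16 is the tectonic_cluster_cidr value from terraform.tfvars.
for host, port in timed_out_upstreams(SAMPLE_LOG):
    print(host, port, in_cidr(host, "10.2.0.0/16"))
```

Both upstream addresses fall inside the cluster CIDR, which suggests the timeouts occur on the pod network path between the ingress controller and the console pods.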

I deployed some pods to validate connectivity across hosts and found no problems. The troubleshooting guide at https://coreos.com/tectonic/docs/latest/troubleshooting/troubleshooting.html did not help in my case.

The Tectonic console is accessible without any issue when the cluster is deployed with flannel.

@RonnyMaas

I had issues with Calico as well. I fixed them with a firewall rule allowing IP protocol 4 (IP-in-IP) and a lower MTU (1480), because the tunnel adds an extra 20-byte header to each packet.

I'm not sure whether this fixes your problem as well, but I see that your terraform.tfvars sets tectonic_metal_calico_mtu = 1500.
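
The MTU arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name tunnel_mtu is hypothetical, and the 20 bytes assume an outer IPv4 header without options, as added by IP-in-IP encapsulation.

```python
# Outer IPv4 header added by IP-in-IP encapsulation (assumes no IP options).
IPV4_HEADER_BYTES = 20


def tunnel_mtu(underlay_mtu, overhead=IPV4_HEADER_BYTES):
    """Largest inner packet that still fits in one underlay packet after encapsulation."""
    if underlay_mtu <= overhead:
        raise ValueError("underlay MTU must exceed the encapsulation overhead")
    return underlay_mtu - overhead


# With a 1500-byte underlay MTU, the tunnel MTU works out to 1480.
print(tunnel_mtu(1500))
```

If the physical network uses a different MTU, the same subtraction applies to that value.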
