This repository has been archived by the owner on Feb 16, 2019. It is now read-only.

Istio on GKE installation instructions produce broken cluster #262

Open
danderson opened this issue Mar 30, 2018 · 14 comments
Comments

@danderson

Is this a BUG or FEATURE REQUEST?:

Bug.

Did you review https://istio.io/help/ and existing issues to identify if this is already solved or being worked on?:

Yes, reviewed. No, doesn't help.

Bug:
Y

What Version of Istio and Kubernetes are you using, where did you get Istio from, Installation details

Version: 0.6.0
GitRevision: 2cb09cdf040a8573330a127947b11e5082619895
User: root@a28f609ab931
Hub: docker.io/istio
GolangVersion: go1.9
BuildStatus: Clean
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:55:54Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9+", GitVersion:"v1.9.2-gke.1", GitCommit:"4ce7af72d8d343ea2f7680348852db641ff573af", GitTreeState:"clean", BuildDate:"2018-01-31T22:30:55Z", GoVersion:"go1.9.2b4", Compiler:"gc", Platform:"linux/amd64"}

Is Istio Auth enabled or not ?
Deployed a GKE cluster with istio, via Deployment Manager, per https://istio.io/docs/setup/kubernetes/quick-start-gke-dm.html . All deployment manager tweakables and checkboxes left at their default value, I just clicked straight through to "deploy".

What happened:
After much waiting, the "config waiter" stage of the deployment times out and fails.

The GKE cluster is up and working, but the Istio sidecar injector is looping on trying (and failing) to start.

$ kubectl get po -n istio-system
NAME                                      READY     STATUS              RESTARTS   AGE
grafana-89f97d9c-6lkmp                    1/1       Running             0          9m
istio-ca-59f6dcb7d9-wwc2x                 1/1       Running             0          19m
istio-ingress-56dd45597b-6qpbz            1/1       Running             0          19m
istio-mixer-7f5dcf8db4-kzlpm              3/3       Running             0          19m
istio-pilot-7ddb95dc8f-lsr8b              2/2       Running             0          19m
istio-sidecar-injector-7947777478-kthf9   0/1       ContainerCreating   0          19m
prometheus-cf8456855-dt66q                1/1       Running             0          9m
servicegraph-59ff5dbbff-t7s5x             1/1       Running             0          9m
zipkin-7988c559b7-m82z8                   1/1       Running             0          9m

It would appear that there is a missing secret, and after 30+ minutes nothing seems to be interested in creating that secret:

$ kubectl describe -n istio-system po istio-sidecar-injector-7947777478-kthf9
Name:           istio-sidecar-injector-7947777478-kthf9
Namespace:      istio-system
Node:           gke-istio-cluster-default-pool-03b26a19-f9p9/10.128.0.6
Start Time:     Fri, 30 Mar 2018 14:11:57 -0700
Labels:         istio=sidecar-injector
                pod-template-hash=3503333034
Annotations:    <none>
Status:         Pending
IP:             
Controlled By:  ReplicaSet/istio-sidecar-injector-7947777478
Containers:
  webhook:
    Container ID:  
    Image:         docker.io/istio/sidecar_injector:0.6.0
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Args:
      --tlsCertFile=/etc/istio/certs/cert.pem
      --tlsKeyFile=/etc/istio/certs/key.pem
      --injectConfig=/etc/istio/inject/config
      --meshConfig=/etc/istio/config/mesh
      --healthCheckInterval=2s
      --healthCheckFile=/health
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Liveness:       exec [/usr/local/bin/sidecar-injector probe --probe-path=/health --interval=2s] delay=4s timeout=1s period=4s #success=1 #failure=3
    Readiness:      exec [/usr/local/bin/sidecar-injector probe --probe-path=/health --interval=2s] delay=4s timeout=1s period=4s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /etc/istio/certs from certs (ro)
      /etc/istio/config from config-volume (ro)
      /etc/istio/inject from inject-config (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from istio-sidecar-injector-service-account-token-5kxkh (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      istio
    Optional:  false
  certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  sidecar-injector-certs
    Optional:    false
  inject-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      istio-inject
    Optional:  false
  istio-sidecar-injector-service-account-token-5kxkh:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  istio-sidecar-injector-service-account-token-5kxkh
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age                From                                                   Message
  ----     ------                 ----               ----                                                   -------
  Normal   Scheduled              20m                default-scheduler                                      Successfully assigned istio-sidecar-injector-7947777478-kthf9 to gke-istio-cluster-default-pool-03b26a19-f9p9
  Normal   SuccessfulMountVolume  20m                kubelet, gke-istio-cluster-default-pool-03b26a19-f9p9  MountVolume.SetUp succeeded for volume "inject-config"
  Normal   SuccessfulMountVolume  20m                kubelet, gke-istio-cluster-default-pool-03b26a19-f9p9  MountVolume.SetUp succeeded for volume "config-volume"
  Normal   SuccessfulMountVolume  20m                kubelet, gke-istio-cluster-default-pool-03b26a19-f9p9  MountVolume.SetUp succeeded for volume "istio-sidecar-injector-service-account-token-5kxkh"
  Warning  FailedMount            2m (x17 over 20m)  kubelet, gke-istio-cluster-default-pool-03b26a19-f9p9  MountVolume.SetUp failed for volume "certs" : secrets "sidecar-injector-certs" not found
  Warning  FailedMount            46s (x9 over 18m)  kubelet, gke-istio-cluster-default-pool-03b26a19-f9p9  Unable to mount volumes for pod "istio-sidecar-injector-7947777478-kthf9_istio-system(faf13902-345e-11e8-a19e-42010a8000c0)": timeout expired waiting for volumes to attach/mount for pod "istio-system"/"istio-sidecar-injector-7947777478-kthf9". list of unattached/unmounted volumes=[certs]
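
The FailedMount events above name the missing secret directly. As a rough sketch (the helper name `missing_secrets` is made up for illustration, and it assumes the event wording shown in this output), the secret names can be pulled out of `kubectl describe` output like this:

```shell
#!/bin/sh
# Print the name of every Secret that an event reports as "not found".
# Reads `kubectl describe pod` output on stdin; the match depends on the
# event wording shown above (secrets "<name>" not found).
missing_secrets() {
  sed -n 's/.*secrets "\([^"]*\)" not found.*/\1/p' | sort -u
}

# Hypothetical usage against a live cluster:
#   kubectl describe -n istio-system po <pod-name> | missing_secrets
#   kubectl get secret -n istio-system sidecar-injector-certs
```

If the second command reports that the secret does not exist, whatever step was supposed to create it did not run or did not succeed.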

What you expected to happen:

Istio on GKE should install correctly when following official instructions.

How to reproduce it:

Follow installation instructions at https://istio.io/docs/setup/kubernetes/quick-start-gke-dm.html .

@danderson
Author

This smells a lot like #261 , in that the failure mode is identical - the secret is never created. I find this surprising, because it implies that deployment manager is using a v1.10 kubectl under the hood, which I did not believe to be true.

@jsenon
Member

jsenon commented Apr 3, 2018

If you are stuck, it works with kubectl version 1.9.

@linsun
Member

linsun commented Apr 4, 2018

Agreed, it does sound like #261. cc @ayj for triage.

@ayj ayj removed their assignment Apr 4, 2018
@ayj

ayj commented Apr 4, 2018

cc @selmanj - it looks like the DM config might need to be updated to account for a change in behavior in kubectl v1.10 (see kubernetes/kubectl#384 and istio/issues#261).

@ayj ayj self-assigned this Apr 4, 2018
@selmanj

selmanj commented Apr 4, 2018

The DM template has an apt-get update && apt-get install -y git curl kubectl step at the beginning, so it's likely it's using the affected version of kubectl.

We could either force the install to use a previous version or patch ./install/kubernetes/webhook-patch-ca-bundle.sh. @ayj what do you think?

@ayj

ayj commented Apr 4, 2018

install/kubernetes/webhook-patch-ca-bundle.sh is going away in 0.8. @yusuoh replaced it with automatic cert provisioning using Istio CA.

If this needs to be patched for 0.7.0, pinning to a specific version might be easiest. Otherwise you'll need to add conditional checks in the scripts to optionally prepend version/kind info.
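
A conditional check of that kind could look roughly like the sketch below. It reads "version/kind info" as the top-level `apiVersion` and `kind` fields of a manifest, which is an assumption; the function name `ensure_type_meta` is hypothetical and not part of the Istio install scripts.

```shell
#!/bin/sh
# Read a manifest on stdin. If it has no top-level apiVersion or kind line,
# prepend the values given as arguments; otherwise pass it through unchanged.
# Assumption: the missing "version/kind info" means these two fields.
ensure_type_meta() {
  api_version=$1
  kind=$2
  manifest=$(cat)
  if ! printf '%s\n' "$manifest" | grep -q '^apiVersion:'; then
    printf 'apiVersion: %s\n' "$api_version"
  fi
  if ! printf '%s\n' "$manifest" | grep -q '^kind:'; then
    printf 'kind: %s\n' "$kind"
  fi
  printf '%s\n' "$manifest"
}

# Hypothetical usage in a pipeline that feeds `kubectl apply -f -`:
#   <command that prints the manifest> | ensure_type_meta v1 Secret | kubectl apply -f -
```

Because the fields are only added when absent, the same script would keep working with kubectl versions that already emit them.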

@selmanj

selmanj commented Apr 4, 2018

I don't think the DM template is updated for 0.7.0. I'll do that in a separate PR.

Let me see about pinning to a previous kubectl to avoid this issue for now.

@selmanj

selmanj commented Apr 5, 2018

I'm unable to reproduce the issue. As an additional item, kubectl installed on the instance is version 1.7.5, not 1.10, so it seems the issue must be something else.

jsselman@istio-cluster-2-istio-cluster-2-vm:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5", GitCommit:"17d7182a7ccbb167074be7a87f0a68bd00d58d97", GitTreeState:"clean", BuildDate:"2017-08-31T09:14:02Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
The connection to the server localhost:8080 was refused - did you specify the right host or port?
jsselman@istio-cluster-2-istio-cluster-2-vm:~$ apt-cache showpkg kubectl
Package: kubectl
Versions: 
1.7.5-00 (/var/lib/apt/lists/packages.cloud.google.com_apt_dists_cloud-sdk-jessie_main_binary-amd64_Packages) (/var/lib/dpkg/status)
 Description Language: 
                 File: /var/lib/apt/lists/packages.cloud.google.com_apt_dists_cloud-sdk-jessie_main_binary-amd64_Packages
                  MD5: fb58ab85a9089d0257cb8f7cda7d5a09
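
For anyone wanting to script the same check, here is a small sketch that extracts the client version from `kubectl version` output in the format shown above. The function names are hypothetical, and treating every v1.10.x client as affected is an assumption that this thread does not confirm.

```shell
#!/bin/sh
# Print the client GitVersion (for example v1.7.5) from `kubectl version`
# output on stdin. Only the "Client Version:" line is considered.
client_git_version() {
  sed -n 's/^Client Version:.*GitVersion:"\([^"]*\)".*/\1/p'
}

# Succeed if the given version string is a v1.10.x client.
# Assumption: all v1.10.x clients show the changed behavior.
is_affected_client() {
  case "$1" in
    v1.10.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical usage on the installer VM:
#   v=$(kubectl version 2>/dev/null | client_git_version)
#   is_affected_client "$v" && echo "kubectl $v may hit this issue"
```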

@selmanj

selmanj commented Apr 5, 2018

I did notice a few issues with the script itself; for example it's using a version of GKE that is no longer supported, and the debian image used for the installer vm is outdated. Will send out PRs to fix those before continuing to investigate.

@selmanj

selmanj commented Apr 5, 2018

Update: after resolving the issues on my local branch, I was able to reproduce the issue. It DOES seem to be caused by the affected kubectl version. (I'm not sure how I didn't run into it when I previously looked; maybe it was due to the older image.)

I'll update the script to use the previously-released kubectl and then send out a PR.
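
One way to pin the version through apt is sketched below. This is not taken from the actual PR: the helper name is hypothetical, the `-00` revision suffix is an assumption based on the `1.7.5-00` entry in the apt-cache output earlier in this thread, and whether the desired version is published in that apt repository would need to be checked first.

```shell
#!/bin/sh
# Build an apt package spec that pins kubectl to a single release.
# Assumption: packages in this repository use a "-00" revision suffix,
# as in the "1.7.5-00" entry shown by apt-cache earlier in the thread.
kubectl_pkg_spec() {
  printf 'kubectl=%s-00\n' "$1"
}

# Hypothetical use in the installer's setup step:
#   apt-get update && apt-get install -y git curl "$(kubectl_pkg_spec 1.9.6)"
```

`apt-get install` accepts a `package=version` argument, so the only change to the existing install step would be replacing the bare `kubectl` package name.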

@renperez

renperez commented Apr 5, 2018

I'm running kubectl 1.10.0 and I am seeing this error:
Unable to mount volumes for pod "istio-sidecar-injector-6ff9fb5698-82kpg_istio-system(58d89fa2-3929-11e8-a486-069e57407dac)": timeout expired waiting for volumes to attach/mount for pod "istio-system"/"istio-sidecar-injector-6ff9fb5698-82kpg". list of unattached/unmounted volumes=[certs]

MountVolume.SetUp failed for volume "certs" : secrets "sidecar-injector-certs" not found

@renperez

renperez commented Apr 5, 2018

==> v2beta1/HorizontalPodAutoscaler
NAME           REFERENCE                 TARGETS  MINPODS  MAXPODS  REPLICAS  AGE
istio-ingress  Deployment/istio-ingress  / 80%    2        8        2         3h

==> v1beta1/Deployment
NAME                    DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
istio-ingress           2        2        2           2          3h
istio-mixer             1        1        1           1          3h
istio-pilot             1        1        1           1          3h
istio-ca                1        1        1           1          3h
istio-sidecar-injector  1        1        1           0          3h

==> v1/Pod(related)
NAME                                     READY  STATUS             RESTARTS  AGE
istio-ingress-6d448d77f-ngcr4            1/1    Running            0         3h
istio-ingress-6d448d77f-rzdkk            1/1    Running            0         3h
istio-mixer-84bcf5f54-89rwr              3/3    Running            0         3h
istio-pilot-6fdbbb5456-7llw8             2/2    Running            0         3h
istio-ca-994b7849-cqj5j                  1/1    Running            0         3h
istio-sidecar-injector-6ff9fb5698-82kpg  0/1    ContainerCreating  0         7m

istio-merge-robot pushed a commit to istio/istio that referenced this issue Apr 6, 2018
Automatic merge from submit-queue.

Update to latest supported GKE version

According to https://cloud.google.com/kubernetes-engine/release-notes, the current version used is no longer supported; here we update to the next available 1.9 version.

Related to istio/old_issues_repo#262

/cc @ayj
@renperez

renperez commented Apr 7, 2018

Any update on this? This is also broken with the helm install.

@selmanj

selmanj commented Apr 7, 2018

Once #4781 is merged in, the GKE template should work; I'll let someone else comment regarding the helm install.

istio-merge-robot pushed a commit to istio/istio that referenced this issue Apr 11, 2018
Automatic merge from submit-queue.

Use kubectl 1.9.6 in GCP Deployment Manager install

Works around issue discovered in istio/old_issues_repo#262 by forcing an earlier version of kubectl. I also updated the debian image to a family to avoid a warning in the DM UI.

Also see #4759 which uses a more recent version of GKE.

/cc @ayj
@ayj ayj removed their assignment Apr 25, 2018