vizier-db and vizier-core cannot running #2513

Closed
Chris-Paul-Li opened this issue Feb 20, 2019 · 3 comments

@Chris-Paul-Li

Environment:
JDCloud
kubeflow v0.4.1

When I run "${KUBEFLOW_SRC}/scripts/kfctl.sh apply k8s", the vizier-db and vizier-core pods never reach the Running state:

kubectl -n kubeflow get pods
NAME READY STATUS RESTARTS AGE
ambassador-dbd958d59-2wbfn 1/1 Running 0 6h
ambassador-dbd958d59-l5drm 1/1 Running 0 7h
ambassador-dbd958d59-p2v5x 1/1 Running 0 7h
argo-ui-5767d9df4d-8cbj5 1/1 Running 0 7h
centraldashboard-bc8f94fd4-tq895 1/1 Running 0 6h
jupyter-0 1/1 Running 1 6h
katib-ui-846c5747bd-7zjrp 1/1 Running 0 6h
metacontroller-0 1/1 Running 0 7h
minio-65f87f878f-g5kdt 0/1 Pending 0 7h
ml-pipeline-86c74b985c-c7x8l 0/1 ImagePullBackOff 0 7h
ml-pipeline-86db479d7-95fgx 0/1 ImagePullBackOff 0 6h
ml-pipeline-persistenceagent-67f7c88598-xq8x4 0/1 ImagePullBackOff 0 6h
ml-pipeline-persistenceagent-7ccc698898-lkqxq 0/1 ImagePullBackOff 0 7h
ml-pipeline-scheduledworkflow-589cd766b-fvbdb 0/1 ImagePullBackOff 0 6h
ml-pipeline-scheduledworkflow-6cd4b586bb-g6z67 0/1 ImagePullBackOff 0 7h
ml-pipeline-ui-5f6d9559f6-xgqk9 0/1 ImagePullBackOff 0 7h
ml-pipeline-ui-846cf9c488-9hktj 0/1 ImagePullBackOff 0 6h
mysql-79b576796-tqrck 0/1 Pending 0 7h
pytorch-operator-6d86c657bc-cjgm6 1/1 Running 0 6h
studyjob-controller-89cc94b64-n6xzz 1/1 Running 0 6h
tf-job-dashboard-dc8b95b66-mt5c4 1/1 Running 0 6h
tf-job-operator-v1beta1-95948df74-665fq 1/1 Running 0 6h
vizier-core-b6c778464-2x6qt 0/1 CrashLoopBackOff 131 6h
vizier-core-rest-f8577c54-sq7k9 1/1 Running 0 6h
vizier-db-588d5cd7cb-z8v9j 0/1 Pending 0 3m
vizier-suggestion-bayesianoptimization-789b65558b-j7x9z 1/1 Running 0 6h
vizier-suggestion-grid-68ff96c4b6-fljcd 1/1 Running 0 6h
vizier-suggestion-hyperband-77b67f56d8-cvh2x 1/1 Running 0 6h
vizier-suggestion-random-79cdc5cfd-r7xvs 1/1 Running 0 6h
workflow-controller-657666fd66-vknhk 1/1 Running 0 7h

root@kubeflow-test-tianqi:~/kubeflow/ks_app/ks_app# kubectl -n kubeflow describe pod vizier-db-588d5cd7cb-z8v9j

**

Name: vizier-db-588d5cd7cb-z8v9j
Namespace: kubeflow
Node:
Labels: app=vizier
component=db
pod-template-hash=1448178376
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"kubeflow","name":"vizier-db-588d5cd7cb","uid":"c0cf47b2-34da-11e9-845a-fa163e4656...
Status: Pending
IP:
Controlled By: ReplicaSet/vizier-db-588d5cd7cb
Containers:
vizier-db:
Image: mysql:8.0.3
Port: 3306/TCP
Args:
--datadir
/var/lib/mysql/datadir
Readiness: exec [/bin/bash -c mysql -D $$MYSQL_DATABASE -p$$MYSQL_ROOT_PASSWORD -e 'SELECT 1'] delay=5s timeout=1s period=2s #success=1 #failure=3
Environment:
MYSQL_ROOT_PASSWORD: <set to the key 'MYSQL_ROOT_PASSWORD' in secret 'vizier-db-secrets'> Optional: false
MYSQL_ALLOW_EMPTY_PASSWORD: true
MYSQL_DATABASE: vizier
Mounts:
/var/lib/mysql from katib-mysql (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-pt7pz (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
katib-mysql:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: katib-mysql
ReadOnly: false
default-token-pt7pz:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-pt7pz
Optional: false
QoS Class: BestEffort
Node-Selectors:
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute for 300s
node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message


Warning FailedScheduling 1m (x72 over 22m) default-scheduler PersistentVolumeClaim is not bound: "katib-mysql" (repeated 4 times)
**

The error is: Warning FailedScheduling 1m (x72 over 22m) default-scheduler PersistentVolumeClaim is not bound: "katib-mysql" (repeated 4 times)

**
root@kubeflow-test-tianqi:~/kubeflow/ks_app/ks_app# kubectl -n kubeflow get pv
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
katib-mysql         10Gi       RWO            Retain           Available                                   7h
pipeline-minio-pv   10Gi       RWO            Retain           Available                                   7h
pipeline-mysql-pv   10Gi       RWO            Retain           Available                                   7h
vizier-pv           20Gi       RWO            Retain           Available                                   20m
root@kubeflow-test-tianqi:~/kubeflow/ks_app/ks_app# kubectl -n kubeflow get pvc
NAME             STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
123-workspace    Pending                                      default        7h
katib-mysql      Pending                                      default        7h
minio-pv-claim   Pending                                      default        7h
mysql-pv-claim   Pending                                      default        7h

**

Describing the katib-mysql PVC gives:

**
root@kubeflow-test-tianqi:~/kubeflow/ks_app/ks_app# kubectl -n kubeflow describe pvc katib-mysql
Name: katib-mysql
Namespace: kubeflow
StorageClass: default
Status: Pending
Volume:
Labels: app=katib
app.kubernetes.io/deploy-manager=ksonnet
ksonnet.io/component=katib
Annotations: ksonnet.io/managed={"pristine":"H4sIAAAAAAAA/zyOsU40MQyE+/8xpt4fuHZbCioEojgKROHNDii6JM7FXhA65d1RFkE3ns/67AukxiObRS2Y8XHAhFMsK2Y8jtacxY+atszbJDFjQqbLKi6YL0iyMNlIUitmnMTjMhSmpdCvol4HzVULi//hPqFI5u/8P3/Z...
volume.beta.kubernetes.io/storage-provisioner=kubernetes.io/jdcloud-ebs
Finalizers: []
Capacity:
Access Modes:
Events:
Type Reason Age From Message


Warning ProvisioningFailed 4m (x1883 over 7h) persistentvolume-controller Failed to provision volume with StorageClass "default": claim.Spec.Selector is not supported for dynamic provisioning on AWS
**
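
For readers hitting the same event, two Kubernetes binding rules seem to explain the output above: a claim only binds to an existing PV of the same storage class, and a claim that sets a selector is not dynamically provisioned. The PVs listed here have an empty STORAGECLASS while the claims ask for "default". The following is a minimal Python sketch of that check, not the real controller logic. The function name, the dictionary layout, and the selector labels are assumptions made for illustration (the actual selector is not shown in this thread), and capacity and access modes are ignored.

```python
def diagnose_pvc(pvc, pvs):
    """Return reasons why `pvc` would stay Pending (empty list = a PV can bind).

    Simplification: an unset storage class and "" are treated the same.
    """
    claim_class = pvc.get("storageClassName") or ""
    wanted_labels = pvc.get("selectorLabels") or {}

    def matches(pv):
        # Static binding needs the same class and every selected label.
        same_class = (pv.get("storageClassName") or "") == claim_class
        pv_labels = pv.get("labels") or {}
        labels_ok = all(pv_labels.get(k) == v for k, v in wanted_labels.items())
        return same_class and labels_ok

    if any(matches(pv) for pv in pvs):
        return []

    reasons = [f"no existing PV has storage class {claim_class!r} and the selected labels"]
    if claim_class and wanted_labels:
        # A claim with a selector is not handed to the dynamic provisioner.
        reasons.append("dynamic provisioning is skipped because the claim sets a selector")
    return reasons


# Values mirror the `get pv` / `get pvc` output; the selector labels are hypothetical.
claim = {"name": "katib-mysql", "storageClassName": "default",
         "selectorLabels": {"app": "katib"}}
volumes = [{"name": "katib-mysql", "storageClassName": "", "labels": {"app": "katib"}}]

for reason in diagnose_pvc(claim, volumes):
    print(reason)
```

Under these assumptions both reasons are reported, which is consistent with the claim staying Pending while the PV stays Available.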

@pdmack
Member

pdmack commented Feb 20, 2019

Hmmm, can you switch to a different StorageClass? Or manually create a PV?
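
As a rough illustration of the "manually create a PV" option, here is a small Python sketch that builds a PersistentVolume manifest whose storage class matches the one the claim requests. Everything specific in it is an assumption: the helper name, the labels, and the hostPath location are hypothetical, and hostPath is only suitable for single-node testing. The size and reclaim policy are copied from the existing katib-mysql PV.

```python
import json


def pv_for_claim(name, storage_class, size, labels, host_path):
    """Build a PersistentVolume manifest that a claim of `storage_class` can bind to."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Must equal the claim's storageClassName for static binding.
            "storageClassName": storage_class,
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Retain",
            # Hypothetical backing store, for illustration only.
            "hostPath": {"path": host_path},
        },
    }


manifest = pv_for_claim("katib-mysql", "default", "10Gi",
                        {"app": "katib"}, "/mnt/katib-mysql")
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and passed to kubectl apply -f, since kubectl accepts JSON manifests as well as YAML.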

@Chris-Paul-Li
Author

@pdmack Thank you. After changing the StorageClass, the problem was solved.

@ghost

ghost commented Mar 7, 2019

@Chris-Paul-Li I'm having the same issue with my initial deployment. Where did you change the storage class, and what did you change it to?
