deployment on libvirt hanging forever #393

Closed
karmab opened this issue Oct 2, 2018 · 6 comments

Comments

@karmab
Contributor

karmab commented Oct 2, 2018

Versions

Tectonic version (release or commit hash):

41077a9d818756c16ffbf073499a7ccef0a79c60

Terraform version (terraform version):

Terraform v0.11.8

Platform (aws|libvirt):

libvirt

What happened?

Deployed with libvirt, but the deployment never finishes.

What you expected to happen?

Deployment completes successfully.

How to reproduce it (as minimally and precisely as possible)?

deploy with libvirt

Anything else we need to know?

I deployed with a customized libvirt URI, qemu+ssh://root@192.168.122.1/system,
and used Red Hat CoreOS release 3.10 for the image.
That part worked well, i.e. the bootstrap and master0 nodes are there with the appropriate IPs, but the deployment is stuck on the following:

-- Logs begin at Tue 2018-10-02 19:58:10 UTC. --
Oct 02 20:19:14 testk-bootstrap bootkube.sh[21652]: Found Machine Config Operator's image: registry.svc.ci.openshift.org/openshift/origin-v4.0-20181002182152@sha256:d2c602936193f81a808d477b065c01dab18d15914b2ec71c380f310d4a44b441
Oct 02 20:19:14 testk-bootstrap bootkube.sh[21652]: Rendering MCO manifests...
Oct 02 20:19:15 testk-bootstrap tectonic.sh[888]: kubectl --namespace kube-system get pods --output custom-columns=STATUS:.status.phase,NAME:.metadata.name --no-headers=true failed. Retrying in 5 seconds...
Oct 02 20:19:15 testk-bootstrap bootkube.sh[21652]: Trying to pull registry.svc.ci.openshift.org/openshift/origin-v4.0-20181002182152@sha256:d2c602936193f81a808d477b065c01dab18d15914b2ec71c380f310d4a44b441...Failed
Oct 02 20:19:15 testk-bootstrap bootkube.sh[21652]: unable to find image: unable to pull registry.svc.ci.openshift.org/openshift/origin-v4.0-20181002182152@sha256:d2c602936193f81a808d477b065c01dab18d15914b2ec71c380f310d4a44b441
Oct 02 20:19:15 testk-bootstrap systemd[1]: bootkube.service: main process exited, code=exited, status=125/n/a
Oct 02 20:19:15 testk-bootstrap systemd[1]: Unit bootkube.service entered failed state.
Oct 02 20:19:15 testk-bootstrap systemd[1]: bootkube.service failed.
Oct 02 20:19:20 testk-bootstrap tectonic.sh[888]: kubectl --namespace kube-system get pods --output custom-columns=STATUS:.status.phase,NAME:.metadata.name --no-headers=true failed. Retrying in 5 seconds...
Oct 02 20:19:21 testk-bootstrap systemd[1]: bootkube.service holdoff time over, scheduling restart.
Oct 02 20:19:24 testk-bootstrap systemd[1]: Started Bootstrap a Kubernetes cluster.
Oct 02 20:19:24 testk-bootstrap systemd[1]: Starting Bootstrap a Kubernetes cluster...
Oct 02 20:19:25 testk-bootstrap bootkube.sh[21833]: Found Machine Config Operator's image: registry.svc.ci.openshift.org/openshift/origin-v4.0-20181002182152@sha256:d2c602936193f81a808d477b065c01dab18d15914b2ec71c380f310d4a44b441
Oct 02 20:19:25 testk-bootstrap bootkube.sh[21833]: Rendering MCO manifests...
Oct 02 20:19:26 testk-bootstrap tectonic.sh[888]: kubectl --namespace kube-system get pods --output custom-columns=STATUS:.status.phase,NAME:.metadata.name --no-headers=true failed. Retrying in 5 seconds...
Oct 02 20:19:26 testk-bootstrap bootkube.sh[21833]: Trying to pull registry.svc.ci.openshift.org/openshift/origin-v4.0-20181002182152@sha256:d2c602936193f81a808d477b065c01dab18d15914b2ec71c380f310d4a44b441...Failed
Oct 02 20:19:26 testk-bootstrap bootkube.sh[21833]: unable to find image: unable to pull registry.svc.ci.openshift.org/openshift/origin-v4.0-20181002182152@sha256:d2c602936193f81a808d477b065c01dab18d15914b2ec71c380f310d4a44b441
Oct 02 20:19:26 testk-bootstrap systemd[1]: bootkube.service: main process exited, code=exited, status=125/n/a
Oct 02 20:19:26 testk-bootstrap systemd[1]: Unit bootkube.service entered failed state.
Oct 02 20:19:26 testk-bootstrap systemd[1]: bootkube.service failed.
Oct 02 20:19:31 testk-bootstrap tectonic.sh[888]: kubectl --namespace kube-system get pods --output custom-columns=STATUS:.status.phase,NAME:.metadata.name --no-headers=true failed. Retrying in 5 seconds...
Oct 02 20:19:31 testk-bootstrap systemd[1]: bootkube.service holdoff time over, scheduling restar
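
The excerpt above repeats the same failure on every restart of bootkube.service. As a rough aid for reading a longer journal dump, the following sketch counts how often each image reference shows up in an "unable to pull" line. The function name failed_pulls and the regular expression are illustrative assumptions, not part of the installer, and the pattern is only matched against the message format seen in this log.

```python
import re
from collections import Counter

# Assumption: failures look like "... unable to pull <image-reference>",
# as in the bootkube.sh lines quoted in this issue.
PULL_FAILURE = re.compile(r"unable to pull (\S+)")


def failed_pulls(journal_text):
    """Count 'unable to pull' lines per image reference in a journal dump."""
    counts = Counter()
    for line in journal_text.splitlines():
        match = PULL_FAILURE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the saved output of journalctl for the bootstrap node should show whether a single image is responsible for every retry, which is what the excerpt here suggests.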



@crawford
Contributor

crawford commented Oct 2, 2018

It looks like your cluster isn't able to fetch images from registry.svc.ci.openshift.org. Are you sure your pull secret is valid?
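
One quick local check, before retrying, is whether the pull secret even contains an entry for that registry. The sketch below assumes the pull secret is in the usual Docker config JSON layout (a top-level "auths" object keyed by registry host, each with a base64 "auth" field holding user:password). The helper name has_registry_auth is hypothetical. It only validates the structure; it cannot tell whether the registry will actually accept the credential.

```python
import base64
import json

# Registry host taken from the failing pull in the log above.
REGISTRY = "registry.svc.ci.openshift.org"


def has_registry_auth(pull_secret_text, registry=REGISTRY):
    """Return True if the pull secret holds a decodable user:password for registry."""
    try:
        secret = json.loads(pull_secret_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(secret, dict):
        return False
    auths = secret.get("auths")
    if not isinstance(auths, dict):
        return False
    entry = auths.get(registry)
    if not isinstance(entry, dict):
        return False
    auth = entry.get("auth", "")
    if not isinstance(auth, str):
        return False
    try:
        # validate=True rejects characters outside the base64 alphabet.
        decoded = base64.b64decode(auth, validate=True).decode("utf-8")
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return False
    user, sep, password = decoded.partition(":")
    return bool(user and sep and password)
```

If this returns False for the secret you passed to the installer, the entry for the CI registry is missing or malformed. If it returns True, the secret may still be expired or lack access, so the pull failure has to be checked against the registry itself.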

@karmab
Contributor Author

karmab commented Oct 2, 2018

Yep, the pull secret is valid. I was able to get a little further by using a more recent version of RHCOS, and now the cluster fails while bootstrapping etcd.

@karmab
Contributor Author

karmab commented Oct 2, 2018

current issue:

Oct 02 22:00:14 testk-bootstrap bootkube.sh[794]: Copying blob sha256:c15c14574a0bc94fb65cb906baae5debd103dd02991f3449adaa639441b7dde4
Oct 02 22:00:15 testk-bootstrap bootkube.sh[794]: [31B blob data]
Oct 02 22:00:15 testk-bootstrap bootkube.sh[794]: Skipping fetch of repeat blob sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
Oct 02 22:00:15 testk-bootstrap bootkube.sh[794]: Skipping fetch of repeat blob sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
Oct 02 22:00:15 testk-bootstrap bootkube.sh[794]: Writing manifest to image destination
Oct 02 22:00:15 testk-bootstrap bootkube.sh[794]: Storing signatures
Oct 02 22:10:17 testk-bootstrap bootkube.sh[794]: https://testk-etcd-0.tt.testing:2379 is unhealthy: failed to connect: dial tcp 192.168.124.11:2379: getsockopt: connection refused
Oct 02 22:10:17 testk-bootstrap bootkube.sh[794]: Error:  unhealthy cluster
Oct 02 22:10:17 testk-bootstrap bootkube.sh[794]: etcdctl failed. Retrying in 5 seconds...
Oct 02 22:20:23 testk-bootstrap bootkube.sh[794]: https://testk-etcd-0.tt.testing:2379 is unhealthy: failed to connect: dial tcp 192.168.124.11:2379: getsockopt: connection refused
Oct 02 22:20:23 testk-bootstrap bootkube.sh[794]: Error:  unhealthy cluster
Oct 02 22:20:23 testk-bootstrap bootkube.sh[794]: etcdctl failed. Retrying in 5 seconds...
Oct 02 22:30:29 testk-bootstrap bootkube.sh[794]: https://testk-etcd-0.tt.testing:2379 is unhealthy: failed to connect: dial tcp 192.168.124.11:2379: getsockopt: connection refused
Oct 02 22:30:29 testk-bootstrap bootkube.sh[794]: Error:  unhealthy cluster
Oct 02 22:30:29 testk-bootstrap bootkube.sh[794]: etcdctl failed. Retrying in 5 seconds...
Oct 02 22:40:34 testk-bootstrap bootkube.sh[794]: https://testk-etcd-0.tt.testing:2379 is unhealthy: failed to connect: dial tcp 192.168.124.11:2379: getsockopt: connection refused
Oct 02 22:40:34 testk-bootstrap bootkube.sh[794]: Error:  unhealthy cluster
Oct 02 22:40:34 testk-bootstrap bootkube.sh[794]: etcdctl failed. Retrying in 5 seconds...
Oct 02 22:50:40 testk-bootstrap bootkube.sh[794]: https://testk-etcd-0.tt.testing:2379 is unhealthy: failed to connect: dial tcp 192.168.124.11:2379: getsockopt: connection refused
Oct 02 22:50:40 testk-bootstrap bootkube.sh[794]: Error:  unhealthy cluster
Oct 02 22:50:40 testk-bootstrap bootkube.sh[794]: etcdctl failed. Retrying in 5 seconds...
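
"connection refused" on 192.168.124.11:2379 generally means the host is reachable but nothing is accepting connections on that port. A minimal way to confirm that from the bootstrap node, independent of etcdctl and its TLS setup, is a plain TCP probe. The function name port_is_open is an illustrative assumption; the address and port are the ones from the log above.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling port_is_open("192.168.124.11", 2379) in a loop while the master boots would show whether etcd ever starts listening. This only tests the TCP listener; it says nothing about etcd cluster health once the port is open.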

@crawford
Contributor

crawford commented Oct 2, 2018

We think this may be fixed by openshift/machine-config-operator#106 (root cause being opencontainers/runc#1807). If you wait for the image to replicate before trying again, you should have more luck.

@karmab
Contributor Author

karmab commented Oct 2, 2018

thanks!

@karmab
Contributor Author

karmab commented Oct 3, 2018

Fixed with a more recent version.

@karmab karmab closed this as completed Oct 3, 2018