[aws] pulling ocp-release images includes @sha256@sha256 on latest installer 0.9.1 causing installation break #1066
Dup of #933 and #1032. We're just waiting on a newer RHCOS with Podman 1.0 and containers/podman#2106. /close
@wking: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Did you comment in #933?
Destroy it and launch a new cluster. I dunno why some folks see this more than others, but you can apply #1032 locally if you get tired of it and can't wait for an RHCOS with Podman 1.0 and its fix.
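For context, the failure mode in the title is a release pull spec with a doubled digest marker (`@sha256@sha256:`). A minimal local normalization of such a spec could look like the sketch below; this is an illustration of the bug pattern only, not the actual content of #1032 or the Podman fix, and `fix_pullspec` is a hypothetical helper name.

```shell
# Illustration of the bug pattern: the release pull spec comes out as
#   quay.io/...@sha256@sha256:<digest>
# instead of a single "@sha256:". This helper collapses the duplicate.
# (Hypothetical helper; not the actual patch from #1032.)
fix_pullspec() {
  printf '%s\n' "$1" | sed 's/@sha256@sha256:/@sha256:/'
}

fix_pullspec "quay.io/openshift-release-dev/ocp-v4.0@sha256@sha256:8e6fdc3f01"
# prints quay.io/openshift-release-dev/ocp-v4.0@sha256:8e6fdc3f01
```

A spec that is already well-formed passes through unchanged, so the helper is safe to apply unconditionally.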
@wking I replied on the other issue, and went ahead and made the changes to bootkube.sh on the bootstrap node. I can see the service getting restarted:

[root@ip-10-0-1-144 ~]# systemctl status bootkube.service
● bootkube.service - Bootstrap a Kubernetes cluster
Loaded: loaded (/etc/systemd/system/bootkube.service; static; vendor preset: disabled)
Active: active (running) since Wed 2019-01-16 09:13:40 UTC; 2min 10s ago
Main PID: 15506 (bash)
Memory: 17.7M
CGroup: /system.slice/bootkube.service
├─15506 bash /usr/local/bin/bootkube.sh
└─17231 podman run --rm --volume /var/opt/openshift:/assets:z --volume /etc/kubernetes:/etc/kubernetes:z --network=host quay.io/openshift-release-dev/ocp-v4.0@sha256:8e6fdc3f01...
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/openshift-service-signer-secret.yaml Secret openshift-service-cert-signer/service-serving-ce...eady exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/pull.json Secret kube-system/coreos-pull-secret: secrets "coreos-pull-secret" already exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/secret-aggregator-client.yaml Secret openshift-kube-apiserver/aggregator-client: secrets "ag...eady exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/secret-cluster-signing-ca.yaml Secret openshift-kube-controller-manager/cluster-signing-ca: ...eady exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/secret-kubeconfig.yaml Secret openshift-kube-controller-manager/controller-manager-kubeconfi...eady exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/secret-kubelet-client.yaml Secret openshift-kube-apiserver/kubelet-client: secrets "kubelet-...eady exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/secret-service-account-private-key.yaml Secret openshift-kube-controller-manager/service-acc...eady exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/secret-serving-cert.yaml Secret openshift-kube-apiserver/serving-cert: secrets "serving-cert...eady exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: NOTE: Bootkube failed to create some cluster assets. It is important that manifest errors are resolved and resubmitted to the apiserver.
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: For example, after resolving issues: kubectl create -f <failed-manifest>
Hint: Some lines were ellipsized, use -l to show in full.
[root@ip-10-0-1-144 ~]# journalctl -b -f -u bootkube.service
-- Logs begin at Tue 2019-01-15 10:06:43 UTC. --
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/openshift-service-signer-secret.yaml Secret openshift-service-cert-signer/service-serving-cert-signer-signing-key: secrets "service-serving-cert-signer-signing-key" already exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/pull.json Secret kube-system/coreos-pull-secret: secrets "coreos-pull-secret" already exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/secret-aggregator-client.yaml Secret openshift-kube-apiserver/aggregator-client: secrets "aggregator-client" already exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/secret-cluster-signing-ca.yaml Secret openshift-kube-controller-manager/cluster-signing-ca: secrets "cluster-signing-ca" already exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/secret-kubeconfig.yaml Secret openshift-kube-controller-manager/controller-manager-kubeconfig: secrets "controller-manager-kubeconfig" already exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/secret-kubelet-client.yaml Secret openshift-kube-apiserver/kubelet-client: secrets "kubelet-client" already exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/secret-service-account-private-key.yaml Secret openshift-kube-controller-manager/service-account-private-key: secrets "service-account-private-key" already exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: Failed creating /assets/manifests/secret-serving-cert.yaml Secret openshift-kube-apiserver/serving-cert: secrets "serving-cert" already exists
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: NOTE: Bootkube failed to create some cluster assets. It is important that manifest errors are resolved and resubmitted to the apiserver.
Jan 16 09:14:06 ip-10-0-1-144 bootkube.sh[15506]: For example, after resolving issues: kubectl create -f <failed-manifest>
But how do I proceed now? Do I need to run the installer again in the same directory, or just go on the master and check the status of the operators? What do you suggest?
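For anyone who does try the log's own suggestion (`kubectl create -f <failed-manifest>`) rather than rebuilding, here is a sketch of generating the resubmission commands. The `/var/opt/openshift` path is taken from the volume mount in the podman command shown in the service status above; the helper name and the print-first behavior are my own assumptions.

```shell
# Hedged sketch: print a "kubectl create -f" command for each manifest that
# bootkube reported as failed. /var/opt/openshift is the host path mounted
# as /assets in the podman invocation above; adjust if yours differs.
ASSETS=${ASSETS:-/var/opt/openshift/manifests}
resubmit_cmds() {
  for m in "$1"/secret-*.yaml "$1"/pull.json; do
    [ -e "$m" ] || continue            # skip patterns that matched nothing
    printf 'kubectl create -f %s\n' "$m"
  done
}
resubmit_cmds "$ASSETS"   # review the output, then pipe to sh to run it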
These "already exists" errors mean an earlier round of bootkube already pushed those resources.
You should clean up and start over from scratch. The bootkube service will never recover from this situation, and it's likely that the cluster is missing important resources. You could probably force things through manually, but it's less work to just launch a fresh cluster.
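The "start over from scratch" path can be sketched as below. The cluster directory name is an illustrative assumption, and the `DRY_RUN` guard (on by default here) only prints the commands so nothing is destroyed by accident.

```shell
# Hedged sketch of a full teardown and fresh install with openshift-install.
# The directory name is illustrative; unset DRY_RUN to run for real.
DIR=${DIR:-./mycluster}
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then echo "$@"; else "$@"; fi
}
run openshift-install destroy cluster --dir="$DIR"
run rm -rf "$DIR"    # stale assets carry old certs and secrets
run openshift-install create cluster --dir="$DIR"
```

Removing the asset directory matters: reusing it would regenerate the cluster with the same stale certificates and secrets that caused the "already exists" failures.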
Version
Platform (aws|libvirt|openstack):
aws
What happened?
Hey guys, I am trying to install one master and three worker nodes for OCP 4.0 on AWS. I can see the bootstrap node and the master nodes on the AWS console.
On checking the openshift-install.log file I can see it failing here. Checking the bootstrap node, I could see bootkube.service in a failed state.
I can see the ocp-release image getting pulled, but the @sha256 keyword is repeated in it. Checking the /usr/local/bin/bootkube.sh file, I cannot see any reference to the sha256 value, only the image tag, which works fine when pulled manually.
How can I continue the installation? Do I have to destroy the cluster and rebuild it?
Let me know if you are looking for more logs.
I can see this issue already reported in #2086, but this is happening with the latest installer as well.
What you expected to happen?
How to reproduce it (as minimally and precisely as possible)?
$ your-commands-here
References