4.16.0-okd-scos.1 Single node installation not working properly #27

LennertMertens · 2024-11-25T10:12:22Z

When installing a single-node OKD cluster, the installation does not proceed properly because the oc command seems to be missing from the image. I manually installed the oc binary, but when the installation continues, it reports an unhealthy cluster state.

Nov 25 10:02:57 control-plane podman[3607]: 2024-11-25 10:02:57.672576541 +0000 UTC m=+0.030907715 image pull ac3f3dae3c06d7338ff9481caab6fa7f958b52094f1825c8d4da80d47d16876f quay.io/okd/scos-release@sha256:06ffff6c6951046d03df0784bc18132c368a84fe72bcfb529484a58872c3a2e1
Nov 25 10:02:57 control-plane podman[3634]: 2024-11-25 10:02:57.818192064 +0000 UTC m=+0.049742304 container remove 5789219a171912cd0e2d116fff3f8dc10cf38205284509f2672877dbc9a8d7b4 (image=quay.io/okd/scos-release@sha256:06ffff6c6951046d03df0784bc18132c368a84fe72bcfb529484a58872c3a2e1, name=competent_leakey, io.openshift.release=4.16.0-okd-scos.1, io.openshift.release.base-image-digest=sha256:7d8d6875c9e8c9aa0eab546f354b92555a6c7621393a1ea98da4ecbf29e263e3)
Nov 25 10:02:57 control-plane bootkube.sh[2944]: Moving OpenShift manifests in with the rest of them
Nov 25 10:02:57 control-plane bootkube.sh[3666]: /usr/local/bin/bootkube.sh: line 81: oc: command not found
Nov 25 10:02:57 control-plane systemd[1]: bootkube.service: Main process exited, code=exited, status=127/n/a
Nov 25 10:02:57 control-plane systemd[1]: bootkube.service: Failed with result 'exit-code'.
Nov 25 10:02:57 control-plane systemd[1]: bootkube.service: Consumed 2.572s CPU time.
Nov 25 10:03:03 control-plane systemd[1]: bootkube.service: Scheduled restart job, restart counter is at 1.
Nov 25 10:03:05 control-plane systemd[1]: Started bootkube.service - Bootstrap a Kubernetes cluster.


bshephar · 2024-11-26T11:27:36Z

This usually only happens until the node pulls down the rpm-ostree image for the version you're installing, since we start the bootstrap with Fedora CoreOS and then use rpm-ostree to rebase onto CentOS Stream CoreOS (SCOS). After that, the installation completes. So typically this error is only transient: once the node has rebooted into the SCOS image, the tools will be available.

I think we would need the must-gather from that node to definitively say whether or not this is the case in your environment.
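
As a lighter-weight check before collecting a must-gather, the bootkube.service journal itself can hint at whether the failure is transient. The following is a minimal sketch, not part of any OKD tooling: the function name `summarize_bootkube` is hypothetical, and the regular expressions assume the exact line shapes quoted in this issue, which other systemd or bootkube.sh versions may word differently.

```python
import re

# Patterns match the journal line shapes quoted in this issue (assumption):
#   bootkube.sh[3666]: /usr/local/bin/bootkube.sh: line 81: oc: command not found
#   bootkube.service: Scheduled restart job, restart counter is at 1.
MISSING_CMD = re.compile(
    r"bootkube\.sh\[\d+\]: .*: line \d+: (\S+): command not found"
)
RESTART = re.compile(
    r"bootkube\.service: Scheduled restart job, restart counter is at (\d+)\."
)


def summarize_bootkube(journal_text):
    """Return the commands reported missing and the highest restart counter."""
    missing = []
    max_restarts = 0
    for line in journal_text.splitlines():
        cmd = MISSING_CMD.search(line)
        if cmd:
            # Record every occurrence so repeated failures remain visible.
            missing.append(cmd.group(1))
            continue
        restart = RESTART.search(line)
        if restart:
            max_restarts = max(max_restarts, int(restart.group(1)))
    return {"missing_commands": missing, "max_restart_counter": max_restarts}
```

The text to pass in could be the output of `journalctl -u bootkube.service --no-pager` on the node. If the same command keeps being reported missing while the restart counter climbs, that would suggest the node has not yet rebooted into the SCOS image, though it is a hint rather than proof.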

BeardOverflow · 2024-11-28T21:52:55Z

@bshephar It is not a pulling error.

rpm-ostree does not work in certain scenarios, such as coreos/rpm-ostree#4547.

This is a blocking error because the node never pivots to SCOS.

It has been discussed at length here: okd-project/okd#2041

BeardOverflow mentioned this issue Nov 28, 2024

Unable to install SNO on baremetal: issue with bootkube.service and release-image-pivot.service #25

Closed
