Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFE: Allow rpm-ostree rebase (or rpm-ostree install --apply-live) on live iso #4547

Open
JM1 opened this issue Aug 24, 2023 · 2 comments
Open
Labels
kind/enhancement triaged This issue was triaged

Comments

@JM1
Copy link

JM1 commented Aug 24, 2023

Host system details

$> rpm-ostree status
State: idle
Deployments:
● fedora:fedora/x86_64/coreos/stable
                  Version: 38.20230609.3.0 (2023-06-26T21:56:57Z)
                   Commit: 248366c65732b30ae0dbd96be8b75db46f08f428f68254ee14ac52cb39f82240
             GPGSignature: Valid signature by 6A51BBABBA3D5467B6171221809A8D7CEB10B464

Expected vs actual behavior

Please allow to run rpm-ostree rebase on a system which has been launched with a Live ISO. For example, when FCOS is booted with a Live ISO both / and /system are mounted read-only. This causes rpm-ostree rebase ... to fail with:

error: Remounting /sysroot read-write: Permission denied

If rpm-ostree rebase cannot be implemented for live isos, another option might be to support rpm-ostree install --apply-live instead.

Rationale

OKD/FCOS uses FCOS as its bootimage, i.e. when booting cluster nodes the first time during installation. FCOS does not provide tools such as OpenShift Client (oc) or hyperkube which are used during single-node cluster installation at first boot (e.g. oc in bootkube.sh). RHCOS and SCOS include these tools, but FCOS has to pivot the root fs to okd-machine-os first in order to make those tools available. Pivoting uses rpm-ostree rebase but during SNO installation the node will be booted from a FCOS Live ISO where / and /sysroot are read-only. Thus rpm-ostree rebase fails and necessary tools for SNO installation will not be available, causing the setup to stall.

Allowing rpm-ostree install --apply-live on live isos would allow users to extend their live environment in the same way as traditional installations. Either to install necessary tools such as hyperkube in the OKD/FCOS SNO use case above or auxiliary tools such as fzf or interpreters.

@cgwalters
Copy link
Member

I think I'd say we should add e.g. --apply-live-only or so, which would actually be useful for non-ISO cases too.

We could also default to having --apply-live mean --apply-live-only on physically readonly systems.

@cgwalters cgwalters added triaged This issue was triaged kind/enhancement labels Aug 25, 2023
JM1 added a commit to JM1/fedora-coreos-config that referenced this issue Oct 3, 2023
Previously, karg coreos.liveiso.fromram would cause live-generator to
copy rootfs.img to a tmpfs and then mount it to /sysroot. Because
rootfs.img contains a squashfs, /sysroot will be mounted read-only,
preventing rpm-ostree operations such as install and rebase which are
required by OKD/FCOS [0].

Now, with karg coreos.liveiso.fromram (Live ISO) or coreos.live.\
fromram (PXE boot) the rootfs.img will be mounted to /isoroot. The
contents of /isoroot will be copied to /run/ephemeral and the latter
will be bind-\ mounted to /sysroot. Because /run/ephemeral is a
writeable xfs, both sysroot-etc.mount and sysroot-var.mount are not
required in this case.

For example, to rebase a FCOS/OKD bootimage first boot a Live ISO
with Fedora 39 from RAM and then rebase and soft-reboot [1] (requires
systemd v254) it with:

    rpm-ostree rebase fedora:fedora/x86_64/coreos/next
    rpm-ostree apply-live --allow-replacement
    systemctl soft-reboot

[0] coreos/rpm-ostree#4547
[1] https://www.freedesktop.org/software/systemd/man/systemd-soft-reboot.service.html
@JM1
Copy link
Author

JM1 commented Oct 3, 2023

It looks like rpm-ostree already implements all necessary features to support rebasing a Live ISO ☺️ This PR for fedora-coreos-config changes the way FCOS accesses contents of a Live ISO. Together with systemd's soft-reboot this allows to rebase a running system.

JM1 added a commit to JM1/openshift-installer that referenced this issue Oct 27, 2023
…d Installer

OKD/FCOS uses FCOS as its bootimage, i.e. when booting cluster nodes
the first time during installation. FCOS does not provide tools such
as OpenShift Client (oc) or crio.service which Agent-based Installer
uses at the rendezvous host, e.g. to launch the bootstrap control
plane.

RHCOS and SCOS include these tools, but FCOS has to pivot the root fs
[1] to okd-machine-os [2] first in order to make those tools available.

Pivoting uses 'rpm-ostree rebase' but the rendezvous host is booted
the first time the node boots from a FCOS Live ISO where the root fs
and /sysroot are mounted read-only. Thus 'rpm-ostree rebase' fails and
necessary tools will not be available, causing the setup to stall.

Until rpm-ostree has implemented support for rebasing Live ISOs [3],
this patch adapts the workaround for SNO installations [4] to also
support Agent-based Installer.

In particular, the Go conditional {{- if .BootstrapInPlace }} which
is used to mark a SNO install has been replaced with a shell if-else
which checks at runtime whether the system is launched from are on a
Live ISO.
Most code in the OpenShift ecosystem is written with RHCOS in mind
and often assumes that tools like oc or crio.service are available.
These assumptions can be satisfied by applying this workaround to all
Live ISO boots. It will not remove functionality or overwrite
configuration files in /etc and thus side effects should be minimal.

The Go conditional {{- if .BootstrapInPlace }} in the release-image-\
pivot.service has been dropped completely. This service is only used
in OKD only, so OCP will not be impacted at all. The 'Before=' option
will not cause systemd to fail if a service does not exist. So, in
case bootkube.service or kubelet.service do not exist, the option will
have no effect.
When bootkube.service or kubelet.service do exist, it must always be
ensured that release-image-pivot.service is started first because it
might reboot the system or change /usr in the Live ISO use case.
So it is safe to drop the Go conditional and ask systemd to always
launch release-image-pivot.service before bootkube.service and
kubelet.service.

[0] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootkube.sh.template
[1] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootstrap-pivot.sh.template
[2] https://github.com/openshift/okd-machine-os
[3] coreos/rpm-ostree#4547
[4] openshift#7445
JM1 added a commit to JM1/openshift-installer that referenced this issue Nov 28, 2023
…d Installer

OKD/FCOS uses FCOS as its bootimage, i.e. when booting cluster nodes
the first time during installation. FCOS does not provide tools such
as OpenShift Client (oc) or crio.service which Agent-based Installer
uses at the rendezvous host, e.g. to launch the bootstrap control
plane.

RHCOS and SCOS include these tools, but FCOS has to pivot the root fs
[1] to okd-machine-os [2] first in order to make those tools available.

Pivoting uses 'rpm-ostree rebase' but the rendezvous host is booted
the first time the node boots from a FCOS Live ISO where the root fs
and /sysroot are mounted read-only. Thus 'rpm-ostree rebase' fails and
necessary tools will not be available, causing the setup to stall.

Until rpm-ostree has implemented support for rebasing Live ISOs [3],
this patch adapts the workaround for SNO installations [4] to also
support Agent-based Installer.

In particular, the Go conditional {{- if .BootstrapInPlace }} which
is used to mark a SNO install has been replaced with a shell if-else
which checks at runtime whether the system is launched from are on a
Live ISO.
Most code in the OpenShift ecosystem is written with RHCOS in mind
and often assumes that tools like oc or crio.service are available.
These assumptions can be satisfied by applying this workaround to all
Live ISO boots. It will not remove functionality or overwrite
configuration files in /etc and thus side effects should be minimal.

The Go conditional {{- if .BootstrapInPlace }} in the release-image-\
pivot.service has been dropped completely. This service is only used
in OKD only, so OCP will not be impacted at all. The 'Before=' option
will not cause systemd to fail if a service does not exist. So, in
case bootkube.service or kubelet.service do not exist, the option will
have no effect.
When bootkube.service or kubelet.service do exist, it must always be
ensured that release-image-pivot.service is started first because it
might reboot the system or change /usr in the Live ISO use case.
So it is safe to drop the Go conditional and ask systemd to always
launch release-image-pivot.service before bootkube.service and
kubelet.service.

[0] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootkube.sh.template
[1] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootstrap-pivot.sh.template
[2] https://github.com/openshift/okd-machine-os
[3] coreos/rpm-ostree#4547
[4] openshift#7445
aleskandro pushed a commit to aleskandro/installer that referenced this issue Nov 30, 2023
…d Installer

OKD/FCOS uses FCOS as its bootimage, i.e. when booting cluster nodes
the first time during installation. FCOS does not provide tools such
as OpenShift Client (oc) or crio.service which Agent-based Installer
uses at the rendezvous host, e.g. to launch the bootstrap control
plane.

RHCOS and SCOS include these tools, but FCOS has to pivot the root fs
[1] to okd-machine-os [2] first in order to make those tools available.

Pivoting uses 'rpm-ostree rebase' but the rendezvous host is booted
the first time the node boots from a FCOS Live ISO where the root fs
and /sysroot are mounted read-only. Thus 'rpm-ostree rebase' fails and
necessary tools will not be available, causing the setup to stall.

Until rpm-ostree has implemented support for rebasing Live ISOs [3],
this patch adapts the workaround for SNO installations [4] to also
support Agent-based Installer.

In particular, the Go conditional {{- if .BootstrapInPlace }} which
is used to mark a SNO install has been replaced with a shell if-else
which checks at runtime whether the system is launched from are on a
Live ISO.
Most code in the OpenShift ecosystem is written with RHCOS in mind
and often assumes that tools like oc or crio.service are available.
These assumptions can be satisfied by applying this workaround to all
Live ISO boots. It will not remove functionality or overwrite
configuration files in /etc and thus side effects should be minimal.

The Go conditional {{- if .BootstrapInPlace }} in the release-image-\
pivot.service has been dropped completely. This service is only used
in OKD only, so OCP will not be impacted at all. The 'Before=' option
will not cause systemd to fail if a service does not exist. So, in
case bootkube.service or kubelet.service do not exist, the option will
have no effect.
When bootkube.service or kubelet.service do exist, it must always be
ensured that release-image-pivot.service is started first because it
might reboot the system or change /usr in the Live ISO use case.
So it is safe to drop the Go conditional and ask systemd to always
launch release-image-pivot.service before bootkube.service and
kubelet.service.

[0] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootkube.sh.template
[1] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootstrap-pivot.sh.template
[2] https://github.com/openshift/okd-machine-os
[3] coreos/rpm-ostree#4547
[4] openshift#7445
JM1 added a commit to JM1/openshift-installer that referenced this issue Nov 30, 2023
…d Installer

OKD/FCOS uses FCOS as its bootimage, i.e. when booting cluster nodes
the first time during installation. FCOS does not provide tools such
as OpenShift Client (oc) or crio.service which Agent-based Installer
uses at the rendezvous host, e.g. to launch the bootstrap control
plane.

RHCOS and SCOS include these tools, but FCOS has to pivot the root fs
[1] to okd-machine-os [2] first in order to make those tools available.

Pivoting uses 'rpm-ostree rebase' but the rendezvous host is booted
the first time the node boots from a FCOS Live ISO where the root fs
and /sysroot are mounted read-only. Thus 'rpm-ostree rebase' fails and
necessary tools will not be available, causing the setup to stall.

Until rpm-ostree has implemented support for rebasing Live ISOs [3],
this patch adapts the workaround for SNO installations [4] to also
support Agent-based Installer.

In particular, the Go conditional {{- if .BootstrapInPlace }} which
is used to mark a SNO install has been replaced with a shell if-else
which checks at runtime whether the system is launched from are on a
Live ISO.
Most code in the OpenShift ecosystem is written with RHCOS in mind
and often assumes that tools like oc or crio.service are available.
These assumptions can be satisfied by applying this workaround to all
Live ISO boots. It will not remove functionality or overwrite
configuration files in /etc and thus side effects should be minimal.

The Go conditional {{- if .BootstrapInPlace }} in the release-image-\
pivot.service has been dropped completely. This service is only used
in OKD only, so OCP will not be impacted at all. The 'Before=' option
will not cause systemd to fail if a service does not exist. So, in
case bootkube.service or kubelet.service do not exist, the option will
have no effect.
When bootkube.service or kubelet.service do exist, it must always be
ensured that release-image-pivot.service is started first because it
might reboot the system or change /usr in the Live ISO use case.
So it is safe to drop the Go conditional and ask systemd to always
launch release-image-pivot.service before bootkube.service and
kubelet.service.

[0] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootkube.sh.template
[1] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootstrap-pivot.sh.template
[2] https://github.com/openshift/okd-machine-os
[3] coreos/rpm-ostree#4547
[4] openshift#7445
openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/installer that referenced this issue Dec 12, 2023
…d Installer

OKD/FCOS uses FCOS as its bootimage, i.e. when booting cluster nodes
the first time during installation. FCOS does not provide tools such
as OpenShift Client (oc) or crio.service which Agent-based Installer
uses at the rendezvous host, e.g. to launch the bootstrap control
plane.

RHCOS and SCOS include these tools, but FCOS has to pivot the root fs
[1] to okd-machine-os [2] first in order to make those tools available.

Pivoting uses 'rpm-ostree rebase' but the rendezvous host is booted
the first time the node boots from a FCOS Live ISO where the root fs
and /sysroot are mounted read-only. Thus 'rpm-ostree rebase' fails and
necessary tools will not be available, causing the setup to stall.

Until rpm-ostree has implemented support for rebasing Live ISOs [3],
this patch adapts the workaround for SNO installations [4] to also
support Agent-based Installer.

In particular, the Go conditional {{- if .BootstrapInPlace }} which
is used to mark a SNO install has been replaced with a shell if-else
which checks at runtime whether the system is launched from are on a
Live ISO.
Most code in the OpenShift ecosystem is written with RHCOS in mind
and often assumes that tools like oc or crio.service are available.
These assumptions can be satisfied by applying this workaround to all
Live ISO boots. It will not remove functionality or overwrite
configuration files in /etc and thus side effects should be minimal.

The Go conditional {{- if .BootstrapInPlace }} in the release-image-\
pivot.service has been dropped completely. This service is only used
in OKD only, so OCP will not be impacted at all. The 'Before=' option
will not cause systemd to fail if a service does not exist. So, in
case bootkube.service or kubelet.service do not exist, the option will
have no effect.
When bootkube.service or kubelet.service do exist, it must always be
ensured that release-image-pivot.service is started first because it
might reboot the system or change /usr in the Live ISO use case.
So it is safe to drop the Go conditional and ask systemd to always
launch release-image-pivot.service before bootkube.service and
kubelet.service.

[0] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootkube.sh.template
[1] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootstrap-pivot.sh.template
[2] https://github.com/openshift/okd-machine-os
[3] coreos/rpm-ostree#4547
[4] openshift#7445
JM1 added a commit to JM1/openshift-installer that referenced this issue Jan 22, 2024
…d Installer

OKD/FCOS uses FCOS as its bootimage, i.e. when booting cluster nodes
the first time during installation. FCOS does not provide tools such
as OpenShift Client (oc) or crio.service which Agent-based Installer
uses at the rendezvous host, e.g. to launch the bootstrap control
plane.

RHCOS and SCOS include these tools, but FCOS has to pivot the root fs
[1] to okd-machine-os [2] first in order to make those tools available.

Pivoting uses 'rpm-ostree rebase' but the rendezvous host is booted
the first time the node boots from a FCOS Live ISO where the root fs
and /sysroot are mounted read-only. Thus 'rpm-ostree rebase' fails and
necessary tools will not be available, causing the setup to stall.

Until rpm-ostree has implemented support for rebasing Live ISOs [3],
this patch adapts the workaround for SNO installations [4] to also
support Agent-based Installer.

In particular, the Go conditional {{- if .BootstrapInPlace }} which
is used to mark a SNO install has been replaced with a shell if-else
which checks at runtime whether the system is launched from are on a
Live ISO.
Most code in the OpenShift ecosystem is written with RHCOS in mind
and often assumes that tools like oc or crio.service are available.
These assumptions can be satisfied by applying this workaround to all
Live ISO boots. It will not remove functionality or overwrite
configuration files in /etc and thus side effects should be minimal.

The Go conditional {{- if .BootstrapInPlace }} in the release-image-\
pivot.service has been dropped completely. This service is only used
in OKD only, so OCP will not be impacted at all. The 'Before=' option
will not cause systemd to fail if a service does not exist. So, in
case bootkube.service or kubelet.service do not exist, the option will
have no effect.
When bootkube.service or kubelet.service do exist, it must always be
ensured that release-image-pivot.service is started first because it
might reboot the system or change /usr in the Live ISO use case.
So it is safe to drop the Go conditional and ask systemd to always
launch release-image-pivot.service before bootkube.service and
kubelet.service.

[0] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootkube.sh.template
[1] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootstrap-pivot.sh.template
[2] https://github.com/openshift/okd-machine-os
[3] coreos/rpm-ostree#4547
[4] openshift#7445

(cherry picked from commit b2bbc85)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement triaged This issue was triaged
Projects
None yet
Development

No branches or pull requests

2 participants