Rework build process to generate `rhel-coreos-base` distinct from `ocp-rhel-coreos` #799

cgwalters · 2022-05-11T13:48:54Z

Reworking RHEL CoreOS to be more like OKD and towards quay.io/openshift/node-base:rhel10

This pre-enhancement originated in this GitHub issue.

A foundational decision early on in OpenShift 4 was to create RHEL CoreOS. Key
aspects of this were:

kubelet would not be containerized (negative experience with "system containers")
More crucially, we wanted to ship a tested combination of operating system and cluster
Also, the operating system updates should come in a container image

We're several years in now, and have learned a lot. This proposal calls for
reworking how we build things, but will avoid changing these key aspects.

Rework RHCOS disk images to not have OCP content

When we speak of RHEL CoreOS, there are two independent things at play:

disk images (AMI, qcow2, ISO, etc.)
OS update container

In this base proposal, the disk images shift to only RHEL content.

kubelet will not be in the AMI.
The version will change to something of the form $rhel.$datestamp, e.g. 9.2.20220510.1

Additionally, there will be a new container image called rhel-coreos-base that
will exactly match this.

These disk images will generally only be updated at the GA release of each RHEL, and will not contain security updates.
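
As a rough illustration of the `$rhel.$datestamp` version form described above, here is a minimal Python sketch that splits a string such as `9.2.20220510.1` into its parts. The field names, and the reading of the trailing `.1` as a respin counter, are assumptions made for illustration; the proposal itself only gives the overall shape and the one example.

```python
import re
from datetime import date
from typing import NamedTuple

# Assumed layout: <rhel major>.<rhel minor>.<YYYYMMDD>.<respin>
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d{8})\.(\d+)$")


class BaseVersion(NamedTuple):
    rhel_major: int
    rhel_minor: int
    datestamp: date
    respin: int  # hypothetical name for the trailing counter


def parse_base_version(text: str) -> BaseVersion:
    """Parse a version of the form $rhel.$datestamp, e.g. 9.2.20220510.1."""
    m = VERSION_RE.match(text)
    if m is None:
        raise ValueError(f"not a $rhel.$datestamp version: {text!r}")
    major, minor, stamp, respin = m.groups()
    # date() raises ValueError for impossible dates such as month 13.
    day = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))
    return BaseVersion(int(major), int(minor), day, int(respin))


v = parse_base_version("9.2.20220510.1")
print(v.rhel_major, v.rhel_minor, v.datestamp.isoformat(), v.respin)
```

Because the RHEL version leads the tuple, comparing two parsed values orders them by RHEL release first and by date second, which fits the idea that these images track RHEL rather than OCP releases.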

In phase 0, openshift-installer will continue to have rhcos.json. Disk images will continue to be provided at e.g. mirror.openshift.com.

However, the disk images will be much more likely to be shared across OCP releases in a bit for bit fashion.

machine-os-content/rhel-coreos-9

The key change here is that OCP content, including kubelet, moves into a container
image that derives from this base image. One can imagine it as the following Containerfile:

FROM rhel-coreos-base
RUN rpm-ostree install openshift-hyperkube

This is in fact currently done for OKD.

flowchart TD
    rpms[RHEL rpms] --> base[quay.io/openshift/rhel-coreos-base:9]-- Add kubelet, crio, openvswitch --> ocpnode[quay.io/openshift/rhel-coreos:9]

In phase 0, this new image will likely be built by the current CoreOS pipeline.

installer changes to always rebase/pivot from the disk image

Because OCP has not usually respun disk images for releases, at a technical level nodes always do an in-place OS update before kubelet starts.

In this new model, this is now also the time when kubelet gets installed.

The only exception to this today for OCP is the bootstrap node. The bootstrap node would switch to also doing an in-place update to the desired node image. This is how OKD works today.

flowchart LR
    installer[openshift-install] -->boot[RHEL base CoreOS disk image]-- pull quay.io/openshift/node:rhel10+reboot -->node[OCP node]

Phase 1 followups

Consider the above as a "phase 0" - a minimum set of changes to achieve a significant improvement without breaking things.

Create https://gitlab.com/redhat/coreos/base.git

A while ago, we created github.com/openshift/os to be the source of truth for RHCOS. But after phase 0 is done, conceptually there's nothing OCP specific about this. In order to align with RHEL, we could move into the https://gitlab.com/redhat project.

Images built with (or just mirroring) C9S composes

We can start producing images that exactly match a C9S compose, including mirroring version numbers.

github.com/openshift/node

It would make a huge amount of sense to also move the base systemd unit file into what is currently called rhel-coreos. The systemd unit currently lives in the MCO.

If we do the above gitlab/coreos/base.git change first, then this git repository could instead change to become openshift/node, and the systemd unit would perhaps live here (but maybe it should really be part of the RPM?)

Then, a next major step is to have this node image built the same way as any other OCP platform image, via Prow for CI and OSBS for production builds. This would significantly simplify the current RHCOS pipeline and make it much clearer that it should align with RHEL lifecycles and technologies.

This may be a significant enough change on its own to call for renaming the OS image in the payload (yes, again) to just node, de-emphasizing "coreos".

openshift-bot · 2022-08-10T01:01:04Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

travier · 2022-08-11T11:29:21Z

/remove-lifecycle stale

openshift-bot · 2022-11-10T01:00:41Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

sdodson · 2022-11-11T14:46:17Z

/lifecycle frozen

LorbusChris · 2023-02-28T20:55:25Z

We're looking at this from the OKD/SCOS side too.

The RPMs that I'd at a minimum like to split out from the base OS into a layer that is versioned together with the rest of the OpenShift codebase are the following:

conmon-rs
cri-o
cri-tools
openshift-clients
openshift-hyperkube
openvswitch (and NetworkManager-ovs)

cgwalters · 2023-02-28T23:19:47Z

I love the idea of trying this out first in OKD. We'd need to bikeshed implementation strategy...a whole lot involved in either path of coreos-assembler or ignoring coreos-assembler and going full native container builds via Dockerfile or a middle ground of trying to implement rpm-ostree compose image --from.

LorbusChris · 2023-02-28T23:56:15Z

To add to the bikeshedding, my first thought was that rpm-ostree install --from-manifest would be handy for this. It would be run during the container build and consume a manifest from os.

I think we'll also need to update the builds metadata with the layered container image artifact build, very similar to what cosa build-extensions does today, something like a cosa build-derive as a container build wrapper.

cgwalters · 2023-03-14T12:06:41Z

One giant benefit of this is that it immediately becomes much clearer how OpenShift can inject other code written in a compiled language into the host system. For example the code to manage the primary NIC via OVS is crying out to be...not bash.

The MCD today has a hack to copy itself to the host, which only dubiously works with skew between host and container userspace.

Basically the status quo makes no sense at all, where we embed a kubelet binary but inject all this other shell script and other logic. With this split, all that stuff would be consistently in a separate container image layer.

cgwalters · 2023-04-11T00:39:55Z

OK, I've updated the initial description in this issue with a more fleshed-out description. Feedback appreciated!

cgwalters · 2023-04-12T16:58:26Z

One interesting example here is the SSH password bug.

If we'd already had this split, I think the change there would have landed in github.com/openshift/node - not in gitlab.com/redhat/coreos. We suddenly have a way to clearly distinguish the "stuff done for openshift nodes" versus "bootable rhel".

LorbusChris · 2023-09-15T23:40:36Z

This is strongly related to okd-project/okd-coreos-pipeline#46, which will split SCOS into a base and an OKD layer.

jlebon · 2024-02-16T22:45:15Z

/assign jlebon
/label jira

openshift-ci · 2024-02-16T22:45:19Z

@jlebon: The label(s) /label jira cannot be applied. These labels are supported: acknowledge-critical-fixes-only, platform/aws, platform/azure, platform/baremetal, platform/google, platform/libvirt, platform/openstack, ga, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, px-approved, docs-approved, qe-approved, no-qe, downstream-change-needed, rebase/manual, cluster-config-api-changed, approved, backport-risk-assessed, bugzilla/valid-bug, cherry-pick-approved, jira/valid-bug, staff-eng-approved. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?

In response to this:

/assign jlebon
/label jira

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

The Prow CI we have in those repos is extremely slow and annoying to maintain. We're still going to need it for now to at least build RHCOS with actual RHEL RPMs, but at least for CentOS Stream we should be able to build that fine in CoreOS CI. (We don't have access to the OCP RPMs, but with openshift/os#799, we'll move those out of the base compose anyway.)

As prep for openshift#799, let's better split the postprocessing steps that are related to OCP from those that have tighter binding to RHEL proper. This should have almost no functional effect. One visible difference is in the `/etc/motd` we write which before hardcoded e.g. RHCOS and CentOS Stream in the prose text, but is now a little more generic.

As prep for openshift#799, let's better split the postprocessing steps that are related to OCP from those that have tighter binding to RHEL proper. This should have no visible effect.

As part of openshift/os#799, we'll want to be able to run tests against the layered image. We want to be able to do that by pointing at the local file instead of having to push it to a registry first (which in the pipeline usually happens at the end).

As part of openshift/os#799, we'll want to build the "OCP node" image as a layered image on top of the RHCOS base image. Eventually, this image should be built outside our pipelines and more like the rest of OpenShift container images. But for now, let's build it ourselves. This allows us to prove out the idea without yet requiring changes in the rest of OpenShift. The script added here looks wordy, but it's really trivial. It's basically a glorified wrapper around `podman build` and `skopeo copy` so that the built OCI image ends up in our `meta.json`.

This is a first stab at openshift#799, aimed at the c9s variant to start. In this model, the base (container and disk) images we build in the pipeline do not contain any OCP-specific details. The compose is made up purely of RPMs coming out directly from the c9s pungi composes. Let's go over details of this in bullet form:

1. To emphasize the binding to c9s composes, we change the versioning scheme: the version string is now *exactly* the same version as the pungi compose from which we've built (well, we do add a `.N` field because we want to be able to rebuild multiple times on top of the same base pungi compose). It's almost as if our builds were part of the c9s pungi composes directly. (And maybe one day they will be...) This is implemented using a `versionary` script that queries compose info.
2. We no longer include `packages-openshift.yaml`: this has all the OCP stuff that we want to do in a layered build instead.
3. We no longer completely rewrite `/etc/os-release`. The host *is* image-mode CentOS Stream and e.g. `ID` will now say `centos`. However, we do still inject `VARIANT` and `VARIANT_ID` fields to note that it's of the CoreOS kind. We should probably actually match FCOS here and properly add a CoreOS variant in the `centos-release` package.
4. Tests which have to do with the OpenShift layer now have the required tag `openshift`. This means that they'll no longer run in the default set of kola tests. When building the derived image, we will run just those tests using `kola run --tag openshift --oscontainer ...`.

Note that to make this work, OCP itself still needs to actually have that derived image containing the OCP bits. For now, we will build this in the pipelines (as a separate artifact that we push to the repos) but the eventual goal is that we'd split that out of the pipeline and have it be more like how the rest of OCP is built (using Prow/OSBS/Konflux).

Note also we don't currently build the c9s variant in the pipelines, but this is long overdue IMO.
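
To make the `.N` rebuild field from the versioning scheme above concrete, here is a minimal Python sketch of how a next build version could be derived from a compose version and the list of builds that already exist. This is not the actual `versionary` script. The function name, the example version strings, and the choice to start counting at 0 are all assumptions for illustration.

```python
from typing import Iterable


def next_build_version(compose_version: str, existing: Iterable[str]) -> str:
    """Append a rebuild counter .N to the compose version.

    N is one more than the highest counter already used for this exact
    compose version, or 0 if there are no prior builds on top of it
    (starting at 0 is an assumption).
    """
    prefix = compose_version + "."
    taken = []
    for version in existing:
        if version.startswith(prefix):
            suffix = version[len(prefix):]
            # Only a purely numeric suffix counts as a rebuild counter.
            if suffix.isdigit():
                taken.append(int(suffix))
    n = max(taken) + 1 if taken else 0
    return f"{prefix}{n}"


# Hypothetical version strings, for illustration only.
builds = ["9.0.20240415.3", "9.0.20240422.0", "9.0.20240422.1"]
print(next_build_version("9.0.20240422", builds))
print(next_build_version("9.0.20240429", builds))
```

Keeping the compose version as an untouched prefix preserves the property the commit message emphasizes: stripping the last field of any build version gives back the compose it was built from.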

jlebon · 2024-04-22T18:30:47Z

One tricky bit here worth highlighting is coreos/coreos-assembler@e174d84 (#3784).

Quoting from there:

And this actually highlights one of the gotchas of the layered OCP work
we'll have to keep an eye out for: it will not work to enable services
that only exist in the layered image via Ignition, because Ignition uses
presets for enablement and presets are only applied on first boot.

I think in general the way around this is to statically enable the
systemd unit instead as part of the layered image. But it just so
happens that in this particular case, we don't really need to do this
for crio.service anyway because even in OCP, it's not enabled via
Ignition.

I think openvswitch.service also falls in this bucket. It lives in the OCP layer only, is disabled by default, but is pulled in by ovs-configuration.service. (Though the MCO also explicitly enables it in some configurations, but that's redundant I think.)

This repo is really confusing to work with because of all the various tiers of variants we have. In practice, our production pipelines always specify a concrete variant to build because the switchover between e.g. 9.2 and 9.4 happens on the ART side, not RHCOS side. And even in CI, since the script that gets called by Prow lives here, we can easily control which concrete variant gets built. So overall, we don't gain much from trying to have symbolic versionless variants, but it adds cognitive overhead trying to understand it all. This patch greatly simplifies things by getting rid of the `scos` and `rhel-coreos-9` variants. Now, we *only* have concrete variants. Document them in the README. The only symbolic links left are the canonical variantless ones, which determine the default variant that gets built if no `--variant` switch is passed to `cosa init`. This is also prep for openshift#799, which will add more concrete variants that do not bake in the OpenShift components.

jlebon · 2024-05-06T21:13:42Z

Another tricky/hacky bit: see the first commit in #1503, where we have to do some dance to make usermod -a -G work in the container derivation flow. The TL;DR there is that nss-altfiles issues like coreos/rpm-ostree#1318 are just as relevant in the container flow as they were on the host.

We should probably add a line or two about usermod -a -G somewhere in https://containers.github.io/bootc/building/users-and-groups.html#nss-altfiles.

This Containerfile allows us to build the OpenShift node image on top of the base RHCOS/SCOS image (i.e. built from the `c9s` or `rhel-9.4` image). Currently, the resulting image is at parity with the base image you'd get from building the `okd-c9s` or `ocp-rhel-9.4` variant. In the future, those variants will go away and this will become the only way to build the node image. Part of: openshift#799

jlebon · 2024-06-07T15:36:37Z

Initial work for this landed in #1445.

This is now an OpenShift enhancement: openshift/enhancements#1637. Let's track this there instead.

This was referenced May 11, 2022

proposal: generate "base rhel" container image, build OCP on top #498

Closed

manifests: Add RHEL 9.0 based RHCOS and SCOS #773

Closed

openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 10, 2022

openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 11, 2022

cgwalters mentioned this issue Oct 27, 2022

c9s: Add distribution-gpg-keys rpm from EPEL #1032

Closed

openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 10, 2022

openshift-ci bot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 11, 2022

This was referenced Mar 22, 2023

Block RHCOS gcp-routes service on both masters and workers openshift/machine-config-operator#3619

Merged

OCPBUGS-9982: bootstrap-pivot: skip pivot in SCOS Live ISO openshift/installer#6965

Merged

jlebon mentioned this issue Feb 16, 2024

cmd-build: add --versionary switch coreos/coreos-assembler#3735

Merged

openshift-ci bot assigned jlebon Feb 16, 2024

jlebon added the jira label Feb 16, 2024

jlebon mentioned this issue Feb 20, 2024

jobs/seed-github-ci: add openshift/os coreos/coreos-ci#60

Merged

jlebon mentioned this issue Mar 11, 2024

Add cosa buildextend-layered coreos/coreos-assembler#3753

Closed

jlebon mentioned this issue Apr 22, 2024

Various test tweaks to make compatible with --oscontainer coreos/coreos-assembler#3784

Merged

jlebon mentioned this issue May 6, 2024

NO-JIRA: variants: simplify #1502

Merged

jlebon mentioned this issue May 9, 2024

ext.config.systemd.journal-compat failing on SCOS in Prow #1505

Closed

jlebon closed this as completed Jun 7, 2024
