baremetal: Add strategy for upgrading CoreOS-based deploy image #879

Status: Closed · 4 commits

File changed: `enhancements/baremetal/upgrade-coreos-deploy-image.md` (+268, −0)
---
title: upgrade-coreos-deploy-image
authors:
- "@zaneb"
reviewers:
- "@hardys"
- "@dtantsur"
- "@elfosardo"
- "@sadasu"
- "@kirankt"
- "@asalkeld"
- "@cgwalters"
- "@cybertron"
- "@dhellmann"
approvers:
- "@hardys"
- "@sadasu"
creation-date: 2021-08-24
last-updated: 2021-08-24
status: implementable
see-also:
- "/enhancements/coreos-bootimages.md"
---

# Upgrades of the CoreOS-based deploy image

## Release Signoff Checklist

- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Operational readiness criteria is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)

## Summary

To ensure that ironic-python-agent runs on top of an up-to-date OS, we will
update the CoreOS image URLs in the baremetal Provisioning CR to the latest
specified by the release metadata. For users running disconnected installs, we
will require them to make the latest versions available and block further
upgrades until they do so.

> **Reviewer comment (Member):** nit: Maybe a footnote explaining
> ironic-python-agent, in case somebody from the rest of OpenShift reads this
> proposal.

## Motivation

Currently, the deploy disk image (i.e. the image running IPA -
ironic-python-agent) is a RHEL kernel plus initrd that is installed (from an
RPM) into the `ironic-ipa-downloader` container image, which in turn is part of
the OpenShift release payload. When the metal3 Pod starts up, the disk image is
copied from the container to a HostPath volume whence it is available to
Ironic.

The provisioning OS disk image is a separate CoreOS QCOW2 image. The URL for
this is known by the installer. It points to the cloud by default and may be
customised by the user to allow disconnected installs. The URL is stored in the
Provisioning CR at install time and never updated automatically. The image
itself is downloaded once and permanently cached on all of the master nodes.
Never updating the image is tolerable because, upon booting, the CoreOS image
will update itself to the version matching the cluster it is to join. It
remains suboptimal because new Machines will take longer and longer (and more
and more bandwidth) to boot as the cluster ages, and also because support for
particular hardware may theoretically require a particular version of CoreOS.
(The former issue at least exists on all platforms, and this is the subject of
a [long-standing enhancement
proposal](https://github.com/openshift/enhancements/pull/201).)

> **Reviewer comment (Member):** nit: It's not a theoretical problem, and it's
> much worse for baremetal than for other platforms, because real hardware
> evolves much more quickly than virtual environments.

We want to change the deploy disk image to use CoreOS. This may take the form
of both an ISO (for hosts that can use virtualmedia) and of a kernel + initrd +
rootfs (for hosts that use PXE). Like the provisioning disk image, the URLs for
these are known by the installer, but they point to the cloud by default and
may be customised by the user to allow disconnected installs. IPA itself is
delivered separately, as a container image as part of the OpenShift release
payload. We do not wish to continue maintaining or shipping the
ironic-ipa-downloader as part of the payload as well, since it (a) is huge and
(b) requires maintenance effort. This effectively extends the limitation that
we are not updating the provisioning OS image to include the deploy image as
well, although we will continue to be able to update IPA itself.

Once this is in place, we no longer need the QCOW2 image at all, since we can
‘provision’ by asking CoreOS in the deploy image to install itself (using
custom deploy steps in Ironic, exposed as a custom deploy method in Metal³).
However, this requires updating any existing MachineSets, which is not planned
for the first release.

> **Reviewer comment (Member):** nit: The information in brackets is probably
> too detailed.

A naive approach would mean that upon upgrading from an existing cluster, we
would no longer have a guaranteed way of booting into _either_ deploy image:

* The existing deploy kernel + initrd will still exist on at least one master,
  but may not exist on all of them, and not all copies that do exist are
  necessarily the most recent version. Even if we found a way to sync them, we
  would have no mechanism to update the image to match the current Ironic
  version, or to fix bugs, including security bugs.
* We have no way of knowing the URLs for the new deploy image, because they can
  only be supplied at install time by the installer.

> **Reviewer comment (Contributor):** Lines 85-94 sound like they are
> describing a rejected alternative, and should go down in that section of the
> doc.
>
> **Author reply:** The context here is that all of the stuff described above
> was ready to land in 4.9 (and in fact a lot of it did, we just didn't flip
> the switch in the installer or remove ironic-ipa-downloader yet), but for the
> fact that we don't have a way to safely upgrade existing clusters.
>
> So from the perspective of this enhancement, the lack of an upgrade path to
> the work we've already done is the problem, and the proposal is the solution.
> We could rewrite it so that the existing work is incorporated into the
> proposal and the actual stuff we need input on is bumped to the 'Upgrade
> Strategy' section, but that doesn't seem conducive to getting the best
> feedback.

### Goals

* Ensure that no matter which version of OpenShift a cluster was installed
with, we are able to deliver updates to IPA and the OS it runs on.
* Stop maintaining the non-CoreOS RHEL-based IPA image within 1-2 releases.
* Never break existing clusters, even if they are deployed in disconnected
environments.

### Non-Goals

* Automatically switch pre-existing MachineSets to deploy with `coreos-install`
  instead of via QCOW2 images.
* Update the CoreOS QCOW2 image in the cluster with each OpenShift release.
* Provide the CoreOS images as part of the release payload.

> **Reviewer comment (Member):** nit: coreos-install is our internal concept
> that is not defined above. Maybe "... away from using qcow2 images"?

## Proposal

We will both ship the code to use the CoreOS image for IPA and continue to ship
the current ironic-ipa-downloader container image (which has the RHEL IPA image
built in) in parallel for one release to ensure no immediate loss of
functionality after an upgrade.

The release payload [includes metadata](/enhancements/coreos-bootimages.md)
that points to the CoreOS artifacts corresponding to the current running
release. This includes the QCOW2, ISO, kernel, initrd, and rootfs. The actual
images in use are defined in the Provisioning CR Spec. These are fixed at the
time of the initial installation, and may have been customised by the user for
installation in a disconnected environment. Since OpenShift 4.9 there are
fields for each of the image types/parts, although in clusters installed before
this enhancement is implemented, only the QCOW2 field
(`ProvisioningOSDownloadURL`) is set.

The cluster-baremetal-operator will verify the image URLs as part of
reconciling the Provisioning CR.
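
For reference, the fields involved look roughly like this in a Provisioning
CR. This is an illustrative sketch only: the exact field names, casing and
URL layout should be checked against the cluster-baremetal-operator API, and
the host and paths below are placeholders.

```yaml
# Hedged sketch of a Provisioning CR; field names/casing and URLs are
# illustrative, not authoritative.
apiVersion: metal3.io/v1alpha1
kind: Provisioning
metadata:
  name: provisioning-configuration
spec:
  # QCOW2 URL, set at install time; never updated automatically
  # (MachineSets may rely on it indirectly via the image cache).
  provisioningOSDownloadURL: https://mirror.example.com/4.8/rhcos-openstack.qcow2.gz
  # Fields present since 4.9; empty in clusters installed before
  # this enhancement is implemented.
  preProvisioningOSDownloadURLs:
    isoURL: https://mirror.example.com/4.9/rhcos-live.iso
    kernelURL: https://mirror.example.com/4.9/rhcos-live-kernel
    initramfsURL: https://mirror.example.com/4.9/rhcos-live-initramfs.img
    rootfsURL: https://mirror.example.com/4.9/rhcos-live-rootfs.img
```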

If any of the `PreprovisioningOSDownloadURLs` are not set and the
`ProvisioningOSDownloadURL` is set to point to the regular location (i.e. the
QCOW location has not been customised), then the cluster-baremetal-operator
will update the Provisioning Spec to use the latest images in the
`PreprovisioningOSDownloadURLs`.

> **Reviewer comment (Member):** nit: This implies new fields that are not
> introduced in this spec.
>
> **Reviewer comment (Contributor):** If there's an API change proposed here,
> it would be good to spell out the new field structure.
>
> **Author reply:** These fields are already present in 4.9.

If any of the `PreprovisioningOSDownloadURLs` are set to point to the regular
location (i.e. the ISO or kernel/initramfs/rootfs have not been customised) but
they point to a version that is not the latest, then the
cluster-baremetal-operator will update the Provisioning Spec to use the latest
images in the `PreprovisioningOSDownloadURLs`. Note that the kernel, initramfs
and rootfs must always be changed (or not) in lockstep.
> **Reviewer comment (Contributor):** Where are these fields being updated?
> Spec or Status?
>
> **Author reply:** Spec.


The `ProvisioningOSDownloadURL` (QCOW2 link) will never be modified
automatically, since there may be MachineSets relying on it (indirectly, via
the image cache).
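
The update rules above can be summarised in pseudocode. This is a hedged
sketch of the reconcile-time decision logic, not the actual
cluster-baremetal-operator implementation; the function and field names are
hypothetical.

```python
# Illustrative sketch of the decision logic described above.
# Names are hypothetical, not the real cluster-baremetal-operator code.

def reconcile_image_urls(spec, default_qcow_url, latest_urls):
    """Decide which Provisioning Spec image URLs to update.

    spec: dict with 'ProvisioningOSDownloadURL' (str) and
          'PreprovisioningOSDownloadURLs' (dict of iso/kernel/initramfs/rootfs).
    default_qcow_url: the regular (uncustomised) QCOW2 location.
    latest_urls: the latest image URLs from the release metadata.
    """
    pre = spec["PreprovisioningOSDownloadURLs"]
    qcow_is_default = spec["ProvisioningOSDownloadURL"] == default_qcow_url

    # If any preprovisioning URL is unset (or stale but uncustomised) and the
    # QCOW location has not been customised, adopt the latest release images.
    if qcow_is_default and any(not pre.get(k) or pre[k] != latest_urls[k]
                               for k in latest_urls):
        # kernel, initramfs and rootfs must change in lockstep, so all
        # fields are replaced together rather than individually.
        spec["PreprovisioningOSDownloadURLs"] = dict(latest_urls)

    # The QCOW2 URL itself is never modified automatically: existing
    # MachineSets may rely on it (indirectly, via the image cache).
    return spec
```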

If the `ProvisioningOSDownloadURL` has been customised to point to a
non-standard location and any of the `PreprovisioningOSDownloadURLs` are not
set, the cluster-baremetal-operator will attempt to heuristically infer the
correct URLs. It will do so by substituting the latest release version and the
appropriate file extension wherever the current version and extension
(respectively) appear in the QCOW path. It will then attempt to verify the
existence of these files by performing an HTTP HEAD request to the generated
URLs. If the request succeeds, the cluster-baremetal-operator will update the
Provisioning Spec with the generated URL. If it fails, the
cluster-baremetal-operator will report its status as incomplete. This will
prevent upgrading to the next release (which will *not* continue to ship
ironic-ipa-downloader as a backup) until such time as the user manually makes
the required images available.

> **Reviewer comment (Contributor):** It seems more natural to leave the Spec
> empty and update the Status to include the new default or derived values.
>
> **Author reply:** The Spec isn't empty, it's filled in by the installer. We
> are actually using the operator to override what the installer put here. It
> is a bit icky, but it's a more accurate representation of what is happening
> (which is a bit icky).
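
A minimal sketch of the inference heuristic might look like the following.
The function names, artifact suffixes and example URLs are all assumptions
for illustration; the real operator's substitution and verification rules may
differ.

```python
# Hypothetical sketch of the URL-inference heuristic described above; the
# artifact suffixes ("-kernel", etc.) and names are illustrative only.
import urllib.request


def infer_preprovisioning_urls(qcow_url, old_version, new_version):
    """Derive ISO and PXE artifact URLs from a customised QCOW2 URL by
    substituting the release version and file extension wherever the
    current ones appear in the path."""
    base = qcow_url.replace(old_version, new_version)
    urls = {}
    for key, ext in (("iso", ".iso"), ("kernel", "-kernel"),
                     ("initramfs", "-initramfs"), ("rootfs", "-rootfs")):
        # Swap the QCOW2 extension for the artifact's extension.
        urls[key] = base.replace(".qcow2.gz", ext).replace(".qcow2", ext)
    return urls


def url_exists(url):
    """Verify a generated URL with an HTTP HEAD request."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status == 200
    except OSError:
        return False
```

If the HEAD check fails for any generated URL, the operator would report its
status as incomplete rather than writing the guessed URLs into the Spec.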

If any of the `PreprovisioningOSDownloadURLs` have been customised to point to
a non-standard location and a version that is not the latest, the
cluster-baremetal-operator will perform the same procedure, except that the new
URL for each field will be the existing URL with the version replaced with the
latest one wherever it appears in the path. The status will be reported as
incomplete on failure, which means that users must provide the latest images at
a predictable location for every upgrade of the cluster.

### User Stories

As an operator of a disconnected cluster, I want to upgrade my cluster and have it continue to work for provisioning baremetal machines.

As an operator of an OpenShift cluster, I want to add to my cluster new
hardware that was not fully supported in RHEL at the time I installed the
cluster.

As an operator of an OpenShift cluster, I want to ensure that the OS running on
hosts prior to them being provisioned as part of the cluster is up to date with
bug and security fixes.

As an OpenShift user, I want to stop downloading, as part of the release
payload, an extra massive image that is separate from the one used for cluster
members and based on a different distribution of RHEL.

### Risks and Mitigations

If the HEAD request succeeds but does not result in a valid image, we may
report success when in fact we will be unable to boot any hosts with the given
image. The machine-os-downloader container should do some basic due diligence
on the images it downloads.

## Design Details

### Open Questions

* Will marking the cluster-baremetal-operator as incomplete cause an upgrade to
  be rolled back? Is there a better form of alert that will prevent a future
  upgrade without rolling back the current one?
* Should we try to heuristically infer the URLs at all when they are missing,
  or just require the user to manually set them?
* Is it acceptable to require users of disconnected installs to take action on
  every upgrade? Is that better or worse than leaving them with an out-of-date
  OS to run IPA on (since it will *not* be updated until actually provisioned
  as a cluster node)?

> **Reviewer comment (Contributor):** Nothing triggers an automated rollback,
> IIUC. The CVO just keeps trying the upgrade until it succeeds. @sdodson can
> you verify that?
>
> **Reviewer comment (Member):** From a support perspective, it introduces a
> version of CoreOS that is not current and is not known. If it causes a bug in
> IPA (e.g. unsupported hardware), debugging it will be problematic.
>
> **Reviewer comment (Contributor):** Could we change the API to make it less
> necessary to guess the URLs? For example, could we ask for a URL base, to
> point to a directory in which files with names we control should be stored?
> So the user would give us https://my.private.server/openshift and we would
> add something like /4.y.z/coreos.kernel? That would avoid them having to
> update the URL on every upgrade, but would still let the operator report that
> it is Degraded if the necessary files do not exist on the server after the
> upgrade.
>
> **Author reply:** It's the installer's API and I think it's common to every
> platform, so I'm not sure that changing it would be welcomed. In any event,
> that wouldn't help us with clusters that were installed in the past.
>
> > That would avoid them having to update the URL on every upgrade
>
> Updating the URL ourselves is easy enough; it's the user actually putting the
> file where we need it that's the blocker. Being slightly more strict about
> renaming doesn't really move the needle much, imho.
>
> Ideally the CoreOS images would be provided automatically by the same tooling
> that mirrors the release payload (actually, ideally the image would be in the
> release payload), but that's not where we are.

### Test Plan

We will need to test all of the following scenarios:

* Install with release N
* Upgrade from release N-1 -> N
* Simulated upgrade from release N -> N+1

The simulated upgrade will require modifying the image metadata inside the
release payload.

Furthermore, we will need to test each of these scenarios both with the default
image URLs and with custom URLs for disconnected installs.
> **Reviewer comment (Contributor):** We ought to be able to automate those as
> unit tests for the operator, right? We don't need end-to-end tests for all of
> those scenarios?
>
> **Author reply:** Obviously we do need unit tests, but I think it would also
> be nice to know that all the moving parts actually work together the way we
> expect.

### Graduation Criteria

N/A

#### Dev Preview -> Tech Preview

N/A

#### Tech Preview -> GA

N/A

#### Removing a deprecated feature

N/A

### Upgrade / Downgrade Strategy

See everything above; this entire proposal describes the upgrade strategy.

### Version Skew Strategy

Changes will happen once both the new metadata and a version of the
cluster-baremetal-operator that supports this feature are present. The order
in which these appear is irrelevant, and in any case the changes will only
have a discernible effect on BareMetalHosts newly added or deprovisioned after
the update.

## Implementation History

N/A

## Drawbacks

Users operating disconnected installs will be required to manually make
available the latest CoreOS images on each cluster version upgrade.

## Alternatives

Don't try to keep the CoreOS image up to date with each release, and instead
require only that working images have been specified at least once.

Instead of upgrading, have CoreOS somehow try to update itself in place before
running IPA. (This is likely to be slow, and it's not clear that it is even
possible, since we will be running as a live ISO, not written to disk at this
point.)

> **Reviewer comment (Member):** Very slow and memory-greedy. Won't work for
> disconnected installs.
>
> **Author reply:** This makes me wonder how disconnected installs update
> CoreOS (which is supposed to update itself before joining the cluster) now?
>
> **Reviewer comment (Member):** Somehow RHCOS gets written to disk, and then
> the system boots it from disk for the first time. At that point it upgrades
> the same way RHCOS gets updated on an ongoing basis; it retrieves a
> particular container image from the payload that contains the current RHCOS
> ostree, applies that ostree to the filesystem on disk, then reboots again.
>
> **Author reply:** Ah, I was not aware that the ostree is already in the
> payload.

Don't try to guess the locations of the images if they are not set, and require
the user to manually specify them.