Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept new image format, consume in controllerconfig, but stop there #3286

Conversation

jkyros
Copy link
Contributor

@jkyros jkyros commented Aug 15, 2022

As discussed in #3258, this:

  • allows the MCO to consume the new image format
  • adds the new rhel-coreos-8 and rhel-coreos-8-extensions images to the MCO's image-references file so after this they will show up in nightlies
  • propagates the new base image and extensions container into controllerconfig

Once this is in, and the MCO can accept the new argument, we will need to:

  • PR something like this jkyros/installer@cdf7858 to the installer to dump out the images file so the MCO can consume it during bootstrap

And then once that is in, we can go back to #3258 and decide if/when/how we want the MCO to use the new format image by default.

cgwalters and others added 6 commits August 15, 2022 15:11
Part of openshift/enhancements#1032

We'll add the new-format image into the payload alongside the old
one until we can complete the transition.

(There may actually be a separate `rhel-coreos-extensions` image
 e.g. too, so this is just the start)

Note this PR is just laying groundwork; the new format container
will not be used by default.
The extensions for our layered/bootable `rhel-coreos` images are no
longer going to be contained within the image themselves. They will be a
separate container. See: openshift/os#763

This makes sure that, if present, the new extensions image gets pushed
through to controllerconfig and is available when merging
machineconfigs.
The MCO orchestrates the management/setup of a big pile of different
containers today, from `machine-os-content` to `haproxy` and
configuring cri-o to use the correct `pause`/`pod` container.

These images get set up at both "bootstrap" time and cluster time.
In bootstrap today we manually scrape each image out of the CVO in
shell script, and then pass them as CLI arguments inside `bootkube.sh`.

This means adding new images requires a tedious "ratcheting" process
where the CLI argument is added to the MCO, then we patch the installer
to pass it in.

Instead, add a new CLI argument which accepts a serialized imagestream
object, which is exactly what the CVO carries.

Prep for adding a new `rhel-coreos` image into the payload, so I can
avoid that ratcheting process.
So once upon a time, our image references in our manifests, the ones
that look like:

`registry.svc.ci.openshift.org/openshift:something`

probably pointed at real images. But:
- that registry has long since retired and
- the openshift client (`oc`) rewrites these values as though they were
templated when an openshift release gets packed by `oc adm release new`

This was not immediately apparent to someone who was not familiar with
the process, and an outside observer could be forgiven for thinking that
these were real values.

This tries to alleviate any confusion or belief that these values are
'real' by making them obviously fake, but also descriptive of their
function.
The way things work with the openshift client, once we add an image to
image-references, `oc` expects it to be in the release, and `oc adm
release new` will fail if it's not present.

Also, in order for the placeholder image location in our
`machine-config-osimageurl` configmap to get rewritten with a real value
(rather than the bogus 'template' value it has by default), it has to be
referenced in image-references.

The desired outcome was "if it's there use it, otherwise ignore it", but
that's just not possible with regard to the payload the way things are
set up.

This groups the addition of the new format os container and extensions
container together into sort of an "on switch" commit that will break
the builds if merged before https://issues.redhat.com/browse/ART-3883,
but will make them work with new images if merged afterwards.
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 15, 2022
@@ -23,6 +23,10 @@ spec:
from:
kind: DockerImage
name: registry.svc.ci.openshift.org/openshift:machine-os-content
- name: rhel-coreos
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this needs to be rhel-coreos-8 now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possibly bad form on my part -- I assume you're going commit-by-commit -- but I pulled that out when I added the extensions container, and then added them back in in the commit at the end (because in the old PR, that was what broke images when we didn't have the ART images, and I thought we might want to peel that one off)

Copy link
Member

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep sorry, I had missed I was going commit by commit. LGTM!

- name: keepalived-ipfailover
from:
kind: DockerImage
name: registry.svc.ci.openshift.org/openshift:keepalived-ipfailover
name: placeholder.url.oc.will.replace.this.org/placeholdernamespace:keepalived-ipfailover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely in favor of this explicit placeholder!

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 15, 2022

@jkyros: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@cgwalters
Copy link
Member

Nice, it works!

$ oc adm release info registry.build02.ci.openshift.org/ci-op-0qhv8zrm/release:latest|grep rhel-coreos-8
  rhel-coreos-8                                  sha256:4c94cf372fceb03e6b29bd0c2166328acfece849d0bc08fc7bec510fa12dc637
  rhel-coreos-8-extensions                       sha256:35a61f80fbbf9208393ae16feda4fc26e75d69759010cff3421b01038f221bbd
$

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 16, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 16, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, jkyros

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit ebc8335 into openshift:master Aug 16, 2022
@LorbusChris
Copy link
Member

It looks like this adds an explicit dependency on the rhel-8-coreos and rhel-8-coreos-extensions imagestreamtags. I'm wondering how to handle this in OKD (both on FCOS and SCOS), and RHCOS9, where that naming scheme wouldn't make a lot of sense.
cc @vrutkovs

cgwalters added a commit to cgwalters/machine-config-operator that referenced this pull request Aug 16, 2022
This reverts part of openshift#3286

While we did land `rhel-coreos-8` in both CI and ART/nightlies, the
corresponding extensions container is currently only in CI, not in
ART/nightly builds.  We're working on it in https://issues.redhat.com/browse/COS-1646
(Which will also need to have code changes from ART similar to
 https://issues.redhat.com/browse/ART-3883)
@cgwalters
Copy link
Member

how to handle this in OKD (both on FCOS and SCOS),

So...we may need to add a build-time conditional to the MCO for this. FCOS would probably just be fedora-coreos? I think this would involve generating the image-references file and a Go build tag or so. But that would have to be driven by...something like an ARG in the Dockerfile?

and RHCOS9, where that naming scheme wouldn't make a lot of sense.

My tentative plan here is to add rhel-coreos-9 - so hopefully you'd agree the naming scheme does make sense there.

So far we've avoided the operating system naming problem by calling everything related to that machine- (e.g. machine-os-content and machine-os-images) but my thinking here is in the context of https://github.com/openshift/enhancements/blob/master/enhancements/ocp-coreos-layering.md this indirection/hiding is motivated less because we are explicitly making the operating system a controllable thing from the user PoV.

@LorbusChris
Copy link
Member

I see, if the plan is to have multiple different images available eventually, this makes sense to me.
We use a Dockerfile ARG for OKD in the installer today, and we could certainly do the same for the MCO. We'll have to stop mirroring the MCO image from ocp to origin in Prow then, and do an additional build with the OKD ARG of the MCO directly into origin (or two really: one for FCOS, and one for SCOS).

enxebre added a commit to enxebre/hypershift that referenced this pull request Oct 4, 2022
When the CPO is at N and the NodePool.spec.release at N-1 we fail to render ignition payload because openshift/machine-config-operator#3286 broke backward compatibility.
This PRs lets it fall back to MCO previously supported flags.
enxebre added a commit to enxebre/hypershift that referenced this pull request Oct 4, 2022
When the CPO is at N and the NodePool.spec.release at N-1 we fail to render ignition payload because openshift/machine-config-operator#3286 broke backward compatibility.
This PRs lets it fall back to MCO previously supported flags.
enxebre added a commit to enxebre/hypershift that referenced this pull request Oct 4, 2022
When the CPO is at N and the NodePool.spec.release at N-1 we fail to render ignition payload because openshift/machine-config-operator#3286 broke backward compatibility.
This PRs lets it fall back to MCO previously supported flags.
enxebre added a commit to enxebre/hypershift that referenced this pull request Oct 4, 2022
When the CPO is at N and the NodePool.spec.release at N-1 we fail to render ignition payload because openshift/machine-config-operator#3286 broke backward compatibility. This PRs lets it fall back to MCO previously supported flags.
This commit ping the image used in the upgrade test to a no broken version.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants