Conversation
Marking as WIP for now. I have to split out lots of commits into separate prep PRs first.
Only glanced at this but I think we want
Yeah, I'm open to tweaking the current interface. My reasoning for making So, to flip this around:
Another way to look at this is,
OK, that's convincing yes.
OK, split out prep patches in #1170!
OK, this now works on top of #1170! I added streaming decompression and signature verification to make There's definitely a lot more we could do on top of this, though I think it's good enough for now to get in as is, so we can at least start using it in the pipeline. Some follow-up improvements:
And while we're there, rename the functions to be more descriptive. This is prep for doing streaming decompression and GPG verification of downloaded qemu images.
That way a caller that wants to use the streaming interface can also just pass `""` as the key file to get the default keyring.
This function will download a compressed file and decompress it and verify the signature in a streaming fashion. What we lose in return is the ability to resume file downloads if we're interrupted. I think that trade-off is worth it though for a faster and more efficient common case.
This adds a new `run-upgrade` command focused on running upgrade tests. It also adds a single test in that testsuite: `fcos.upgrade.basic`.

To run this test, one can do:

```
kola run-upgrade -v \
    --cosa-build /path/to/meta.json \
    --qemu-image /path/to/starting-image.qcow2
```

You can tell kola to automatically detect the parent image to start from:

```
kola run-upgrade -v \
    --cosa-build /path/to/meta.json \
    --find-parent-image
```

For FCOS, this will fetch the metadata for the latest release for the target stream. On AWS, it will use the AMI from there as the starting image. On the QEMU platform, it will download the QEMU image locally (with signature verification). The code is extensible to add support for RHCOS and other target platforms.

Why make it a separate command from `run`? Multiple reasons:

1. As shown above, it's about multiple artifacts, not just the system under test. By contrast, `run` is largely about using a single artifact input. For example, on AWS, `--aws-ami` points to the *starting* image, and `--cosa-build` points to the target upgrade.
2. It's more expensive than other tests. To make it truly cross-platform and self-contained, it works by pushing the OSTree content to the node and serving it from there to itself. Therefore, it's not a test that developers would necessarily be interested in running locally very often (though it's definitely adapted for local tests too when needed).
3. Unlike `run`, it has some special semantics like `--find-parent-image` to make it easier to use.

Now, this is only part of the FCOS upgrade testing story. Here's roughly how I see this all fit together:

1. The FCOS pipeline runs `kola run-upgrade -p qemu` and possibly `kola run-upgrade -p aws` after the basic `kola run` tests have passed.
2. Once the build is clean and pushed out to S3, its content will be imported into the annex/compose repo.
3. Once there, we can do more realistic tests by targeting the annex repo and a dedicated Cincinnati. For example, we can have canary nodes following those updates that started from various previous releases to catch any state-dependent issues. Another more explicit approach is a test that starts those nodes at the select releases and gates new releases on that test.

Essentially, the main advantage of this test is that we can do some upgrade testing *before* pushing out any bits at all to S3. The major bug category this is intended to catch is state-dependent bugs (i.e. anything that *isn't* captured by the OSTree commit). However, it does also exercise many of the major parts of the update system (zincati, rpm-ostree, ostree, libcurl), though it's clearly not a replacement for more realistic e2e tests downstream.
Rebased!
Start running the new upgrade test right after building the QEMU image. In the AWS test job, run the upgrade test on AWS in parallel. For more information, see: coreos/mantle#1168
        }
        kola.QEMUOptions.DiskImage = decompressedQcowLocal
    case "aws":
        kola.AWSOptions.AMI, err = parentCosaBuild.FindAMI(kola.AWSOptions.Region)
See also openshift/installer#2906 - we'll likely at some point need to copy the code from the installer to make images from storage, which gets into terraform vs. something else here, or forking out to `openshift-install instantiate-coreos-image` or something.
Looks great overall! Very nice code.
Would be nice probably to support the no-zincati case for RHCOS...or OTOH we could just add zincati to RHCOS and leave it disabled by default.
Thanks for the review!
Yeah, I left space for this enhancement in the code. I guess the closer equivalent would be to just upload the oscontainer into the image storage, then write it to

Gonna merge this one now! I'd like to get this and coreos/fedora-coreos-pipeline#190 in before the next stable release (which should be this week).