testing: release 31.20191123.0 #26

Closed
14 of 35 tasks
dustymabe opened this issue Nov 25, 2019 · 8 comments
@dustymabe
Member

dustymabe commented Nov 25, 2019

First, verify that you meet all the prerequisites

Pre-release

Promote testing-devel changes

From the checkout for fedora-coreos-config (replace upstream below with
whichever remote name tracks coreos/):

  • git fetch upstream
  • git checkout testing
  • git reset --hard upstream/testing
  • /path/to/fedora-coreos-releng-automation/scripts/promote-config.sh testing-devel
  • Sanity check promotion with git show
  • Open PR against the testing branch on https://github.com/coreos/fedora-coreos-config
  • Post a link to the PR as a comment to this issue
  • Ideally have at least one other person check it and approve
  • Once CI has passed, merge it

Build

Sanity-check the build

Using the build browser for the testing stream:

  • Verify that the parent commit and version match the previous testing release (in the future, we'll want to integrate this check in the release job)
  • Check kola AWS run to make sure it didn't fail

⚠️ Release ⚠️

IMPORTANT: this is the point of no return. Once the OSTree commit is
imported into the unified repo, any machine that manually runs rpm-ostree upgrade will have the new update.

Importing OSTree commit

In the future, the OSTree commit import will be integrated in the release job.

  • Open an issue on https://pagure.io/releng to ask for the OSTree commit to be imported (include a URL to the .sig which should be alongside the tarfile in the bucket and signed by the primary Fedora key)
  • Post a link to the issue as a comment in this issue
  • Wait for releng to process the request
  • Verify that the OSTree commit and its signature are present and valid by booting a VM at the previous release (e.g. cosa run -d /path/to/previous.qcow2) and verifying that rpm-ostree upgrade works and rpm-ostree status shows a valid signature.

Run the release job

  • Run the release job, setting its parameters to testing and the new version ID
  • Post a link to the job as a comment to this issue
  • Wait for job to finish

At this point, Cincinnati will see the new release on its next refresh and create a corresponding node in the graph without edges pointing to it yet.

Refresh metadata (stream and updates)

From a checkout of this repo:

  • Update stream metadata, by running:
fedora-coreos-stream-generator -releases=https://fcos-builds.s3.amazonaws.com/prod/streams/testing/releases.json -output-file=streams/testing.json -pretty-print
  • Update the updates metadata, editing updates/testing.json:
    • Find the last-known-good release (whose rollout has a start_percentage of 100) and set its version to the most recent completed rollout
    • Delete releases with completed rollouts
    • Add a new rollout:
      • Set version field to the new version
      • Set start_epoch field to a future timestamp for the rollout start (e.g. date -d '2019/09/10 14:30UTC' +%s)
      • Set start_percentage field to 0.0
      • Set duration_minutes field to a reasonable rollout window (e.g. 2880 for 48h)
    • Update the last-modified field to current time (e.g. date -u +%Y-%m-%dT%H:%M:%SZ)

A reviewer can validate the start_epoch time by running date -u -d @<EPOCH>. An example of encoding and decoding in one step: date -d '2019/09/10 14:30UTC' +%s | xargs -I{} date -u -d @{}.
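For reviewers who prefer not to depend on GNU date, the same encode/decode round trip and the new rollout entry can be sketched in Python. This is only an illustration: the field names come from the checklist above, but the helper names and the flat dictionary layout are assumptions and may not match the actual schema of updates/testing.json.

```python
from datetime import datetime, timezone


def rollout_start_epoch(year, month, day, hour, minute):
    """Return the Unix timestamp for a wall-clock time given in UTC."""
    moment = datetime(year, month, day, hour, minute, tzinfo=timezone.utc)
    return int(moment.timestamp())


def decode_epoch(epoch):
    """Render a Unix timestamp as UTC, in the same format as last-modified."""
    moment = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def new_rollout(version, start_epoch, duration_minutes=2880):
    """Build a rollout entry (hypothetical flat layout, see note above)."""
    return {
        "version": version,
        "start_epoch": start_epoch,
        "start_percentage": 0.0,
        "duration_minutes": duration_minutes,  # 2880 minutes = 48h
    }


if __name__ == "__main__":
    epoch = rollout_start_epoch(2019, 9, 10, 14, 30)
    print(epoch, decode_epoch(epoch))
    print(new_rollout("31.20191123.0", epoch))
```

Decoding the epoch back to a UTC string before committing is a cheap way to catch a timestamp that was accidentally computed in local time.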

  • Commit the changes and open a PR against the repo.
  • Post a link to the PR as a comment to this issue
  • Wait for the PR to be approved.
  • Once approved, merge it and push the content to S3:
aws s3 sync --acl public-read --cache-control 'max-age=60' --exclude '*' --include 'streams/*' --include 'updates/*' . s3://fcos-builds
  • Verify the new version shows up on the download page
  • Verify the incoming edges are showing up in the update graph:
curl -H 'Accept: application/json' 'https://updates.coreos.stg.fedoraproject.org/v1/graph?basearch=x86_64&stream=testing&rollout_wariness=0'

NOTE: In the future, most of these steps will be automated and a syncer will push the updated metadata to S3.
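The graph check above can also be done programmatically once the curl output is saved. The sketch below assumes the response follows the usual Cincinnati graph shape (a "nodes" list whose entries carry a "version", and an "edges" list of [from_index, to_index] pairs); the function name and the sample data are made up for illustration.

```python
def incoming_edges(graph, version):
    """Return the versions that have an edge pointing at `version`.

    Returns None if the version has no node in the graph at all, and an
    empty list if the node exists but nothing points to it yet.
    """
    versions = [node["version"] for node in graph["nodes"]]
    if version not in versions:
        return None
    target = versions.index(version)
    return [versions[src] for src, dst in graph["edges"] if dst == target]


if __name__ == "__main__":
    # Hypothetical sample response, not real update-graph data.
    sample = {
        "nodes": [{"version": "old-release"}, {"version": "new-release"}],
        "edges": [[0, 1]],
    }
    print(incoming_edges(sample, "new-release"))
```

An empty list matches the state described after the release job (node present, no edges yet), while a non-empty list suggests the rollout metadata has been picked up.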

@dustymabe
Member Author

coreos/fedora-coreos-config#236

CI didn't pass but I ran locally after kicking off another dist-repo and it's good

dustymabe changed the title from "testing: release 30.20191123.0" to "testing: release 31.20191123.0" on Nov 25, 2019
@dustymabe
Member Author

dustymabe commented Nov 25, 2019

@dustymabe
Member Author

OK we are going to abort this release because of an issue we identified where upgraded systems would be migrated from cgroups v1 to cgroups v2.

The hacky workaround here is for us to do another update of F30 FCOS that adds the systemd.unified_cgroup_hierarchy=0 karg to the BLS configs via a systemd unit. The BLS configs would then be consulted by rpm-ostree upgrade when upgrading to F31 and the systemd.unified_cgroup_hierarchy=0 would be there for the first boot of F31. We'd need to force all F30 upgrades through this last update as a barrier to F31 to make sure it has the arg there.
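To make the idea concrete, here is a rough Python sketch of the karg injection step. It is not the actual service (that lives in fedora-coreos-config); it only assumes that a BLS entry is a text file with an "options" line holding the kernel arguments, and the function name is invented.

```python
KARG = "systemd.unified_cgroup_hierarchy=0"


def inject_karg(bls_text, karg=KARG):
    """Append `karg` to the options line of a BLS entry if it is missing.

    Simplification: an existing systemd.unified_cgroup_hierarchy setting
    with a different value is left in place and not rewritten.
    """
    out = []
    for line in bls_text.splitlines():
        # Only the kernel command line is touched; other keys pass through.
        if line.startswith("options ") and karg not in line.split():
            line = line.rstrip() + " " + karg
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical BLS entry contents.
    entry = "title Fedora CoreOS\noptions root=UUID=abc rw\n"
    print(inject_karg(entry), end="")
```

Running the function twice gives the same result as running it once, which matters for a unit that may execute on every boot.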

cc @lucab since this is the first time we'll be using barriers.
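For anyone unfamiliar with barriers, one simplified reading of "force all F30 upgrades through this last update" is sketched below. The function and the version strings are hypothetical, and the real logic lives in the Cincinnati backend and may differ (rollout percentages are ignored here).

```python
def next_target(releases, current, barrier):
    """Pick the update target for a node, honoring a single barrier.

    `releases` is ordered oldest to newest. Nodes older than the barrier
    may only move to the barrier; nodes at or past it may go to the latest.
    Returns None when the node is already on the latest release.
    """
    cur = releases.index(current)
    bar = releases.index(barrier)
    if cur < bar:
        return barrier
    latest = releases[-1]
    return latest if latest != current else None


if __name__ == "__main__":
    # Hypothetical version labels: two F30 releases followed by one F31.
    releases = ["f30-older", "f30-last", "f31-first"]
    for current in releases:
        print(current, "->", next_target(releases, current, "f30-last"))
```

With this shape, a node on an older F30 release needs two hops to reach F31, so the workaround unit gets a chance to run before the first F31 boot.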

dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue Nov 25, 2019
We found an issue [1] with our cgroups v2 strategy [2]. We need to
revert the recent promotions (that move us to F31) and do a new release
of F30 content with a slight modification. See the proposed solution
in the issue comment.

- Revert "tree: promote changes from testing-devel at 20e1222"
    - This reverts commit 0d8e188.
- Revert "manifest.yaml: bump to Fedora 31"
    - This reverts commit c11e08f.
- Revert "manifest.yaml: adapt for new path to fedora-coreos.yaml"
    - This reverts commit 01850ff.

[1] coreos/fedora-coreos-streams#26 (comment)
[2] coreos/fedora-coreos-tracker#292
@lucab
Contributor

lucab commented Nov 26, 2019

@dustymabe I think the barrier logic may not be totally 100% right now (as we changed a few metadata details since last time I touched it), but I can do a pass on it in parallel while you make the last F30 release.

Do you have a new ETA for when you want to make the new F31? I'll try to find a slot to review and test barrier logic before that date. In the meantime, the last F30 can be started as a normal rollout and we'll add the barrier details at some point before starting the F31 rollout.

@dustymabe
Member Author

hey @lucab - thanks for the info.

I think we were hoping to do the F30 release (we weren't going to change any packages, just add a systemd unit to do the workaround) and then the F31 release back to back.

jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue Nov 26, 2019
In f31, the default cgroup changed to v2. However, we've decided to stay
on v1 for the time being. Thus, we don't want older nodes upgrading to
f31 to be forced into v2.

Add a tiny service which just scans the BLS configs and injects the
`systemd.unified_cgroup_hierarchy` karg as needed.

For more information, see:
coreos/fedora-coreos-tracker#292
coreos/fedora-coreos-streams#26 (comment)
jlebon pushed a commit to coreos/fedora-coreos-config that referenced this issue Nov 26, 2019
We found an issue [1] with our cgroups v2 strategy [2]. We need to
revert the recent promotions (that move us to F31) and do a new release
of F30 content with a slight modification. See the proposed solution
in the issue comment.

- Revert "tree: promote changes from testing-devel at 20e1222"
    - This reverts commit 0d8e188.
- Revert "manifest.yaml: bump to Fedora 31"
    - This reverts commit c11e08f.
- Revert "manifest.yaml: adapt for new path to fedora-coreos.yaml"
    - This reverts commit 01850ff.

[1] coreos/fedora-coreos-streams#26 (comment)
[2] coreos/fedora-coreos-tracker#292
jlebon added a commit to coreos/fedora-coreos-config that referenced this issue Nov 26, 2019
In f31, the default cgroup changed to v2. However, we've decided to stay
on v1 for the time being. Thus, we don't want older nodes upgrading to
f31 to be forced into v2.

Add a tiny service which just scans the BLS configs and injects the
`systemd.unified_cgroup_hierarchy` karg as needed.

For more information, see:
coreos/fedora-coreos-tracker#292
coreos/fedora-coreos-streams#26 (comment)
@lucab
Contributor

lucab commented Nov 27, 2019

@dustymabe backend logic should be good now. I manually tested a few corner-cases and everything seems to behave as expected. No further blockers on my side. I'll keep an eye on the graph once we put the first barrier in place.

jlebon added a commit to jlebon/fedora-coreos-streams that referenced this issue Nov 27, 2019
Note this is a barrier update due to:
coreos#26 (comment)
jlebon added a commit to jlebon/fedora-coreos-streams that referenced this issue Nov 27, 2019
Note this will be a barrier update due to:
coreos#26 (comment)

We'll add the barrier part later to be conservative.
jlebon added a commit that referenced this issue Nov 27, 2019
Note this will be a barrier update due to:
#26 (comment)

We'll add the barrier part later to be conservative.