@smg247 smg247 commented Nov 19, 2025

Reverts #85 ; tracked by TRT-2426

Per OpenShift policy, we are reverting this breaking change to get CI and/or nightly payloads flowing again.

Began failing in https://amd64.ocp.releases.ci.openshift.org/releasestream/4.21.0-0.nightly/release/4.21.0-0.nightly-2025-11-19-020220 on aggregated-gcp-ovn-rt-upgrade-4.21-minor and aggregated-aws-ovn-upgrade-4.21-micro-fips

To unrevert this, revert this PR, and layer an additional separate commit on top that addresses the problem. Before merging the unrevert, please run these jobs on the PR and check the result of these jobs to confirm the fix has corrected the problem:

/payload 4.21 nightly blocking

CC: @QiWang19

PR created by Revertomatic™️

…ft-cip"

This reverts commit 1f89a67, reversing
changes made to f4335a3.
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 19, 2025
openshift-ci-robot commented Nov 19, 2025

@smg247: This pull request references TRT-2426 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the ticket to target the "4.21.0" version, but no target version was set.

@openshift-ci openshift-ci bot requested review from sdodson and wking November 19, 2025 20:03
@wking wking left a comment

openshift/origin#30318 and openshift/origin#30480 were supposed to have made the test suite comfortable with the late-update ClusterImagePolicy rollout, but I guess we still have something more to do in that space. In the meantime, we want nightlies to keep getting accepted, so:

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 19, 2025
wking added a commit to wking/cluster-update-keys that referenced this pull request Nov 19, 2025
…-openshift-cip""

This reverts commit 7a5dcee.

This one has taken us some time:

* 2025-08-27, 94f7582, openshift#82 was our first attempt at enabling the
  ClusterImagePolicy.
* ...but it tripped up the origin test suite, so it was reverted in
  2025-08-28, c40e7b9, openshift#83.
* Qi then hardened the test suite with openshift/origin@d3af51e4acb
  (not fail upgrade checks if all nodes are ready, 2025-09-29,
  openshift/origin#30318) and openshift/origin@2fd0d8e242 (Upgrade
  test add 2min grace period allow non-drain updates to complete,
  2025-11-12, openshift/origin#30480).
* With the tougher CI in place, we tried a second time with
  2025-11-17, 1f89a67, openshift#85.
* ...but still tripped up origin, with runs like [1] taking 2.25m
  (more than the 2m grace period):

    I1119 17:26:21.890667 1511 upgrade.go:629] Waiting on pools to be upgraded
    I1119 17:26:21.939178 1511 upgrade.go:792] Pool master is still reporting (Updated: false, Updating: true, Degraded: false)
    I1119 17:26:21.939259 1511 upgrade.go:666] Invariant violation detected: master pool requires update but nodes not ready. Waiting up to 2m0s for non-draining updates to complete
    I1119 17:26:31.984116 1511 upgrade.go:792] Pool master is still reporting (Updated: false, Updating: true, Degraded: false)
    ...
    I1119 17:28:21.981438 1511 upgrade.go:792] Pool master is still reporting (Updated: false, Updating: true, Degraded: false)
    I1119 17:28:21.981514 1511 upgrade.go:673] Invariant violation detected: the "master" pool should be updated before the CVO reports available at the new version

  and:

    $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade/1991158541779472384/artifacts/e2e-gcp-ovn-rt-upgrade/gather-extra/artifacts/inspect/cluster-scoped-resources/machineconfiguration.openshift.io/machineconfigpools/master.yaml | yaml2json | jq -r '.status.conditions[] | select(.type == "Updating") | .lastTransitionTime + " " + .status'
    2025-11-19T17:28:36Z False

  28:36 - 26:21 = 135s = 2.25m, which overshot the 2m grace period.
  The second attempt was reverted in 7a5dcee, openshift#87.

* Qi then hardened the test suite further with
  openshift/origin@c17e560263 (Update grace period for cluster upgrade
  to 10 minutes, 2025-11-19, openshift/origin#30506).
* This commit is taking a third attempt at enabling the
  ClusterImagePolicy.

[1]: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade/1991158541779472384
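The timing arithmetic in the commit message above can be re-checked with a short script. This is only a sketch, not part of the origin test suite: the `pool` dictionary is a hypothetical, trimmed stand-in for the MachineConfigPool document (only the `Updating` condition values come from the quoted output), the helper names `updating_transition` and `within_grace` are made up for illustration, and it assumes the log timestamps and the condition timestamp are both UTC on 2025-11-19.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical, trimmed stand-in for master.yaml; only the "Updating"
# condition values below are taken from the output quoted above.
pool = {
    "status": {
        "conditions": [
            {
                "type": "Updating",
                "status": "False",
                "lastTransitionTime": "2025-11-19T17:28:36Z",
            },
        ]
    }
}


def updating_transition(pool):
    """Mirror the jq filter: select the condition whose type is "Updating"."""
    for cond in pool["status"]["conditions"]:
        if cond["type"] == "Updating":
            when = datetime.strptime(
                cond["lastTransitionTime"], "%Y-%m-%dT%H:%M:%SZ"
            )
            return when.replace(tzinfo=timezone.utc), cond["status"]
    raise LookupError("no Updating condition found")


def within_grace(elapsed, grace):
    """True when the pool finished updating inside the grace period."""
    return elapsed <= grace


# Start of the wait, from the first quoted log line (I1119 17:26:21...),
# truncated to whole seconds; the year and UTC are assumptions.
wait_started = datetime(2025, 11, 19, 17, 26, 21, tzinfo=timezone.utc)

finished, status = updating_transition(pool)
elapsed = finished - wait_started

print(elapsed.total_seconds())                       # 135.0
print(within_grace(elapsed, timedelta(minutes=2)))   # False
print(within_grace(elapsed, timedelta(minutes=10)))  # True
```

Under those assumptions the same run overshoots the 2m grace period but would fit inside the 10m one from openshift/origin#30506.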
@jupierce

/lgtm

openshift-ci bot commented Nov 19, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jupierce, smg247, wking


@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 19, 2025
wking commented Nov 19, 2025

Still needs the verified label before it can merge. Personally, I'm hoping that the roll-forward openshift/origin#30506 comes through, but it is still ~4h before we have payload-aggregate results back there. If that payload-aggregate testing fails, I'll add verified here myself. I'm ok with other folks adding it earlier, or if payload-aggregate results show the origin pull as a sufficient fix, but then I'll be bugging y'all for rerererevert labels on #89 ;)

wking commented Nov 19, 2025

/retest-required

neisw commented Nov 20, 2025

/verified by CI
/override ci/prow/e2e-aws

There are plenty of "Services should be rejected for evicted pods" failures outside of this PR.

openshift-ci bot commented Nov 20, 2025

@neisw: Overrode contexts on behalf of neisw: ci/prow/e2e-aws

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Nov 20, 2025
@openshift-ci-robot

@neisw: This PR has been marked as verified by CI.

@openshift-merge-bot openshift-merge-bot bot merged commit 5505c03 into openshift:main Nov 20, 2025
5 checks passed
openshift-ci bot commented Nov 20, 2025

@smg247: all tests passed!

