OCPBUGS-27264: Only reconcile on Node updates with Label changes #2206
Reconciling every time any Node.Status.Condition changes means CNO just ends up constantly reconciling.
|
@danwinship: This pull request references Jira Issue OCPBUGS-27264, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug.
Requesting review from QA contact:
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/retest |
|
@danwinship: This pull request references Jira Issue OCPBUGS-27264, which is valid. 3 validation(s) were run on this bug.
Requesting review from QA contact:
|
/retest-required |
/lgtm
cc @wizhaoredhat
I wonder whether this could be reduced further to check only the necessary labels, but this LGTM.
CI seems to be passing for the gateway mode migration jobs.
|
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: danwinship, tssurya |
|
LGTM |
|
@danwinship: The following test failed, say
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
/retest-required |
|
@danwinship: Jira Issue OCPBUGS-27264: All pull requests linked via external trackers have merged:
Jira Issue OCPBUGS-27264 has been moved to the MODIFIED state.
|
/cherry-pick release-4.15 |
|
@danwinship: new pull request created: #2212
|
[ART PR BUILD NOTIFIER] This PR has been included in build cluster-network-operator-container-v4.16.0-202401191549.p0.ge53cc19.assembly.stream for distgit cluster-network-operator. |
|
Fix included in accepted release 4.16.0-0.nightly-2024-01-21-092529 |
e2e-aws-ovn-shared-to-local-gateway-mode-migration and its opposite flake about 50% of the time with …
This is because #1721 changed CNO to re-reconcile any time any Node object changes. Nodes change a lot (specifically, their Conditions), so CNO now does a ton of unnecessary reconciling, which keeps it super busy and makes it lag behind in processing events. This PR fixes that by reconciling only when node labels change.
(I don't know whether CNO lagginess causes any other problems currently?)
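For readers unfamiliar with this kind of update filtering, here is a minimal, self-contained sketch of the idea in Go. It is not the code from this PR: the Node struct, labelsEqual, and shouldReconcileOnUpdate are hypothetical names invented for illustration, and the real Node type lives in the Kubernetes API packages. controller-runtime ships a LabelChangedPredicate that performs a similar old-versus-new label comparison, but whether this PR uses it is not stated in this conversation.

```go
package main

import "fmt"

// Node is a hypothetical, stripped-down stand-in for a Kubernetes Node.
// It is not the real API type; it only carries what the sketch needs.
type Node struct {
	Name       string
	Labels     map[string]string
	Conditions []string // stand-in for Node.Status.Conditions
}

// labelsEqual reports whether two label maps hold the same key/value pairs.
// A nil map and an empty map are treated as equal.
func labelsEqual(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if w, ok := b[k]; !ok || w != v {
			return false
		}
	}
	return true
}

// shouldReconcileOnUpdate is a hypothetical update filter: it returns true
// only when the labels differ between the old and new versions of a Node.
// Changes to Conditions (or anything else) are deliberately ignored.
func shouldReconcileOnUpdate(oldNode, newNode Node) bool {
	return !labelsEqual(oldNode.Labels, newNode.Labels)
}

func main() {
	old := Node{
		Name:       "worker-0",
		Labels:     map[string]string{"zone": "a"},
		Conditions: []string{"Ready"},
	}

	// Same labels, different conditions: the frequent, uninteresting update.
	condOnly := old
	condOnly.Conditions = []string{"Ready", "MemoryPressure"}

	// Same conditions, different labels: the update that should trigger work.
	relabeled := old
	relabeled.Labels = map[string]string{"zone": "b"}

	fmt.Println(shouldReconcileOnUpdate(old, condOnly))  // false
	fmt.Println(shouldReconcileOnUpdate(old, relabeled)) // true
}
```

Under these assumptions, the frequent condition-only updates are dropped before they reach the reconcile queue, which is the effect the description above is after.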