OCPNODE-3203: Add a test suite for kubeletconfig testing #30648
|
Pipeline controller notification: This repository is configured in automatic mode. |
|
Skipping CI for Draft Pull Request. |
|
@ngopalak-redhat: This pull request references OCPNODE-3203, which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. |
|
/test all |
|
/test all |
|
Scheduling required tests: |
|
@cpmeadors Please review |
|
@ngopalak-redhat Some of the functions in your kubeletconfig_features.go file could be reused by other tests that also need to configure a KubeletConfig or wait for a MachineConfig update. Should we consider putting the common functions in a utils.go or node_utils.go? CC: @cpmeadors |
|
/lgtm |
|
/assigne @neisw |
|
/assign @dgoodwin |
|
I was hoping we would end up with a general purpose disruptive suite as this comes up for multiple teams quite often. Is there a reason this could not be morphed to be so? Long running is still fine I think as our ability to run very long jobs is now improved last I heard. Out of curiosity how long are we talking for your current tests? |
This is a step in that direction. We wanted to get it working first, then adapt for more general usage.
@ngopalak-redhat can you answer this? |
|
I'd prefer we name this what we want right from the get-go, rather than leaving yourselves a task to come back to later. Those kinds of follow-ups often don't happen. We could keep it quiet while you get it where you want, then announce more broadly? I can avoid pointing anyone to it until you give the go-ahead. |
|
To further the point, the test names would need renaming because they contain the suite, which technically bloats all the dbs and should be reflected by a rename in the component-mapping repo, which is a pain. Job renames lose history as well. |
Branch updated from 7678b6a to 3c2d92c.
|
@dgoodwin / @cpmeadors I have chosen a generalized name for the test suite: openshift/disruptive-longrunning. Currently, the test I added runs serially and takes about 5 minutes. However, the Node team plans to add more comprehensive tests that may take up to 15 minutes and require multiple node restarts. For example, the AutoSizingReserved feature (which you are aware of) requires multiple restarts to verify that the enable/disable logic functions correctly and that values are applied properly. I am also thinking of merging the user namespace test (https://github.com/openshift/origin/blob/main/test/extended/node/nested_container.go) into this suite. |
|
/approve |
|
/verified by @ngopalak-redhat |
|
@ngopalak-redhat: This PR has been marked as verified. |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: cpmeadors, dgoodwin, ngopalak-redhat |
|
@ngopalak-redhat: all tests passed! |
This PR introduces a new test suite specifically for node component testing, dubbed "Long Running Tests."
Following discussions with the MCO team on Slack, we agreed to separate specific disruptive node tests from the main MCO disruptive test suite.
Why not the Serial Suite? These tests cannot be part of the existing serial suite because they require multiple node reboots (exceeding the 3-restart limit) and have a significant runtime duration.
The goal is to establish a dedicated suite for tests that are disruptive and time-consuming but critical for release verification of the node component configuration (Kubelet).
Implementation Details
Adds the framework for long-running tests. Currently includes one test case: Changing the Kubelet Log Level.
Configs are applied to a single node rather than all three. This reduces the blast radius and significantly speeds up execution. The suite includes logic to revert changes and clean up the node state post-execution.
Future Work / Roadmap
Post-merge, I will add tests for system-compressible and auto-node-sizing.
I will work with the MCO team to configure this as a Periodic Job (similar to the MCO disruptive tests). These tests will not run on every PR but will be a Release Blocking requirement.
Sample run: https://gist.github.com/ngopalak-redhat/0c63bddf63a0a49c46c9dd2a13fad465