wrap NTO controller with controller runtime lib #316
@jmencak @dagrayvid |
As expected this PR seems to perform about the same as the current implementation:
|
// collection based locking scheme and we cannot continue until the ConfigMap
// is GC or deleted manually. If the owner references do not exist, just go
// ahead with the new client-go leader-election code.
loop:
@jmencak If the move to the controller runtime wrapper does not handle this GC, will we still need to keep this cleanup for the legacy NTO lock?
What is the upgrade path for NTO? Can a cluster skip major versions and miss this cleanup?
So this cleanup code was added in 4.8. In my opinion we no longer need to keep it, since any of the 4.8, 4.9, 4.10 versions will take care of this cleanup. In other words, we don't support direct upgrades from, say, 4.7->4.11.
Thank you for the PR, @yanirq! I had a quick look at this and it looks mostly good to me. Only nits so far, but I will take another look. Also, @dagrayvid, can you take a look?
The changes look good to me after my first look through. @yanirq @jmencak, is the long-term plan to maintain the NTO controller as a client-go based controller like this? Or do we intend to eventually rewrite it using controller-runtime, assuming we can find a way to do so without a significant performance regression?
The main goal here is to have NTO wrapped by controller runtime in order for us to add PAO, which is already written completely with controller-runtime, under this repository (see openshift/enhancements#867). With the changes introduced in this PR, the next step would be to add PAO controller with controller runtime manager:
|
So now that we have two options, I wonder how they'd compare performance-wise with PAO already merged in as a controller. Have you tried that, @yanirq? What's the expectation here? At the moment, this PR is the "winner", but will it still be a "winner" when running idle with PAO merged in?
The approach of adding PAO would be similar in both PRs. We will add it as a separate controller (with its reconciler) under the manager.
I had the same thought. I can probably test this using #314.
@cynepco3hahue should have a more updated PR for that (at least a WIP one).
/retest
I will wait for an update to PR #314 before testing. Although I don't expect any surprises, I would like to do this test before merging this PR.
This is a refactor to the main invocation of cluster node tuning controller. The controller and the metrics server are wrapped and started by the controller runtime library.
a93d87b to f906229
/retest
@yanirq: all tests passed!
Thank you for the changes, @yanirq. This looks good to me now. Can we squash the commits? @dagrayvid, unless you have objections from your side, I think this is ready to be merged.
@jmencak commits are already squashed
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: jmencak, yanirq
/unhold
Refactor cluster node tuning operator to use controller runtime library (release 0.11).
The functionality is internal only and replaces the direct application of a controller with the controller runtime scheme.
This is a refactor to the main invocation of cluster node tuning controller.
The controller and the metrics server are wrapped and started by the controller runtime library.
This is also preliminary work that sets the stage for moving the Performance Addon Operator under NTO, as documented here: openshift/enhancements#867