-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add hyperNode controller framework and provider #4014
base: network-topology
Are you sure you want to change the base?
Add hyperNode controller framework and provider #4014
Conversation
e3671f8
to
5c3c66d
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
77df7d9
to
2896479
Compare
bb42d94
to
df2089d
Compare
faa2068
to
d2fd538
Compare
Signed-off-by: Monokaix <changxuzheng@huawei.com>
d2fd538
to
59d50f6
Compare
please import copilot :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Overview
This PR introduces the hyperNode controller framework and provider to support the auto-discovery of network topology for hyperNodes, enabling vendors to seamlessly integrate their plugins with the Volcano controller.
- Added a new hyperNode controller implementation with basic lifecycle methods.
- Introduced a plugin-based hyperNode provider with dynamic loading and event handling.
- Updated related tests, installer YAMLs, and command-line options to integrate the new framework.
Reviewed Changes
File | Description |
---|---|
example/hypernode-provider/README.md | Added documentation for writing and building a hyperNode provider |
pkg/controllers/hypernode/hypernode_controller.go | Introduced the core hyperNode controller functions |
pkg/controllers/hypernode/hypernode_controller_test.go | Added tests for the hyperNode controller run logic |
example/hypernode-provider/example_provider.go | Provided an example implementation of a hyperNode provider |
pkg/controllers/hypernode/provider/provider_test.go | Added unit tests covering add, update, and delete event scenarios |
pkg/controllers/hypernode/provider/interface.go | Defined the provider plugin interface |
pkg/controllers/hypernode/provider/provider.go | Implemented the provider loading, event handling, and retry logic |
installer/volcano-development.yaml and installer/helm/chart/volcano/templates/controllers.yaml | Updated RBAC roles to include permissions for hyperNodes |
pkg/controllers/framework/interface.go, cmd/controller-manager/app/options/options.go, cmd/controller-manager/main.go, cmd/controller-manager/app/server.go | Updated server options and integration to support the new provider |
Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
pkg/controllers/hypernode/provider/provider.go:220
- [nitpick] Function name 'handleNodeDeleted' is inconsistent with the naming of 'handleHyperNodeAdd' and 'handleHyperNodeUpdate'. Consider renaming it to 'handleHyperNodeDelete' for consistency.
func (p *provider) handleNodeDeleted(event Event) {
err := retry.OnError( | ||
backoff, | ||
func(err error) bool { | ||
return true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In handleHyperNodeUpdate, the unconditional retry condition may lead to infinite retries on non-transient errors. Consider refining the retry condition to check for specific error types.
return true | |
return !apierrors.IsConflict(err) && !apierrors.IsInvalid(err) |
Copilot is powered by AI, so mistakes are possible. Review output carefully before use.
} | ||
|
||
func (p *provider) loadProvider(dir string) error { | ||
pluginPaths, _ := filepath.Glob(fmt.Sprintf("%s/*.so", dir)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error returned by filepath.Glob is ignored. Handling this error could prevent issues during plugin loading if globbing fails.
pluginPaths, _ := filepath.Glob(fmt.Sprintf("%s/*.so", dir)) | |
pluginPaths, err := filepath.Glob(fmt.Sprintf("%s/*.so", dir)) | |
if err != nil { | |
return fmt.Errorf("failed to glob plugins: %v", err) | |
} |
Copilot is powered by AI, so mistakes are possible. Review output carefully before use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code quality is high as always, with only a few minor issues
|
||
# Install musl | ||
RUN apt-get update && \ | ||
apt-get install -y sudo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is sudo necessary? Is the default user in the container root?
RUN apt-get update && \ | ||
apt-get install -y sudo | ||
|
||
RUN wget http://musl.libc.org/releases/musl-1.2.1.tar.gz && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RUN wget http://musl.libc.org/releases/musl-1.2.1.tar.gz && \ | |
RUN wget http://musl.libc.org/releases/musl-latest.tar.gz && \ |
} | ||
|
||
func (p *provider) loadProvider(dir string) error { | ||
pluginPaths, _ := filepath.Glob(fmt.Sprintf("%s/*.so", dir)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why ignore the error?
func (p *provider) loadProvider(dir string) error { | ||
pluginPaths, _ := filepath.Glob(fmt.Sprintf("%s/*.so", dir)) | ||
for _, pluginPath := range pluginPaths { | ||
klog.InfoS("Loading provider plugin...", "", "path", pluginPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Infos
is used in some places, will this lead to too many logs by default?
return err | ||
} | ||
plugin := pb() | ||
//pluginName := getPluginName(pluginPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete this if not used
What type of PR is this?
/kind feature
/area controllers
What this PR does / why we need it:
Add hyperNode controller provider and vendors can register plugins to add/update/delete hyperNodes by auto discovering network topology of their own datacenters.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?