-
Notifications
You must be signed in to change notification settings - Fork 299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make DominantResourceShare Compatible with Cohorts #4097
Make DominantResourceShare Compatible with Cohorts #4097
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
50fec32
to
daf9f5f
Compare
daf9f5f
to
f8cdd8e
Compare
/assign @PBundyra |
"sigs.k8s.io/kueue/pkg/resources" | ||
) | ||
|
||
type dominantResourceShareNode interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add a description of the interface and its methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -646,84 +638,8 @@ func workloadBelongsToLocalQueue(wl *kueue.Workload, q *kueue.LocalQueue) bool { | |||
return wl.Namespace == q.Namespace && wl.Spec.QueueName == q.Name | |||
} | |||
|
|||
// The methods below implement several interfaces. See | |||
// dominantResourceShareNode, resourceGroupNode, and netQuotaNode. | |||
// Implements dominantResourceShareNode interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we assert this type implements the interface?
https://github.com/kubernetes-sigs/kueue/blob/main/pkg/controller/tas/topology_ungater.go#L77-L78
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a clear advantage to this pattern? Even without this check, the code will fail to compile if we forget one of the methods, since we use clusterQueue
as a dominantResourceShareNode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I had the impression that the code wouldn't fail during compilation, but rather during runtime. Thanks for clarifying
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think indeed it does not matter for compilation, but can you quickly test if adding these improves integration with vs-code?
IIRC adding these was quickly letting me know which functions are missing (not implemented), but don't recall details, so it might be not accurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked: there is no difference in the error message when adding these, versus when trying to use the type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return c.Name | ||
} | ||
|
||
// Implements dominantResourceShareNode interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
pkg/cache/clusterqueue_snapshot.go
Outdated
// DominantResourceShare returns a value from 0 to 1,000,000 representing the maximum of the ratios | ||
// of usage above nominal quota to the lendable resources in the cohort, among all the resources | ||
// provided by the ClusterQueue, and divided by the weight. | ||
// If zero, it means that the usage of the ClusterQueue is below the nominal quota. | ||
// The function also returns the resource name that yielded this value. | ||
// Also for a weight of zero, this will return 9223372036854775807. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this description be moved to dominantResourceShare
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
/retest |
func (c *ClusterQueueSnapshot) usageFor(fr resources.FlavorResource) int64 { | ||
return c.ResourceNode.Usage[fr] | ||
} | ||
// The methods below implement hierarchicalResourceNode interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
Since this is a clean up PR would you mind adding could you add description to |
pkg/cache/resource_node_test.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we please follow the standard test pattern here, where we create separate type for test scenarios, and then run them in loop, so it's easily extendable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add this later, if we decide to extend the test. I don't think that it makes sense to make it extensible now, when there may only ever be one test (YAGNI)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, do you anticipate we won't need more tests despite implementing FairSharing for Hierarchical Cohorts?
f8cdd8e
to
62edcbc
Compare
62edcbc
to
91c2f48
Compare
// provided by the ClusterQueue, and divided by the weight. | ||
// If zero, it means that the usage of the ClusterQueue is below the nominal quota. | ||
// The function also returns the resource name that yielded this value. | ||
// Also for a weight of zero, this will return 9223372036854775807. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Also for a weight of zero, this will return 9223372036854775807. | |
// Also for a weight of zero, this will return maxInt. |
/lgtm |
LGTM label has been added. Git tree hash: 9ffdbe429220e20fb0c5abad717939a3936a5716
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: gabesaba, mimowo The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Prepare for #3759 by making DominantResourceShare compatible with Cohorts. This is done by refactoring dominantResourceShare to use the hierarchicalResourceNode interface, which Cohort and CohortSnapshot implement.
Additionally, we do some simplifications, deletions, moves:
Special notes for your reviewer:
Please review commits independently for a more readable diff. The 2nd commit (should be) a pure move
Does this PR introduce a user-facing change?