refine sysadvisor #86
cmd/katalyst-agent/app/options/sysadvisor/qosaware/resource/cpu/cpu_advisor.go
@@ -140,6 +158,20 @@ type RegionEntries map[string]*RegionInfo
// PodSet stores container names keyed by pod uid
type PodSet map[string]sets.String

// InternalCalculationResult conveys minimal information to cpu server for composing
// calculation result
type InternalCalculationResult struct {
If ControlKnob comes to contain ControlKnobName values other than cpuset, how should InternalCalculationResult handle them? Would we need to redefine this structure again?
Not in the short term, because the cpu plugin only supports tweaking cpuset. Even if we need to support more control knobs later, adding more data members to InternalCalculationResult would work.
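To illustrate the reply above, here is a minimal sketch of how such a struct could grow by one data member without breaking existing callers. The field layout (`PoolEntries`), the extra `PoolCPUShares` member, the pool names, and the value of `fakedNUMAID` are all assumptions made for this example, not the actual katalyst definitions; only the `GetPoolEntry(poolName, numaID)` call shape is taken from the diff in this PR.

```go
package main

import "fmt"

// fakedNUMAID is an assumed placeholder value for "no specific NUMA node".
const fakedNUMAID = -1

// InternalCalculationResult is a simplified stand-in, not the real katalyst type.
type InternalCalculationResult struct {
	// PoolEntries maps pool name -> NUMA ID -> cpuset size (assumed layout).
	PoolEntries map[string]map[int]int
	// PoolCPUShares is a hypothetical member added later for a second control
	// knob. Callers that only care about cpuset can leave it nil.
	PoolCPUShares map[string]int
}

// GetPoolEntry returns the cpuset size of a pool on a NUMA node, if present.
func (r *InternalCalculationResult) GetPoolEntry(pool string, numaID int) (int, bool) {
	entries, ok := r.PoolEntries[pool]
	if !ok {
		return 0, false
	}
	size, ok := entries[numaID]
	return size, ok
}

func main() {
	// An "old" caller that only fills in cpuset sizes keeps working unchanged.
	r := &InternalCalculationResult{
		PoolEntries: map[string]map[int]int{"reclaim": {fakedNUMAID: 4}},
	}
	size, ok := r.GetPoolEntry("reclaim", fakedNUMAID)
	fmt.Println(size, ok)

	// Reading the new, unset member is safe: a nil map yields the zero value.
	_, hasShares := r.PoolCPUShares["reclaim"]
	fmt.Println(hasShares)
}
```

Because Go zero-initializes omitted struct fields, adding a member like this leaves existing keyed composite literals and accessors valid.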
pkg/agent/sysadvisor/plugin/qosaware/resource/cpu/region/provisionpolicy/policy.go
}

// calculate headroom of non binding numas according to the corresponding reclaim pool entry
nonBindingNumasHeadroom, ok := ha.calculationResult.GetPoolEntry(state.PoolNameReclaim, cpuadvisor.FakedNUMAID)
I strongly advise against passing calculationResult this way. It would be better to store it in the meta cache and fetch it explicitly here. Passing it directly may cause concurrent reads and writes, and it invites misuse, leaving nobody sure how to maintain this info.
Changed to get the reclaim pool size from the pool info in metacache. It lags the calculation result by one period, which is negligible under the sliding window.
@@ -44,7 +40,7 @@ type QoSRegionDedicatedNumaExclusive struct {
func NewQoSRegionDedicatedNumaExclusive(ci *types.ContainerInfo, conf *config.Configuration, numaID int,
So the only difference between the dedicated region and the shared region is the initialization logic (getting the previous cpu requirement from the meta cache)? Will these two regions differ in other functions?
- they have different control knobs: reclaimed cpu supplied vs. non-reclaimed cpuset size
- they focus on different system indicators: cpi, mem latency and membw vs. pool schedwait
- the dynamic system indicator target takes effect only for the dedicated region
- other minor differences under the rama/borwein policies
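One way to picture the list above is a shared interface with per-region answers. This is a rough sketch only: the `region` interface, the type names, and the string labels are hypothetical, and the mapping of knobs and indicators to regions assumes the list follows the order of the question (dedicated first, shared second).

```go
package main

import "fmt"

// region is a hypothetical interface capturing only the differences listed above.
type region interface {
	Name() string
	ControlKnob() string
	Indicators() []string
	DynamicIndicatorTarget() bool
}

// dedicatedRegion stands in for a dedicated (NUMA-exclusive) region.
type dedicatedRegion struct{}

func (dedicatedRegion) Name() string                 { return "dedicated" }
func (dedicatedRegion) ControlKnob() string          { return "reclaimed-cpu-supplied" }
func (dedicatedRegion) Indicators() []string         { return []string{"cpi", "mem-latency", "membw"} }
func (dedicatedRegion) DynamicIndicatorTarget() bool { return true }

// sharedRegion stands in for a shared region.
type sharedRegion struct{}

func (sharedRegion) Name() string                 { return "shared" }
func (sharedRegion) ControlKnob() string          { return "non-reclaimed-cpuset-size" }
func (sharedRegion) Indicators() []string         { return []string{"pool-schedwait"} }
func (sharedRegion) DynamicIndicatorTarget() bool { return false }

func main() {
	// Callers can treat both regions uniformly and still get distinct behavior.
	for _, r := range []region{dedicatedRegion{}, sharedRegion{}} {
		fmt.Printf("%s: knob=%s indicators=%v dynamicTarget=%t\n",
			r.Name(), r.ControlKnob(), r.Indicators(), r.DynamicIndicatorTarget())
	}
}
```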
Okay, so those will be implemented along with rama in another PR, not in this one, right?
Yep. For now these regions seem similar.
bb97c52 to 84f23db
Codecov Report
Patch coverage:

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #86      +/-   ##
==========================================
+ Coverage   51.30%   52.86%   +1.55%
==========================================
  Files         318      354      +36
  Lines       32418    34529    +2111
==========================================
+ Hits        16632    18253    +1621
- Misses      13840    14146     +306
- Partials     1946     2130     +184
84f23db to 2e4fbb3
2e4fbb3 to d16c0da
* refactor(sysadvisor): refine cpu advisor to improve clarity and fix several bugs
* refactor(sysadvisor): abstract provision and headroom assembler for extensibility
* test(sysadvisor): fix cpu advisor tests and bugs
What type of PR is this?
refactor & bugfix
Features/Bug fixes/Enhancements
see issue #85
What this PR does / why we need it:
see issue #85
Which issue(s) this PR fixes:
#85
Special notes for your reviewer: