Add possible expansion for more aggressive preemption
Change-Id: Ia59b99e062be825210a90dfdca0fa568447ce178
alculquicondor committed Nov 24, 2022
1 parent fc02b53 commit a70e7c0
Showing 1 changed file with 35 additions and 11 deletions.
46 changes: 35 additions & 11 deletions keps/83-workload-preemption/README.md
@@ -211,28 +211,31 @@ type ClusterQueueSpec struct {
 type PreemptionPolicy string
 const (
-	PreemptionPolicyNever = "Never"
-	PreemptionPolicyLowerPriorityOnly = "LowerPriorityOnly"
-	PreemptionPolicyAlways = "Always"
+	PreemptionPolicyNever = "Never"
+	PreemptionPolicyReclaimFromLowerPriority = "ReclaimFromLowerPriority"
+	PreemptionPolicyReclaimFromAny = "ReclaimFromAny"
+	PreemptionPolicyLowerPriority = "LowerPriority"
 )
 type ClusterQueuePreemption struct {
-	// withinCohort determines whether a pending Workload that fits
-	// in the min quota for its ClusterQueue can preempt Workloads from other
-	// ClusterQueues in the cohort that are using more than their min quota.
+	// withinCohort determines whether a pending Workload can preempt Workloads
+	// from other ClusterQueues in the cohort that are using more than their min
+	// quota.
 	// Possible values are:
 	// - `Never` (default): do not preempt workloads in the cohort.
-	// - `LowerPriorityOnly`: only preempt workloads in the cohort that have lower
-	//   priority than the pending Workload.
-	// - `Always`: preempt any workload in the cohort.
+	// - `ReclaimFromLowerPriority`: if the pending workload fits within the min
+	//   quota of its ClusterQueue, only preempt workloads in the cohort that have
+	//   lower priority than the pending Workload.
+	// - `ReclaimFromAny`: if the pending workload fits within the min quota of its
+	//   ClusterQueue, preempt any workload in the cohort.
 	WithinCohort PreemptionPolicy

 	// withinClusterQueue determines whether a pending workload that doesn't fit
 	// within the min quota for its ClusterQueue, can preempt active Workloads in
 	// the ClusterQueue.
 	// Possible values are:
 	// - `Never` (default): do not preempt workloads in the ClusterQueue.
-	// - `LowerPriorityOnly`: only preempt workloads in the ClusterQueue that have
+	// - `LowerPriority`: only preempt workloads in the ClusterQueue that have
 	//   lower priority than the pending Workload.
 	WithinClusterQueue PreemptionPolicy
 }
@@ -338,7 +341,7 @@ The algorithm goes like follows:

 1. For preemption within cohort, we restrict the list to Workloads with lower
    priority than the pending Workload if
-   `.preemption.withinCohort=LowerPriorityOnly`
+   `.preemption.withinCohort=ReclaimFromLowerPriority`
 2. For preemption within ClusterQueue, we only select Workloads with lower
    priority than the pending Workload.

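Editor's note: the two candidate-selection rules in this hunk could be sketched roughly as follows. This is a minimal illustration, not the KEP's implementation. The `Workload` struct, its `CQOverMinQuota` flag, the constant names, and `filterCandidates` are hypothetical. The sketch also assumes the caller has already checked that the pending Workload fits within the min quota of its ClusterQueue before applying a `ReclaimFrom*` policy.

```go
package main

import "fmt"

// PreemptionPolicy mirrors the string-valued policy type from the KEP.
type PreemptionPolicy string

// Policy values as proposed in this commit; the constant names are illustrative.
const (
	PreemptionPolicyNever                    PreemptionPolicy = "Never"
	PreemptionPolicyReclaimFromLowerPriority PreemptionPolicy = "ReclaimFromLowerPriority"
	PreemptionPolicyReclaimFromAny           PreemptionPolicy = "ReclaimFromAny"
	PreemptionPolicyLowerPriority            PreemptionPolicy = "LowerPriority"
)

// Workload is a hypothetical, simplified stand-in for an admitted Workload.
type Workload struct {
	Name         string
	Priority     int32
	ClusterQueue string
	// CQOverMinQuota is an assumed flag: true when the Workload's
	// ClusterQueue is using more than its min quota.
	CQOverMinQuota bool
}

// filterCandidates returns the admitted Workloads that may be preempted
// for the pending Workload under the given policies.
func filterCandidates(pending Workload, admitted []Workload, withinCohort, withinCQ PreemptionPolicy) []Workload {
	var out []Workload
	for _, w := range admitted {
		lower := w.Priority < pending.Priority
		if w.ClusterQueue == pending.ClusterQueue {
			// Within the ClusterQueue, only lower priority Workloads are selected.
			if withinCQ == PreemptionPolicyLowerPriority && lower {
				out = append(out, w)
			}
			continue
		}
		// Within the cohort, only ClusterQueues above their min quota are targets.
		if !w.CQOverMinQuota {
			continue
		}
		switch withinCohort {
		case PreemptionPolicyReclaimFromLowerPriority:
			if lower {
				out = append(out, w)
			}
		case PreemptionPolicyReclaimFromAny:
			out = append(out, w)
		}
	}
	return out
}

func main() {
	pending := Workload{Name: "pending", Priority: 10, ClusterQueue: "cq-a"}
	admitted := []Workload{
		{Name: "same-low", Priority: 5, ClusterQueue: "cq-a"},
		{Name: "other-low", Priority: 5, ClusterQueue: "cq-b", CQOverMinQuota: true},
		{Name: "other-high", Priority: 20, ClusterQueue: "cq-b", CQOverMinQuota: true},
	}
	for _, w := range filterCandidates(pending, admitted,
		PreemptionPolicyReclaimFromLowerPriority, PreemptionPolicyLowerPriority) {
		fmt.Println(w.Name)
	}
}
```

With `ReclaimFromAny`, the priority check for cohort candidates is skipped, so `other-high` would also be returned in the example above.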
@@ -485,6 +488,27 @@ preemption, but they were left out of this KEP for lack of strong use cases.

 We might add them back in the future, based on feedback.

+
+### Allow high priority jobs to borrow quota while preempting
+
+The proposed policies for preemption within cohort require that the Workload
+fits within the min quota of the ClusterQueue. In other words, we don't try to
+borrow quota when preempting.
+
+It might be desirable for higher priority workloads to preempt lower priority
+workloads that are borrowing resources, even if it makes the ClusterQueue
+borrow resources. This could be added as `.preemption.withinCohort=LowerPriority`.
+
+The implementation could be like the following:
+
+For each ClusterQueue, we consider the usage as the maximum of the min quota and
+the actual used quota. Then, we select flavors for the pending workload based on
+this simulated usage and run the preemption algorithm.
+
+**Reasons for discarding/deferring**
+
+It's unclear whether this behavior is useful and it adds complexity.
+
 ### Inform how costly is to interrupt a Workload

 A workload might have a known cost of interruption that varies over time.
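
Editor's note: the simulated usage described under "Allow high priority jobs to borrow quota while preempting" (the maximum of the min quota and the actual used quota, per ClusterQueue) could be computed along these lines. This is only a sketch under simplifying assumptions: a single resource, integer quantities, and a hypothetical `ClusterQueueUsage` struct. Flavor selection and the preemption algorithm itself are not shown.

```go
package main

import "fmt"

// ClusterQueueUsage is a hypothetical, single-resource view of a ClusterQueue.
type ClusterQueueUsage struct {
	Name     string
	MinQuota int64 // min quota configured for the ClusterQueue
	Used     int64 // quota currently used by admitted Workloads
}

// simulatedUsage returns the maximum of the min quota and the actual usage,
// so a ClusterQueue below its min quota is treated as if it were using all of it.
func simulatedUsage(cq ClusterQueueUsage) int64 {
	if cq.Used > cq.MinQuota {
		return cq.Used
	}
	return cq.MinQuota
}

// simulatedUsageByQueue applies simulatedUsage to every ClusterQueue in a cohort.
func simulatedUsageByQueue(cohort []ClusterQueueUsage) map[string]int64 {
	out := make(map[string]int64, len(cohort))
	for _, cq := range cohort {
		out[cq.Name] = simulatedUsage(cq)
	}
	return out
}

func main() {
	cohort := []ClusterQueueUsage{
		{Name: "cq-a", MinQuota: 10, Used: 4},  // under its min quota
		{Name: "cq-b", MinQuota: 10, Used: 16}, // borrowing 6 from the cohort
	}
	usage := simulatedUsageByQueue(cohort)
	for _, cq := range cohort {
		fmt.Printf("%s: simulated usage %d\n", cq.Name, usage[cq.Name])
	}
}
```

Under this view, only the quota that a ClusterQueue uses beyond its min quota (the 6 units borrowed by `cq-b` in the example) differs from the fully reserved baseline, which matches the text's focus on workloads that are borrowing resources.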