Support best effort FIFO #135

Merged

Conversation

denkensk (Member)

Signed-off-by: Alex Wang <wangqingcan1990@gmail.com>

What type of PR is this?

/kind feature

What this PR does / why we need it:

To make review easier and allow quick merging, I split the implementation into multiple PRs.
This PR adds UnscheduleQ to the internal ClusterQueue struct.

Which issue(s) this PR fixes:

2/4 #8

Special notes for your reviewer:

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 21, 2022
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 21, 2022
@denkensk denkensk force-pushed the add-unscheduleQ-clusterQueue branch from 980513e to e4b539d Compare March 21, 2022 09:28
@ahg-g (Contributor) left a comment

I prefer not to merge this on its own, but we can iterate over it, and then once lgtm'ed we add another commit to the same PR with the rest of the implementation. What do you think?

@alculquicondor (Contributor) left a comment

Nothing to add to @ahg-g's comments.

@denkensk denkensk changed the title Add UnscheduleQ in the internal ClusterQueue struct WIP Add UnscheduleQ in the internal ClusterQueue struct Mar 22, 2022
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 22, 2022
@denkensk denkensk force-pushed the add-unscheduleQ-clusterQueue branch from e4b539d to aac6f51 Compare March 22, 2022 10:33
@denkensk denkensk changed the title WIP Add UnscheduleQ in the internal ClusterQueue struct WIP support best effort fifo Mar 22, 2022
@denkensk denkensk force-pushed the add-unscheduleQ-clusterQueue branch from aac6f51 to 1f2edb2 Compare March 22, 2022 10:42
@denkensk (Member, Author)

I prefer not to merge this on its own, but we can iterate over it, and then once lgtm'ed we add another commit to the same PR with the rest of the implementation. What do you think?

sgtm. I added the rest of the implementation in this commit. I've tested it in a real cluster to make sure it works.
Please take a look. @ahg-g @alculquicondor

An integration test will be added in a follow-up commit in this PR.

@denkensk denkensk changed the title WIP support best effort fifo Support best effort FIFO Mar 22, 2022
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 22, 2022
@ahg-g (Contributor) left a comment

Thanks Alex, this is great as an initial implementation. I have a couple of macro comments:

  1. It would be nice if we could come up with an "interface" for QueueingStrategy that each queueing strategy implements and that the queue manager invokes agnostically. This would give us a clear framework for where a queueing strategy can affect the logic, rather than hardcoding things per strategy all over the place (a rough sketch follows this list).

  2. More immediately, we should document all events that could make a workload admissible; this helps us cross-check that we actually covered all cases (and typically calls for a test per case).
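A minimal sketch of what such a strategy interface could look like; all names here are illustrative, not kueue's actual API:

package queue

// Workload stands in for kueue's workload.Info; it is defined here only to
// keep the sketch self-contained.
type Workload struct {
    Key string
}

// QueueingStrategy is one possible shape for the suggested interface: the
// queue manager calls these methods without knowing which strategy
// (StrictFIFO, BestEffortFIFO, ...) implements them.
type QueueingStrategy interface {
    // PushOrUpdate inserts a workload, or refreshes it if already queued.
    PushOrUpdate(w *Workload)
    // Pop returns the next workload to try, or nil when the queue is empty.
    Pop() *Workload
    // Requeue re-inserts a workload after a failed scheduling attempt; the
    // strategy decides whether it competes again immediately (StrictFIFO)
    // or is parked as inadmissible (BestEffortFIFO).
    Requeue(w *Workload) bool
}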

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 24, 2022
@denkensk denkensk force-pushed the add-unscheduleQ-clusterQueue branch from 1f2edb2 to f23d697 Compare March 25, 2022 08:49
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 25, 2022
@denkensk (Member, Author)

More immediately, we should document all events that could make a workload admissible; this helps us cross-check that we actually covered all cases (and typically calls for a test per case).

Thanks @ahg-g. I copied this from https://docs.google.com/document/d/1VQ0qxWA-jwgvLq_WYG46OkXWW00O6q7b1BsR_Uv-acs/edit?usp=sharing where I documented it before.

Similar to the default-scheduler's implementation, we need to trigger the re-queue operation based on the relevant events:

  1. workloads deleted event --> move all the workloads in the same cohort back to heap.
  2. cluster queue added event --> move all the workloads in the same cohort back to heap.
  3. cluster queue updated event --> move all the workloads in the same cohort back to heap.

The first version uses a rough and simple operation. As a next step, we need to select the workloads to move based on the exact event, such as changes to RequestableResources / NamespaceSelector / Cohort.
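A minimal sketch of that first-version rule, with illustrative names only (the real manager tracks workload.Info objects in a heap):

package queue

// Manager is a trimmed stand-in for the queue manager; workload keys index
// the inadmissible set per cohort.
type Manager struct {
    inadmissible map[string][]string // cohort -> keys of parked workloads
    heap         []string            // stand-in for the scheduling heap
}

// queueInadmissibleWorkloads implements the rough first-version rule listed
// above: on any event that may free up quota (workload deleted, ClusterQueue
// added or updated), move ALL inadmissible workloads in the cohort back to
// the heap so the scheduler reconsiders them.
func (m *Manager) queueInadmissibleWorkloads(cohort string) {
    m.heap = append(m.heap, m.inadmissible[cohort]...)
    delete(m.inadmissible, cohort)
}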

@denkensk (Member, Author)

It would be nice if we could come up with an "interface" for QueueingStrategy that each queueing strategy implements and that the queue manager invokes agnostically. This would give us a clear framework for where a queueing strategy can affect the logic, rather than hardcoding things per strategy all over the place.

I created a new PR, #146, to introduce a ClusterQueue interface.
After it's merged, I will build on the base implementation to refactor BestEffortFIFO.

@ahg-g (Contributor) commented Mar 25, 2022

Similar to the default-scheduler's implementation, we need to trigger the re-queue operation based on the relevant events:

  1. workloads deleted event --> move all the workloads in the same cohort back to heap.
  2. cluster queue added event --> move all the workloads in the same cohort back to heap.
  3. cluster queue updated event --> move all the workloads in the same cohort back to heap.

Can we please document those in the code?

@denkensk denkensk force-pushed the add-unscheduleQ-clusterQueue branch from f23d697 to f828b1c Compare March 26, 2022 01:47
@denkensk (Member, Author)

I created a new PR, #146, to introduce a ClusterQueue interface.
After it's merged, I will build on the base implementation to refactor BestEffortFIFO.

Done.

Can we please document those in the code?

Documented.

@denkensk denkensk requested a review from ahg-g March 26, 2022 02:15
@ahg-g (Contributor) left a comment

First round of review. We need integration tests, but let's leave that until we converge on the logic itself.

func newClusterQueueBestEffortFIFO(cq *kueue.ClusterQueue) (ClusterQueue, error) {
    cqImpl := ClusterQueueImpl{
        heap: heapImpl{
            less: strictFIFO,
Contributor:

A bit surprising to see "strictFIFO" here; let's rename it byCreationTime.

Member Author:

Done

cqImpl := ClusterQueueImpl{
    heap: heapImpl{
        less:  strictFIFO,
        items: make(map[string]*heapItem),
Contributor:

As a follow-up, I would put the heap in its own pkg and add a NewHeap function; the current code is leaking implementation details.
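A sketch of what that follow-up might look like, with hypothetical names:

package heap

type lessFunc func(a, b interface{}) bool

type item struct {
    obj   interface{}
    index int
}

// Heap keeps both the ordered items and a key index, as in the current
// code, but behind a constructor instead of a struct literal.
type Heap struct {
    less  lessFunc
    items map[string]*item
}

// New returns an empty heap ordered by less, so callers never touch the
// internal map themselves.
func New(less lessFunc) *Heap {
    return &Heap{less: less, items: make(map[string]*item)}
}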

Comment on lines 62 to 69
item := cq.heap.items[workload.Key(w)]
info := *workload.NewInfo(w)
if item == nil {
    heap.Push(&cq.heap, info)
    return
}
item.obj = info
heap.Fix(&cq.heap, item.index)
Contributor:

Can we call ClusterQueueImpl.PushOrUpdate(w) instead of replicating this code? That was my hope with the base/derived class analogy.

Same thing with the Delete function below.

Member Author:

I considered this in the first version, but it's hard to determine whether the workload is new or re-queued.
One idea is to add a flag in the workload info that gets updated if scheduling fails.

@ahg-g (Contributor), Mar 28, 2022:

Why do we need the flag? This piece of code seems like an exact replica of the PushOrUpdate logic. To be clear, I am suggesting the following:

func (cq *ClusterQueueBestEffortFIFO) PushOrUpdate(w *kueue.QueuedWorkload) {
    if oldInfo := cq.inadmissibleWorkloads.get(w); oldInfo != nil {
        cq.inadmissibleWorkloads.delete(w)
    }
    // Delegate to the embedded base implementation.
    cq.ClusterQueueImpl.PushOrUpdate(w)
}

Member Author:

Done

}

// Delete deletes a workload from the workloads.
func (i *InadmissibleWorkloads) delete(w *kueue.QueuedWorkload) {
Contributor:

The input parameter should be the string key only, not the whole type. Same thing with the get function below.
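A minimal sketch of that signature change (the stand-in type keeps it self-contained):

package queue

// workloadInfo is a trimmed stand-in for workload.Info.
type workloadInfo struct{}

type InadmissibleWorkloads struct {
    workloads map[string]*workloadInfo
}

// delete takes only the string key, per the suggestion, rather than the
// whole QueuedWorkload object.
func (i *InadmissibleWorkloads) delete(key string) {
    delete(i.workloads, key)
}

// get mirrors the same signature change for lookups.
func (i *InadmissibleWorkloads) get(key string) *workloadInfo {
    return i.workloads[key]
}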

}

// Clear removes all the entries from the workloads.
func (i *InadmissibleWorkloads) clear() {
Contributor:

This seems to be used in the unit test only, so I would remove the function (just to offer a smaller API surface) and do i.workloads = make(map[string]*workload.Info) in the unit test itself.

cq.inadmissibleWorkloads.delete(w)
}

func (cq *ClusterQueueBestEffortFIFO) RequeueWorkload(wInfo *workload.Info) (bool, error) {
Contributor:

Why do we need a new interface? A reimplementation of PushOrUpdate should suffice.

Member Author:

Sorry, the reply should be here.
I considered this in the first version, but it's hard to determine whether the workload is new or re-queued.
One idea is to add a flag in the workload info that gets updated if scheduling fails.

Contributor:

OK, let's address the other comments first and then come back to this at the end.

Member Author:

I kept only the functions AddInadmissibleIfNotPresent and QueueInadmissibleWorkloads.

@@ -296,7 +305,34 @@ func (m *Manager) deleteWorkloadFromQueueAndClusterQueue(w *kueue.QueuedWorkload
    cq := m.clusterQueues[q.ClusterQueue]
    if cq != nil {
        cq.Delete(w)
        if w.Spec.Admission != nil && m.queueAllInadmissibleWorkloadsInCohort(cq) {
Contributor:

deleteWorkloadFromQueueAndClusterQueue is invoked from DeleteWorkload above. DeleteWorkload isn't called when the workload was admitted (because it is not supposed to be in the queue):

if wl.Spec.Admission == nil {

but perhaps we should remove that condition and just call DeleteWorkload in all cases.

Contributor:

I think we should keep that logic in the controller. We also need to requeue elements in the case when a workload finishes. I think that's not covered in this PR yet.

Member Author:

I call the function DeleteWorkload in manager.go even if the workload was admitted.

Contributor:

See #135 (comment); I think it is better not to do it here and have an explicit API that we can evolve later to handle finer grained moving of workloads.

}

queued := false
for _, c := range m.clusterQueues {
Contributor:

We should try to do better than that by having a cohort to ClusterQueues index.

Contributor:

I'm fine doing it in a follow-up. This PR is already complex enough.

Member Author:

Done.
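A hypothetical sketch of the cohort-to-ClusterQueues index suggested above, so requeueing a cohort touches only its members instead of scanning every queue (all names illustrative):

package queue

type ClusterQueue struct {
    Name   string
    Cohort string
}

// Manager keeps a reverse index from cohort name to member ClusterQueues.
type Manager struct {
    clusterQueues map[string]*ClusterQueue
    cohorts       map[string]map[string]*ClusterQueue // cohort -> members
}

// addClusterQueue registers cq and indexes it by cohort.
func (m *Manager) addClusterQueue(cq *ClusterQueue) {
    m.clusterQueues[cq.Name] = cq
    if cq.Cohort == "" {
        return
    }
    if m.cohorts[cq.Cohort] == nil {
        m.cohorts[cq.Cohort] = map[string]*ClusterQueue{}
    }
    m.cohorts[cq.Cohort][cq.Name] = cq
}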

@@ -113,3 +113,11 @@ func (h *heapImpl) Pop() interface{} {
    delete(h.items, key)
    return obj
}

// Get returns the requested item, or sets exists=false.
Contributor:

Suggested change
// Get returns the requested item, or sets exists=false.
// Get returns the requested item, or false if it doesn't exist.

Contributor:

Is this implementing an interface? Which one?

Member Author:

No. It isn't in the interface.

@@ -0,0 +1,163 @@
/*
Contributor:

The filename has a typo.

Member Author:

Done

if ok {
    log.V(2).Info("workload re-queued", "queuedWorkload", klog.KObj(w.Obj), "queue", klog.KRef(w.Obj.Namespace, w.Obj.Spec.QueueName))
} else if err != nil {
    log.Error(err, "workload re-queued", "queuedWorkload", klog.KObj(w.Obj), "queue", klog.KRef(w.Obj.Namespace, w.Obj.Spec.QueueName))
Contributor:

Suggested change
log.Error(err, "workload re-queued", "queuedWorkload", klog.KObj(w.Obj), "queue", klog.KRef(w.Obj.Namespace, w.Obj.Spec.QueueName))
log.Error(err, "Failed re-queuing workload", "queuedWorkload", klog.KObj(w.Obj), "queue", klog.KRef(w.Obj.Namespace, w.Obj.Spec.QueueName))

Member Author:

Done

Comment on lines 55 to 59
    // RequeueWorkload pushes the workload back to ClusterQueue after scheduling failure.
    RequeueWorkload(*workload.Info) (bool, error)
    // AddInadmissibleIfNotPresent inserts a workload that cannot be admitted into
    // the inadmissibleWorkloads in ClusterQueue, unless it is already in the queue.
    AddInadmissibleIfNotPresent(*workload.Info) error
Contributor:

We should only have the methods that the scheduler needs in the interface:

AddInadmissibleIfNotPresent and RequeueIfNotPresent (previously named PushIfNotPresent).

The implementation for StrictFIFO should be that AddInadmissibleIfNotPresent just calls RequeueIfNotPresent.

PushOrUpdate is what you would use to receive objects from the controllers, and that will create a workload.Info internally. The scheduler needs to push workload.Info objects, as we might want to carry information over to the next scheduling attempts.

Member Author:

I kept only the functions AddInadmissibleIfNotPresent and QueueInadmissibleWorkloads.
In the StrictFIFO implementation, AddInadmissibleIfNotPresent just calls PushIfNotPresent. Please take a look.
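A toy sketch of that StrictFIFO delegation (a map stands in for the real heap; names are illustrative):

package queue

type Workload struct{ Key string }

// ClusterQueueStrictFIFO has no inadmissible stage: a workload that cannot
// be admitted stays at the head of the queue and keeps blocking it.
type ClusterQueueStrictFIFO struct {
    heap map[string]*Workload // stand-in for the ordered heap
}

// PushIfNotPresent inserts w unless it is already queued, and reports
// whether it was inserted.
func (cq *ClusterQueueStrictFIFO) PushIfNotPresent(w *Workload) bool {
    if _, ok := cq.heap[w.Key]; ok {
        return false
    }
    cq.heap[w.Key] = w
    return true
}

// AddInadmissibleIfNotPresent simply delegates: under strict FIFO,
// "inadmissible" just means "back in the queue".
func (cq *ClusterQueueStrictFIFO) AddInadmissibleIfNotPresent(w *Workload) bool {
    return cq.PushIfNotPresent(w)
}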

@denkensk denkensk force-pushed the add-unscheduleQ-clusterQueue branch 2 times, most recently from ddc123f to 0940bb6 Compare March 31, 2022 07:38
@denkensk denkensk requested a review from ahg-g March 31, 2022 07:42
Comment on lines 487 to 471
workload1 := &kueue.QueuedWorkload{}
gomega.Eventually(func() bool {
    lookupKey := types.NamespacedName{Name: job1.Name, Namespace: job1.Namespace}
    err := k8sClient.Get(ctx, lookupKey, workload1)
    return err == nil
}, framework.Timeout, framework.Interval).Should(gomega.BeTrue())

Contributor:

We don't need this Eventually block.

@denkensk (Member, Author) commented Apr 1, 2022

@ahg-g Thanks for your review in the early morning. I've updated per all the comments. Please review it again.

@ahg-g (Contributor) left a comment

Please squash.

/lgtm
/hold

leaving approve to @alculquicondor

gomega.Eventually(func() bool {
    lookupKey := types.NamespacedName{Name: job2.Name,
        Namespace: job2.Namespace}
    return k8sClient.Get(ctx, lookupKey,
Contributor:

ditto

createdJob3 := &batchv1.Job{}
gomega.Eventually(func() bool {
    lookupKey := types.NamespacedName{Name: job3.Name, Namespace: job3.Namespace}
    return k8sClient.Get(ctx, lookupKey, createdJob3) == nil && !*createdJob3.Spec.Suspend
Contributor:

We can do the same here: return the pointer bool.
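A sketch of that "return pointer bool" pattern applied to the snippet above (job3, createdJob3, k8sClient, ctx, and framework come from the surrounding test; pointer is assumed to be k8s.io/utils/pointer):

gomega.Eventually(func() *bool {
    lookupKey := types.NamespacedName{Name: job3.Name, Namespace: job3.Namespace}
    if err := k8sClient.Get(ctx, lookupKey, createdJob3); err != nil {
        return nil
    }
    // Return the *bool itself; gomega.Equal compares via reflect.DeepEqual,
    // which follows pointers, so the failure message shows the actual value
    // instead of a combined true/false.
    return createdJob3.Spec.Suspend
}, framework.Timeout, framework.Interval).Should(gomega.Equal(pointer.Bool(false)))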

createdJob2 := &batchv1.Job{}
gomega.Consistently(func() bool {
    lookupKey := types.NamespacedName{Name: job2.Name, Namespace: job2.Namespace}
    return k8sClient.Get(ctx, lookupKey, createdJob2) == nil && *createdJob2.Spec.Suspend
Contributor:

ditto, return pointer bool

createdJob3 := &batchv1.Job{}
gomega.Eventually(func() bool {
    lookupKey := types.NamespacedName{Name: job3.Name, Namespace: job3.Namespace}
    return k8sClient.Get(ctx, lookupKey, createdJob3) == nil && !*createdJob3.Spec.Suspend
Contributor:

ditto

Name).Request(corev1.ResourceCPU, "8").Obj()
gomega.Expect(k8sClient.Create(ctx, job2)).Should(gomega.Succeed())
createdJob2 := &batchv1.Job{}
gomega.Consistently(func() bool {
Contributor:

ditto, return pointer bool


ginkgo.By("updating ClusterQueue")
devCq := &kueue.ClusterQueue{}
gomega.Eventually(func() bool {
Contributor:

Return the error.

gomega.Eventually(func() bool {
    lookupKey := types.NamespacedName{Name: devBEClusterQ.Name}
    return k8sClient.Get(ctx, lookupKey, devCq) == nil
}, framework.Timeout, framework.Interval).Should(gomega.BeTrue())
Contributor:

Suggested change
}, framework.Timeout, framework.Interval).Should(gomega.BeTrue())
}, framework.Timeout, framework.Interval).Should(gomega.Succeed())

Namespace: job1.Namespace,
}})).Should(gomega.Succeed())

gomega.Eventually(func() bool {
Contributor:

ditto, return pointer bool

job3 := testing.MakeJob("on-demand-job3", ns.Name).Queue(prodBEQueue.Name).Request(corev1.ResourceCPU, "2").Obj()
gomega.Expect(k8sClient.Create(ctx, job3)).Should(gomega.Succeed())
createdJob3 := &batchv1.Job{}
gomega.Eventually(func() bool {
Contributor:

ditto, return pointer bool

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 1, 2022
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 1, 2022
@alculquicondor (Contributor) left a comment

This implementation is looking great!

    w.Finalizers = nil
    return w
}
return !reflect.DeepEqual(strip(old), strip(new))
Contributor:

Use equality.Semantic.DeepEqual from k8s.io/apimachinery/pkg/api/equality

Member Author:

Done
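For illustration, a self-contained sketch of the suggested comparison (object is a placeholder type; the real code compares QueuedWorkload objects):

package queue

import "k8s.io/apimachinery/pkg/api/equality"

// object is a placeholder for the compared type.
type object struct {
    Spec       string
    Finalizers []string
}

// updated reports whether old and new differ after stripping fields that
// should not trigger an update, using apimachinery's semantic equality
// (which compares values such as resource quantities and times by meaning,
// unlike a plain reflect.DeepEqual).
func updated(old, new object) bool {
    strip := func(o object) object {
        o.Finalizers = nil
        return o
    }
    return !equality.Semantic.DeepEqual(strip(old), strip(new))
}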

// ClusterQueueBestEffortFIFO is the implementation for the ClusterQueue for
// BestEffortFIFO.
type ClusterQueueBestEffortFIFO struct {
    *ClusterQueueImpl
Contributor:

nit: does it need to be a pointer?

@@ -50,6 +49,13 @@ type ClusterQueue interface {
    // queue is empty.
    Pop() *workload.Info

    // AddInadmissibleIfNotPresent inserts a workload that cannot be admitted into
    // the inadmissibleWorkloads in ClusterQueue, unless it is already in the queue.
Contributor:

This is an implementation detail that doesn't apply to a strictFIFO queue.

Maybe you can say:

Inserts a workload that could not be admitted back into the ClusterQueue. The implementation might choose to keep it in a temporary placeholder stage where it doesn't compete with other workloads, until cluster events free up quota. The workload should not be reinserted if it's already in the ClusterQueue.

Member Author:

Done

    // AddInadmissibleIfNotPresent inserts a workload that cannot be admitted into
    // the inadmissibleWorkloads in ClusterQueue, unless it is already in the queue.
    AddInadmissibleIfNotPresent(*workload.Info) bool
    // QueueInadmissibleWorkloads moves all workloads from inadmissibleWorkloads to heap.
Contributor:

Again, an implementation detail. You can refer to this as a notification. Maybe the name could be:

NotifyPotentialAvailability

Member Author:

I changed the comments to be less implementation-detailed.
But maybe we should still keep the function name QueueInadmissibleWorkloads; it is easier to understand, and it's hard to find a better name.

@@ -109,6 +123,12 @@ func (m *Manager) UpdateClusterQueue(cq *kueue.ClusterQueue) error {
    }
    // TODO(#8): recreate heap based on a change of queueing policy.
    cqImpl.Update(cq)
Contributor:

What if this changes the cohort?

Member Author:

I added the part that updates the cohorts in the manager.
But I'm curious whether we should deny users from updating a ClusterQueue's cohort in the future.

@@ -400,6 +400,7 @@ func (e entryOrdering) Less(i, j int) bool {
func (s *Scheduler) requeueAndUpdate(log logr.Logger, ctx context.Context, w *workload.Info, message string) {
    added := s.queues.RequeueWorkload(ctx, w)
Contributor:

I think we might want to distinguish two scenarios of re-queueing (but it's fine to do in a follow-up).

We have a case where we requeue because the workload didn't fit. But we also have a case where we requeue because we admitted workloads in another CQ that had higher priority, even though the requeued ones might still fit in the next scheduling cycle. I don't think we should punish those workloads; they should skip the inadmissible stage.

Member Author:

Agreed.
Since we admit only one workload per cohort in each scheduling cycle, some other workloads may be rejected directly and re-enqueued. We need to distinguish these two scenarios and let those workloads skip the inadmissible stage.
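As a sketch of that follow-up, one could tag each requeue with a reason (all names hypothetical):

package queue

type Workload struct{ Key string }

// RequeueReason distinguishes why a workload is coming back to the queue.
type RequeueReason int

const (
    // ReasonDidNotFit: the workload genuinely did not fit; park it as
    // inadmissible until a cluster event frees up quota.
    ReasonDidNotFit RequeueReason = iota
    // ReasonLostRace: another workload in the cohort won this cycle; this
    // one may still fit next cycle, so it goes straight back on the heap.
    ReasonLostRace
)

type bestEffortFIFO struct {
    heap         map[string]*Workload
    inadmissible map[string]*Workload
}

// requeue routes the workload by reason, so only truly inadmissible
// workloads land in the placeholder stage.
func (q *bestEffortFIFO) requeue(w *Workload, r RequeueReason) {
    if r == ReasonLostRace {
        q.heap[w.Key] = w
        return
    }
    q.inadmissible[w.Key] = w
}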

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 2, 2022
Comment on lines 502 to 505
gomega.Eventually(func() error {
    lookupKey := types.NamespacedName{Name: devBEClusterQ.Name}
    return k8sClient.Get(ctx, lookupKey, devCq)
}, framework.Timeout, framework.Interval).Should(gomega.Succeed())
Contributor:

No need for the Eventually.

Suggested change
gomega.Eventually(func() error {
    lookupKey := types.NamespacedName{Name: devBEClusterQ.Name}
    return k8sClient.Get(ctx, lookupKey, devCq)
}, framework.Timeout, framework.Interval).Should(gomega.Succeed())
gomega.Expect(k8sClient.Get(ctx, lookupKey, devCq)).Should(gomega.Succeed())

Member Author:

Done

@denkensk (Member, Author) commented Apr 4, 2022

I will squash these after Aldo confirms. :)

@alculquicondor (Contributor) left a comment

/approve

Ready for squash

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, denkensk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 4, 2022
denkensk added 2 commits April 4, 2022 22:23
Signed-off-by: Alex Wang <wangqingcan1990@gmail.com>
Signed-off-by: Alex Wang <wangqingcan1990@gmail.com>
@denkensk denkensk force-pushed the add-unscheduleQ-clusterQueue branch from fec27b3 to fdcb6a5 Compare April 4, 2022 14:24
@denkensk (Member, Author) commented Apr 4, 2022

Squashed.
Thanks for your review @ahg-g @alculquicondor

@alculquicondor (Contributor)

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 4, 2022
@ahg-g (Contributor) commented Apr 4, 2022

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 4, 2022
@ahg-g (Contributor) commented Apr 4, 2022

/lgtm

Thanks @denkensk , this is great!

Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
kind/feature: Categorizes issue or PR as related to a new feature.
lgtm: "Looks good to me", indicates that a PR is ready to be merged.
size/XL: Denotes a PR that changes 500-999 lines, ignoring generated files.