the states in PodGroup is not accurate #578

Closed
zwpaper opened this issue Apr 15, 2023 · 20 comments

Labels
kind/bug Categorizes issue or PR as related to a bug.
@zwpaper
Member

zwpaper commented Apr 15, 2023

Area

  • Scheduler
  • Controller
  • Helm Chart
  • Documents

Other components

No response

What happened?

After migrating to controller-runtime, the podGroup.status.scheduled count is no longer updated by PostBind, and the phase transitions do not seem to work as expected.

What did you expect to happen?

The PodGroup should reflect pod state changes and update PodGroup.status.phase accordingly.

However, it is currently inaccurate and sometimes wrong; we need to discuss the expected state transitions and make them always correct.

Let's discuss the state flow before working on it.

Currently, we have the following states:

  • Pending: the pod group has been accepted by the system
  • Running: minMember pods of the pod group are in the running phase
  • PreScheduling: all pods of the pod group have been enqueued and are waiting to be scheduled
  • Scheduling: some pods of the pod group have been scheduled and are running, but minMember is not yet met
  • Scheduled: minMember pods have been scheduled and are in the running phase. @Huang-Wei, is this right? It seems to duplicate Running
  • Unknown: some pods have been scheduled and others have not
  • Finished: minMember pods have finished successfully
  • Failed: at least one pod has failed

// PodGroupPending means the pod group has been accepted by the system, but scheduler can not allocate
// enough resources to it.
PodGroupPending PodGroupPhase = "Pending"
// PodGroupRunning means the `spec.minMember` pods of the pod group are in running phase.
PodGroupRunning PodGroupPhase = "Running"
// PodGroupPreScheduling means all pods of the pod group have enqueued and are waiting to be scheduled.
PodGroupPreScheduling PodGroupPhase = "PreScheduling"
// PodGroupScheduling means partial pods of the pod group have been scheduled and are in running phase
// but the number of running pods has not reached the `spec.minMember` pods of PodGroups.
PodGroupScheduling PodGroupPhase = "Scheduling"
// PodGroupScheduled means the `spec.minMember` pods of the pod group have been scheduled and are in running phase.
PodGroupScheduled PodGroupPhase = "Scheduled"
// PodGroupUnknown means a part of `spec.minMember` pods of the pod group have been scheduled but the others can not
// be scheduled due to, e.g. not enough resource; scheduler will wait for related controllers to recover them.
PodGroupUnknown PodGroupPhase = "Unknown"
// PodGroupFinished means the `spec.minMember` pods of the pod group are successfully finished.
PodGroupFinished PodGroupPhase = "Finished"
// PodGroupFailed means at least one of `spec.minMember` pods have failed.
PodGroupFailed PodGroupPhase = "Failed"

Please note: the following diagram only reflects my understanding of the phases defined in the code; it may contain misunderstandings and is open for discussion.

stateDiagram-v2
	state if_minMember <<choice>>
    [*] --> Pending
    Pending --> PreScheduling: pods added
    PreScheduling --> Scheduling: some of the pods scheduled
    Scheduling --> Scheduled: minMember pods scheduled, but not running
    Scheduled --> Running: minMember pods scheduled and running
    Running --> Failed: at least one of the pods failed
    Failed --> if_minMember: failed fixed
    if_minMember --> Scheduling: minMember does not meet
    if_minMember --> Scheduled: minMember meet
    Running --> Finished: all pods successfully finished
    Finished --> [*]

How can we reproduce it (as minimally and precisely as possible)?

  1. Create a podGroup with minMember 3
  2. Create 3 pods in the podGroup
  3. Change one of the pods to make it unschedulable
  4. Observe that the phase does not transition as expected and the scheduled count is not correct

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
1.25.7

Scheduler Plugins version

0.25.7
@zwpaper added the kind/bug label Apr 15, 2023
@Gekko0114
Member

Thank you for the summary. It's very clear. Here are my opinions:

  1. Since Running and Scheduled are redundant, we can remove Scheduled.
  2. Let's define the following statuses:

Pending: the pod group has been accepted by the system
Running: minMember pods of the pod group are in the running phase
PreScheduling: all pods of the pod group have been enqueued and are waiting to be scheduled
Scheduling: some pods of the pod group have been scheduled and are running, but minMember is not yet met
Unknown: some pods have been scheduled and others have not
Finished: minMember pods have finished successfully
Failed: at least one pod has failed

  3. With the above definitions, PodGroupStatus.Scheduled becomes unnecessary and can be removed.

@zwpaper
Member Author

zwpaper commented Apr 16, 2023

  1. Since Running and Scheduled are redundant, we can remove Scheduled.

Scheduled could mean scheduled but not yet running? That may have been the original design intention.

With the above definitions, PodGroupStatus.Scheduled becomes unnecessary and can be removed

If we keep the Scheduled phase, then status.scheduled would also need to be kept.

The main question may be whether there is a phase where pods are scheduled but not yet running, and whether we should expose that phase to users.

For example, pods that are scheduled but stuck in ContainerCreating or some other status.
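
For illustration, here is a minimal sketch (a hypothetical helper, not existing project code) of how "scheduled but not running" pods could be counted from pod status:

package sketch

import v1 "k8s.io/api/core/v1"

// countScheduledNotRunning is a hypothetical helper: it counts pods that have
// been assigned to a node (i.e. scheduled) but have not reached the Running
// phase yet, e.g. pods stuck in ContainerCreating.
func countScheduledNotRunning(pods []v1.Pod) int {
	n := 0
	for _, p := range pods {
		if p.Spec.NodeName != "" && p.Status.Phase != v1.PodRunning {
			n++
		}
	}
	return n
}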

@Gekko0114
Member

Hi @Huang-Wei,
I would like to hear your suggestions regarding this issue. Would it be possible for you to comment at your convenience?

@Huang-Wei
Contributor

I may delegate the review work to @denkensk as he was the original author. My 2 cents about the status revamp work:

  • if some status is transient, there is no point keeping it
  • if some status is nuanced to calculate while exposing only trivial information to the end user, we should eliminate it as well.

/assign @denkensk as primary reviewer.

@zwpaper
Member Author

zwpaper commented May 2, 2023

/assign @denkensk

Please take a look when you are free. The main points we were discussing are:

  1. Is the status flow we figured out from the code correct?
  2. Is it still suitable after we migrated to controller-runtime?
  3. If we need to change it, what is your idea?

@Gekko0114
Member

Hi @denkensk,
I would like to hear your suggestions regarding this issue. Could you comment at your convenience?

@denkensk
Member

After migrating to controller-runtime, the podGroup.status.scheduled count is no longer updated by PostBind, and the phase transitions do not seem to work as expected.

Can you describe why it doesn't work as expected in detail to help me understand this issue? @Gekko0114

@denkensk
Member

With the above definitions, PodGroupStatus.Scheduled becomes unnecessary and can be removed.

+1

Also, can we discuss whether the following two statuses can be removed as well?

  • Scheduling
  • PreScheduling

@Gekko0114
Member

Gekko0114 commented May 11, 2023

Can you describe why it doesn't work as expected in detail to help me understand this issue?

@denkensk
Sure.
The transition of pg.Status.Phase to Scheduled is driven by checking the pg.Status.Scheduled count, as in this code.

However, pg.Status.Scheduled was only updated in the PostBind logic.

That's why this problem occurred.
I've provided more detail here.
If you have any questions, please ask me.
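
To make the coupling concrete, here is a simplified, self-contained sketch; the type and field names only loosely mirror the real API and are illustrative, not the project's actual code:

package sketch

// PodGroupPhase mirrors the phase string type used by the PodGroup API.
type PodGroupPhase string

const (
	PodGroupPending   PodGroupPhase = "Pending"
	PodGroupScheduled PodGroupPhase = "Scheduled"
)

// PodGroup is a stripped-down stand-in for the real object.
type PodGroup struct {
	MinMember int32         // spec.minMember
	Scheduled int32         // status.scheduled, only bumped in PostBind
	Phase     PodGroupPhase // status.phase
}

// Controller side: the phase is only promoted once status.scheduled reaches
// spec.minMember.
func nextPhase(pg PodGroup) PodGroupPhase {
	if pg.Scheduled >= pg.MinMember {
		return PodGroupScheduled
	}
	return pg.Phase
}

// Scheduler side: status.scheduled was only incremented in PostBind, so if that
// update is lost (as after the controller-runtime migration), nextPhase never
// promotes the group.
func onPostBind(pg *PodGroup) {
	pg.Scheduled++
}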

@Gekko0114
Member

Also, can we discuss whether the following two statuses can be removed as well?

Scheduling
PreScheduling

I agree with the idea of removing the Scheduling phase.
In the current implementation, pg.Status.Phase would never transition to the Scheduling phase, so it can be removed.

However, PreScheduling is used when the number of pods is bigger than minMember, so it might be useful to keep it.

pgCopy.Status.Phase = schedv1alpha1.PodGroupPreScheduling
fillOccupiedObj(pgCopy, &pods[0])

@denkensk , @zwpaper
What do you think?

@zwpaper
Member Author

zwpaper commented May 15, 2023

Hi @Gekko0114, removing the Scheduling phase sounds good to me, but it would seem odd to keep PreScheduling without having a Scheduling phase.

So how about we keep the Scheduling name but let it work like PreScheduling?

In conclusion, the status flow would look like this:

stateDiagram-v2
	state if_minMember <<choice>>
    [*] --> Pending
    Pending --> Scheduling: pods added
    Scheduling --> Running: minMember pods scheduled
    Running --> Failed: at least one of the pods failed
    Failed --> if_minMember: failed fixed
    if_minMember --> Scheduling: minMember does not meet
    if_minMember --> Running: minMember meet
    Running --> Finished: all pods successfully finished
    Finished --> [*]

@denkensk
Member

In the current implementation, pg.Status.Phase would never transition to the Scheduling phase, so it can be removed.

Thanks for your explanation @Gekko0114

@denkensk
Member

OK. With @zwpaper's suggestion, we will have 6 phases. It looks good to me, and it will be clean. WDYT? @Gekko0114 @Huang-Wei

  • Pending: pod group is created and accepted by the system
  • Scheduling: the number of pods is bigger than minMember
  • Running
  • Failed
  • Finished
  • Unknown
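
For reference, a hypothetical sketch of what the reduced phase set could look like as constants; the comments paraphrase the definitions discussed in this thread and are not the final implementation:

package sketch

// PodGroupPhase is the phase string type of the PodGroup status.
type PodGroupPhase string

const (
	// PodGroupPending means the pod group has been created and accepted by the system.
	PodGroupPending PodGroupPhase = "Pending"
	// PodGroupScheduling means pods have been created for the group but fewer than
	// minMember of them are running yet (per the state flow above).
	PodGroupScheduling PodGroupPhase = "Scheduling"
	// PodGroupRunning means at least minMember pods of the group are running.
	PodGroupRunning PodGroupPhase = "Running"
	// PodGroupFailed means at least one pod of the group has failed.
	PodGroupFailed PodGroupPhase = "Failed"
	// PodGroupFinished means the pods of the group have finished successfully.
	PodGroupFinished PodGroupPhase = "Finished"
	// PodGroupUnknown means the state of the pod group cannot be determined.
	PodGroupUnknown PodGroupPhase = "Unknown"
)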

@Gekko0114
Member

@zwpaper, @denkensk
Sure, I agree with you. Thanks for clarifying the discussion. I will implement it.

@Gekko0114
Member

Updated the PR: #574

@Gekko0114
Member

/close

@k8s-ci-robot
Contributor

@Gekko0114: You can't close an active issue/PR unless you authored it or you are a collaborator.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Gekko0114
Member

We can close it since I completed the PR.

@zwpaper
Member Author

zwpaper commented Jul 1, 2023

thanks @Gekko0114

/close

@k8s-ci-robot
Contributor

@zwpaper: Closing this issue.

In response to this:

thanks @Gekko0114

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
