This repository has been archived by the owner on Jul 7, 2019. It is now read-only.

Ignore nodes if out of sync. #26

Merged
merged 2 commits into from
Jun 11, 2019


@k82cn k82cn commented Jun 10, 2019

Signed-off-by: Da K. Ma klaus1982.cn@gmail.com

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes kubernetes-retired#861

When the kubelet is restarted, the device plugin may not report its resources in time, so volcano-sh/scheduler needs to ignore such out-of-sync nodes to avoid a panic.

The default scheduler does not have this issue because it does not track resource usage in its node info.
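
To illustrate the idea, here is a minimal, self-contained sketch. It is not the code from this PR: the `Resource` and `NodeInfo` types, their fields, and the `lessEqual` helper are hypothetical stand-ins, and the rule that a node is out of sync when its used resources exceed what it currently reports is an assumption made for the example. Only the `Ready()` check and the skip inside the snapshot loop mirror the change.

```go
package main

import "fmt"

// Resource is a hypothetical, simplified resource vector.
type Resource struct {
	MilliCPU float64
	GPU      float64
}

// lessEqual reports whether every dimension of r fits within other.
func (r Resource) lessEqual(other Resource) bool {
	return r.MilliCPU <= other.MilliCPU && r.GPU <= other.GPU
}

// NodeInfo is a hypothetical stand-in for the scheduler's cached node state.
type NodeInfo struct {
	Name        string
	Allocatable Resource // what the kubelet currently reports
	Used        Resource // requests of pods already bound to the node
}

// Ready treats a node as in sync when its used resources fit within
// its reported allocatable resources (an assumption for this sketch).
func (n *NodeInfo) Ready() bool {
	return n.Used.lessEqual(n.Allocatable)
}

// Snapshot copies only in-sync nodes, so later scheduling arithmetic
// never sees a node whose usage exceeds its reported capacity.
func Snapshot(nodes map[string]*NodeInfo) map[string]*NodeInfo {
	snapshot := make(map[string]*NodeInfo, len(nodes))
	for name, node := range nodes {
		if !node.Ready() {
			continue // out of sync: ignore for this scheduling cycle
		}
		clone := *node
		snapshot[name] = &clone
	}
	return snapshot
}

func main() {
	nodes := map[string]*NodeInfo{
		"node-a": {
			Name:        "node-a",
			Allocatable: Resource{MilliCPU: 4000, GPU: 2},
			Used:        Resource{MilliCPU: 1000, GPU: 1},
		},
		// node-b: kubelet restarted, GPUs not re-reported yet.
		"node-b": {
			Name:        "node-b",
			Allocatable: Resource{MilliCPU: 4000, GPU: 0},
			Used:        Resource{MilliCPU: 1000, GPU: 1},
		},
	}
	for name := range Snapshot(nodes) {
		fmt.Println("schedulable:", name)
	}
}
```

In this sketch the skipped node stays in the cache, so it would be picked up again by a later snapshot once its reported resources catch up.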

Release note:

None

Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>

k82cn commented Jun 10, 2019

/cc @kinglion811 , @hex108 , @Jeffwan

@@ -549,6 +549,10 @@ func (sc *SchedulerCache) Snapshot() *kbapi.ClusterInfo {
	}

	for _, value := range sc.Nodes {
		if !value.Ready() {

Do we always keep tracking unready nodes? I wonder whether the scheduler should delete a node once it has been unready for longer than a timeout.

Author


Good point! We should delete a node if there is no pod on it when the timeout expires :) I'm going to open a separate PR for that.
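
A rough sketch of what such a cleanup could look like is below. It is not part of this PR and not taken from any follow-up change: the `trackedNode` type, its fields, the `pruneStaleNodes` function, and the `exampleNodes` fixture are all hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// trackedNode is a hypothetical cache entry for a node.
type trackedNode struct {
	Ready         bool
	PodCount      int       // pods currently bound to the node
	NotReadySince time.Time // when the node was first seen as not ready
}

// pruneStaleNodes deletes nodes that are not ready, have no pods, and
// have been not ready for at least timeout. It returns the removed
// names in sorted order.
func pruneStaleNodes(nodes map[string]*trackedNode, now time.Time, timeout time.Duration) []string {
	var removed []string
	for name, n := range nodes {
		if n.Ready || n.PodCount > 0 {
			continue // keep healthy nodes and nodes that still host pods
		}
		if now.Sub(n.NotReadySince) >= timeout {
			delete(nodes, name) // deleting during range is safe in Go
			removed = append(removed, name)
		}
	}
	sort.Strings(removed)
	return removed
}

// exampleNodes builds a small hypothetical cache and a fixed "now".
func exampleNodes() (map[string]*trackedNode, time.Time) {
	now := time.Date(2019, time.June, 10, 12, 0, 0, 0, time.UTC)
	return map[string]*trackedNode{
		"stale-empty": {PodCount: 0, NotReadySince: now.Add(-10 * time.Minute)},
		"stale-busy":  {PodCount: 2, NotReadySince: now.Add(-10 * time.Minute)},
		"recent":      {PodCount: 0, NotReadySince: now.Add(-1 * time.Minute)},
		"healthy":     {Ready: true, PodCount: 1},
	}, now
}

func main() {
	nodes, now := exampleNodes()
	removed := pruneStaleNodes(nodes, now, 5*time.Minute)
	fmt.Println("removed:", removed, "remaining:", len(nodes))
}
```

Keeping not-ready nodes that still host pods means their usage stays accounted for until the pods are gone, which matches the "no pod on it" condition above.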


@hex108 hex108 left a comment


/lgtm

@k82cn k82cn added the lgtm label Jun 11, 2019

k82cn commented Jun 11, 2019

/approve

@volcano-sh-bot volcano-sh-bot added the approved label Jun 11, 2019
@volcano-sh-bot volcano-sh-bot merged commit 4b391ab into volcano-retired:master Jun 11, 2019
@k82cn k82cn deleted the kb_861 branch June 11, 2019 02:33
kevin-wangzefeng pushed a commit to kevin-wangzefeng/scheduler that referenced this pull request Jun 28, 2019
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Development

Successfully merging this pull request may close these issues.

v0.4 runtime panic
4 participants