
kubectl apply -f volcano-development.yaml (latest): volcano-scheduler's status is CrashLoopBackOff #1058

Closed
huone1 opened this issue Sep 21, 2020 · 4 comments
Labels
area/scheduling · kind/bug · priority/important-soon

Comments

huone1 (Contributor) commented Sep 21, 2020

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug

What happened:
After kubectl apply -f volcano-development.yaml, the volcano-scheduler pod's status is CrashLoopBackOff:

volcano-system volcano-scheduler-7c5d7bbcd4-s26kc 0/1 CrashLoopBackOff 12 46m
I collected some information, shown below:

  1. Executing "kubectl get queue" does not find the default queue.
  2. The volcano-controllers logs show the following (see the CRD check after the pod listing):
E0921 08:17:06.816174       1 reflector.go:382] volcano.sh/volcano/pkg/controllers/queue/queue_controller.go:178: Failed to watch *v1beta1.Queue: the server could not find the requested resource (get queues.scheduling.volcano.sh)
I0921 08:17:07.781585       1 reflector.go:211] Listing and watching *v1beta1.Queue from volcano.sh/volcano/pkg/controllers/queue/queue_controller.go:178
E0921 08:17:07.782454       1 reflector.go:178] volcano.sh/volcano/pkg/controllers/queue/queue_controller.go:178: Failed to list *v1beta1.Queue: the server could not find the requested resource (get queues.scheduling.volcano.sh)
  3. Cluster pod status:
# kubectl get pod -A
NAMESPACE        NAME                                  READY   STATUS             RESTARTS   AGE
kube-system      coredns-66bff467f8-fhcw5              1/1     Running            0          5h50m
kube-system      coredns-66bff467f8-k9qtj              1/1     Running            0          5h50m
kube-system      etcd-ecs-f948                         1/1     Running            0          5h50m
kube-system      kube-apiserver-ecs-f948               1/1     Running            0          5h50m
kube-system      kube-controller-manager-ecs-f948      1/1     Running            0          5h50m
kube-system      kube-flannel-ds-amd64-nz9kh           1/1     Running            0          5h47m
kube-system      kube-proxy-dzq2l                      1/1     Running            0          5h50m
kube-system      kube-scheduler-ecs-f948               1/1     Running            0          5h50m
volcano-system   volcano-admission-696868c79b-pftmk    1/1     Running            0          46m
volcano-system   volcano-admission-init-8vpc4          0/1     Completed          0          46m
volcano-system   volcano-controllers-d8578c484-z5z6n   1/1     Running            0          46m
volcano-system   volcano-scheduler-7c5d7bbcd4-s26kc    0/1     CrashLoopBackOff   12         46m
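
The "could not find the requested resource" error in the controllers log usually means the Queue CRD is not registered with the apiserver. A quick way to check (illustrative commands, not from the original report):

# kubectl get crd queues.scheduling.volcano.sh
# kubectl api-resources --api-group=scheduling.volcano.sh

If the CRD is missing, the controllers cannot list or watch Queues, and the scheduler cannot bootstrap the default queue.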

What you expected to happen:
volcano-scheduler's status is Running.

How to reproduce it (as minimally and precisely as possible):
The cause may be my environment: I upgraded the master from Kubernetes v1.17.0 to v1.18.3, and something that was not cleaned up during the upgrade may be causing this problem.
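
If stale objects left over from the upgrade are suspected, comparing what is actually installed against the manifest may help; for example (illustrative, assuming the resources are named as in volcano-development.yaml):

# kubectl get crd | grep volcano.sh
# kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations | grep -i volcano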

Anything else we need to know?:

Environment:

  • Volcano Version: latest
  • Kubernetes version (use kubectl version): v1.18.3
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release): centos7.6
  • Kernel (e.g. uname -a): Linux ecs-f948 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
  • Others:
volcano-sh-bot added the kind/bug label Sep 21, 2020
Thor-wl (Contributor) commented Sep 21, 2020

This may be a bug. I got a scheduler log like this:

I0921 09:29:33.174340       1 flags.go:52] FLAG: --add-dir-header="false"
I0921 09:29:33.174377       1 flags.go:52] FLAG: --alsologtostderr="false"
I0921 09:29:33.174381       1 flags.go:52] FLAG: --default-queue="default"
I0921 09:29:33.174385       1 flags.go:52] FLAG: --kube-api-burst="100"
I0921 09:29:33.174389       1 flags.go:52] FLAG: --kube-api-qps="50"
I0921 09:29:33.174396       1 flags.go:52] FLAG: --kubeconfig=""
I0921 09:29:33.174399       1 flags.go:52] FLAG: --leader-elect="false"
I0921 09:29:33.174402       1 flags.go:52] FLAG: --listen-address=":8080"
I0921 09:29:33.174405       1 flags.go:52] FLAG: --lock-object-namespace=""
I0921 09:29:33.174408       1 flags.go:52] FLAG: --log-backtrace-at=":0"
I0921 09:29:33.174413       1 flags.go:52] FLAG: --log-dir=""
I0921 09:29:33.174416       1 flags.go:52] FLAG: --log-file=""
I0921 09:29:33.174419       1 flags.go:52] FLAG: --log-file-max-size="1800"
I0921 09:29:33.174422       1 flags.go:52] FLAG: --log-flush-frequency="5s"
I0921 09:29:33.174430       1 flags.go:52] FLAG: --logtostderr="true"
I0921 09:29:33.174433       1 flags.go:52] FLAG: --master=""
I0921 09:29:33.174436       1 flags.go:52] FLAG: --minimum-feasible-nodes="100"
I0921 09:29:33.174440       1 flags.go:52] FLAG: --minimum-percentage-nodes-to-find="5"
I0921 09:29:33.174445       1 flags.go:52] FLAG: --percentage-nodes-to-find="100"
I0921 09:29:33.174451       1 flags.go:52] FLAG: --priority-class="true"
I0921 09:29:33.174454       1 flags.go:52] FLAG: --schedule-period="1s"
I0921 09:29:33.174457       1 flags.go:52] FLAG: --scheduler-conf="/volcano.scheduler/volcano-scheduler.conf"
I0921 09:29:33.174461       1 flags.go:52] FLAG: --scheduler-name="volcano"
I0921 09:29:33.174464       1 flags.go:52] FLAG: --skip-headers="false"
I0921 09:29:33.174467       1 flags.go:52] FLAG: --skip-log-headers="false"
I0921 09:29:33.174469       1 flags.go:52] FLAG: --stderrthreshold="2"
I0921 09:29:33.174472       1 flags.go:52] FLAG: --v="3"
I0921 09:29:33.174475       1 flags.go:52] FLAG: --version="false"
I0921 09:29:33.174478       1 flags.go:52] FLAG: --vmodule=""
W0921 09:29:33.174500       1 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
panic: failed init default queue, with err: context deadline exceeded

goroutine 1 [running]:
volcano.sh/volcano/pkg/scheduler/cache.newSchedulerCache(0xc00034e400, 0x18acf0a, 0x7, 0x18ac379, 0x7, 0x0)
	/home/travis/gopath/src/volcano.sh/volcano/pkg/scheduler/cache/cache.go:282 +0x204b
volcano.sh/volcano/pkg/scheduler/cache.New(...)
	/home/travis/gopath/src/volcano.sh/volcano/pkg/scheduler/cache/cache.go:67
volcano.sh/volcano/pkg/scheduler.NewScheduler(0xc00034e400, 0x18acf0a, 0x7, 0x7ffc64088cb2, 0x29, 0x3b9aca00, 0x18ac379, 0x7, 0x0, 0x27779a0, ...)
	/home/travis/gopath/src/volcano.sh/volcano/pkg/scheduler/scheduler.go:73 +0x93
volcano.sh/volcano/cmd/scheduler/app.Run(0xc000322780, 0x1965708, 0x1965a98)
	/home/travis/gopath/src/volcano.sh/volcano/cmd/scheduler/app/server.go:66 +0x121
main.main()
	/home/travis/gopath/src/volcano.sh/volcano/cmd/scheduler/main.go:64 +0x1ac

I will continue to track it.
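
For context, the panic shows the scheduler's startup attempt to create the default queue timing out. The same create path can be exercised by hand to see whether the request hangs; a minimal Queue manifest, assuming the v1beta1 API used in this release:

# cat <<EOF | kubectl apply -f -
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: default
spec:
  weight: 1
EOF

If this apply also times out, the problem lies in the admission path rather than in the scheduler itself.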

k82cn added the area/scheduling and priority/important-soon labels Sep 22, 2020
k82cn (Member) commented Sep 22, 2020

/cc

Thor-wl (Contributor) commented Sep 24, 2020

The root cause is that the request to create the default queue cannot reach the admission webhook from the apiserver. The apiserver log shows a timeout, but the webhook log shows nothing arriving. I tried the installation steps myself and everything worked well. Please check your environment.
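
To verify that the apiserver can actually reach the webhook, checking the admission service, its endpoints, and the webhook pod logs is a reasonable first step (illustrative commands; the deployment name is inferred from the pod listing above):

# kubectl -n volcano-system get svc,endpoints
# kubectl -n volcano-system logs deploy/volcano-admission

An empty endpoints list, or a CNI/firewall blocking apiserver-to-pod traffic (flannel here), would be consistent with exactly this timeout.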

huone1 (Contributor, Author) commented Oct 10, 2020

Thanks.

huone1 closed this as completed Oct 10, 2020