
piraeus-csi-controller-0 pod pending, insufficient cpu #19

Closed
mariusrugan opened this issue Mar 8, 2020 · 7 comments


mariusrugan commented Mar 8, 2020

Hi,

I wanted to test Piraeus, so I spawned a cluster of 1 master + 4 worker nodes, each running Ubuntu 18.04 on a kvm/libvirt setup with 4 GB RAM and 2 CPUs (output of /proc/cpuinfo below). The host is a 32 GB RAM Intel NUC with a 4-core Intel(R) Core(TM) i5-6260U CPU @ 1.80GHz.
The cluster was spawned for this particular test; no other workload is running other than flannel.

It seems pod piraeus-csi-controller-0 is Pending due to

Warning FailedScheduling 3m25s (x39 over 53m) default-scheduler 0/5 nodes are available: 5 Insufficient cpu.

With respect to the Piraeus setup, some node pods are also Pending, as shown below:

Warning FailedScheduling 23s (x44 over 56m) default-scheduler 0/5 nodes are available: 1 Insufficient cpu, 4 node(s) didn't match node selector.
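For context, the scheduler's "Insufficient cpu" verdict boils down to comparing a pod's CPU request against each node's allocatable CPU minus what is already requested. A minimal sketch of that check, using hypothetical numbers for a 2-CPU node (the 1100m already-requested figure is illustrative, not taken from this cluster):

```python
def parse_cpu(value: str) -> int:
    """Convert a Kubernetes CPU quantity ("1", "500m") to millicores."""
    return int(value[:-1]) if value.endswith("m") else int(float(value) * 1000)

def fits(node_allocatable: str, already_requested: str, pod_request: str) -> bool:
    """True if the pod's CPU request fits in the node's remaining capacity."""
    return parse_cpu(pod_request) <= parse_cpu(node_allocatable) - parse_cpu(already_requested)

# Hypothetical 2-CPU node where system pods already request 1100m:
# a 1-CPU request cannot be placed, so the scheduler reports "Insufficient cpu".
print(fits("2", "1100m", "1"))  # False
```

With only 2 CPUs per node and system pods already holding part of that, any container requesting a full CPU fails this check on every node, which matches the `0/5 nodes are available: 5 Insufficient cpu` message.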

Any advice?
Thanks!

kube-system   piraeus-controller-0             0/1     Init:0/1   0          55m
kube-system   piraeus-csi-controller-0         0/5     Pending    0          55m
kube-system   piraeus-csi-node-7xht4           0/2     Pending    0          55m
kube-system   piraeus-csi-node-9d975           2/2     Running    0          55m
kube-system   piraeus-csi-node-9zkkl           2/2     Running    0          55m
kube-system   piraeus-csi-node-k8wlp           2/2     Running    0          55m
kube-system   piraeus-etcd-0                   1/1     Running    0          55m
kube-system   piraeus-etcd-1                   1/1     Running    0          55m
kube-system   piraeus-etcd-2                   1/1     Running    0          55m
kube-system   piraeus-node-dmqxh               0/1     Init:0/1   0          55m
kube-system   piraeus-node-jxm2j               0/1     Init:0/1   0          55m
kube-system   piraeus-node-ldslx               0/1     Init:0/1   0          55m
kube-system   piraeus-node-wwj2s               0/1     Init:0/1   0          55m

/proc/cpuinfo for the node.

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 61
model name	: Intel Core Processor (Broadwell, no TSX, IBRS)
stepping	: 2
microcode	: 0x1
cpu MHz		: 1799.998
cache size	: 16384 KB
physical id	: 0
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 arat md_clear
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips	: 3599.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 61
model name	: Intel Core Processor (Broadwell, no TSX, IBRS)
stepping	: 2
microcode	: 0x1
cpu MHz		: 1799.998
cache size	: 16384 KB
physical id	: 1
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 arat md_clear
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips	: 3599.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:
@mpepping

Use bigger VMs 😄. The controller container has a request/limit of 1 CPU. Either add a requests element with a lower cpu value (probably not recommended), or use VMs with more CPUs. Consider running three workers with 3 CPUs each instead of four workers with 2 CPUs?

https://raw.githubusercontent.com/piraeusdatastore/piraeus/master/deploy/all.yaml:

[..]
      containers:
      - name: controller
        image: quay.io/piraeusdatastore/piraeus-server:v1.4.2
        imagePullPolicy: Always
        resources:
          limits:
            cpu: "1"
            memory: 1Gi
[..]

@mariusrugan
Author

I did, @mpepping; I over-spec'ed them to 8 GB of RAM and 4 CPUs (you can push KVM a bit, with the total allocation exceeding the host machine, and also without swap), but ended up with the same issue.

Indeed, I shall retry; I will use 3 beefed-up workers to see if I can figure it out.

... and it's a bit strange, because from a LINSTOR perspective I have it up and running via another implementation (https://github.com/kvaps/kube-linstor) on the same hardware and the same-spec'ed VMs.

@mpepping

Using 8-CPU / 16 GB nodes worked out OK for me, until I ran into #20. Thanks for pointing out kube-linstor, @mariusrugan; I will have a look at that.

@mariusrugan
Author

@mpepping
I'm coming to check out Piraeus from this article: https://vitobotta.com/2020/01/04/linstor-storage-the-kubernetes-way/

@alemonmk

alemonmk commented Mar 25, 2020

I believe there is something in Kubernetes that reserves resources even when it was not instructed to (i.e. there is no resources.requests in the *Set). I retrieved the pod spec from Kubernetes and found that resources.requests appears out of thin air.
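For what it's worth, this matches documented Kubernetes defaulting behavior rather than a bug: when a container specifies a resource limit but no request, the request defaults to the limit, so a 1-CPU limit implicitly becomes a 1-CPU request that the scheduler must satisfy. A minimal sketch of that defaulting rule (illustrative only, not the actual apiserver code):

```python
def apply_resource_defaults(resources: dict) -> dict:
    """If limits are set but requests are not, Kubernetes defaults
    each missing request to the corresponding limit."""
    limits = resources.get("limits", {})
    requests = dict(resources.get("requests", {}))
    for name, quantity in limits.items():
        requests.setdefault(name, quantity)
    return {"limits": limits, "requests": requests}

# A spec with only limits gains matching requests "out of thin air":
spec = apply_resource_defaults({"limits": {"cpu": "1", "memory": "1Gi"}})
print(spec["requests"])  # {'cpu': '1', 'memory': '1Gi'}
```

An explicitly set request is kept, which is why adding a lower resources.requests.cpu value would have stopped the scheduler from demanding a full CPU.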

@alexzhc
Member

alexzhc commented Mar 27, 2020

I have just updated all.yaml with resource limits set to the minimum requirements. Please try the latest https://raw.githubusercontent.com/piraeusdatastore/piraeus/master/deploy/all.yaml

@alexzhc
Member

alexzhc commented Mar 27, 2020

Pod/container                 Limit (cpu/mem)
etcd                          100m/100Mi
init                          100m/100Mi
controller                    500m/500Mi
satellite                     300m/300Mi
csi-provisioner               100m/100Mi
csi-attacher                  100m/100Mi
csi-snapshotter               100m/100Mi
csi-cluster-driver-registrar  100m/100Mi
csi-plugin                    100m/100Mi
csi-node-driver-registrar     100m/100Mi
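Since requests default to these limits, the values above also bound the total CPU the deployment asks the scheduler for. A quick sketch tallying them (helper name is illustrative):

```python
def to_millicores(q: str) -> int:
    """Parse a CPU quantity like "500m" or "1" into millicores."""
    return int(q[:-1]) if q.endswith("m") else int(float(q) * 1000)

# CPU limits from the list above, in Kubernetes quantity notation:
cpu_limits = {
    "etcd": "100m", "init": "100m", "controller": "500m",
    "satellite": "300m", "csi-provisioner": "100m",
    "csi-attacher": "100m", "csi-snapshotter": "100m",
    "csi-cluster-driver-registrar": "100m", "csi-plugin": "100m",
    "csi-node-driver-registrar": "100m",
}
total = sum(to_millicores(v) for v in cpu_limits.values())
print(total)  # 1600 millicores across all listed containers
```

At 1600m total (and no single container above 500m) this fits comfortably even on 2-CPU nodes, unlike the earlier 1-CPU controller limit.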

@alexzhc alexzhc closed this as completed Mar 31, 2020