
piraeus-csi-controller-0 pod pending, insufficient cpu #19

Closed
mariusrugan opened this issue Mar 8, 2020 · 7 comments


mariusrugan commented Mar 8, 2020

Hi,

I wanted to test Piraeus, so I spawned a cluster of 1 master + 4 worker nodes, each running Ubuntu 18.04 on a kvm/libvirt setup with 4 GB RAM and 2 CPUs (output of /proc/cpuinfo below). The host is a 32 GB RAM Intel NUC with a 4-core Intel(R) Core(TM) i5-6260U CPU @ 1.80GHz.
The cluster was spawned for this particular test; no other workload is running other than flannel.

It seems pod piraeus-csi-controller-0 is Pending due to

Warning FailedScheduling 3m25s (x39 over 53m) default-scheduler 0/5 nodes are available: 5 Insufficient cpu.

With respect to the Piraeus setup, some node pods are also Pending, as shown below:

Warning FailedScheduling 23s (x44 over 56m) default-scheduler 0/5 nodes are available: 1 Insufficient cpu, 4 node(s) didn't match node selector.
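For context, the scheduler's "Insufficient cpu" verdict boils down to comparing a pod's CPU request against each node's allocatable CPU minus what is already requested. A minimal sketch of that check, using hypothetical numbers for a 2-CPU node (the 1100m already-requested figure is illustrative, not taken from this cluster):

```python
def parse_cpu(value: str) -> int:
    """Convert a Kubernetes CPU quantity ("1", "500m") to millicores."""
    return int(value[:-1]) if value.endswith("m") else int(float(value) * 1000)

def fits(node_allocatable: str, already_requested: str, pod_request: str) -> bool:
    """True if the pod's CPU request fits in the node's remaining capacity."""
    return parse_cpu(pod_request) <= parse_cpu(node_allocatable) - parse_cpu(already_requested)

# Hypothetical 2-CPU node where system pods already request 1100m:
# a 1-CPU request cannot be placed, so the scheduler reports "Insufficient cpu".
print(fits("2", "1100m", "1"))  # False
```

With only 2 CPUs per node and system pods already holding part of that, any container requesting a full CPU fails this check on every node, which matches the `0/5 nodes are available: 5 Insufficient cpu` message.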

Any advice?
Thanks!

kube-system   piraeus-controller-0             0/1     Init:0/1   0          55m
kube-system   piraeus-csi-controller-0         0/5     Pending    0          55m
kube-system   piraeus-csi-node-7xht4           0/2     Pending    0          55m
kube-system   piraeus-csi-node-9d975           2/2     Running    0          55m
kube-system   piraeus-csi-node-9zkkl           2/2     Running    0          55m
kube-system   piraeus-csi-node-k8wlp           2/2     Running    0          55m
kube-system   piraeus-etcd-0                   1/1     Running    0          55m
kube-system   piraeus-etcd-1                   1/1     Running    0          55m
kube-system   piraeus-etcd-2                   1/1     Running    0          55m
kube-system   piraeus-node-dmqxh               0/1     Init:0/1   0          55m
kube-system   piraeus-node-jxm2j               0/1     Init:0/1   0          55m
kube-system   piraeus-node-ldslx               0/1     Init:0/1   0          55m
kube-system   piraeus-node-wwj2s               0/1     Init:0/1   0          55m

/proc/cpuinfo for the node.

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 61
model name	: Intel Core Processor (Broadwell, no TSX, IBRS)
stepping	: 2
microcode	: 0x1
cpu MHz		: 1799.998
cache size	: 16384 KB
physical id	: 0
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 arat md_clear
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips	: 3599.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 61
model name	: Intel Core Processor (Broadwell, no TSX, IBRS)
stepping	: 2
microcode	: 0x1
cpu MHz		: 1799.998
cache size	: 16384 KB
physical id	: 1
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 arat md_clear
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips	: 3599.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:
@mpepping

Use bigger VMs 😄. The controller container has a request/limit of 1 CPU. Either add a requests element with a lower cpu value (probably not recommended), or use VMs with more CPUs. Consider running three workers with 3 CPUs each instead of four workers with 2 CPUs?

https://raw.githubusercontent.com/piraeusdatastore/piraeus/master/deploy/all.yaml:

[..]
      containers:
      - name: controller
        image: quay.io/piraeusdatastore/piraeus-server:v1.4.2
        imagePullPolicy: Always
        resources:
          limits:
            cpu: "1"
            memory: 1Gi
[..]

@mariusrugan
Author

I did, @mpepping; I over-spec'ed them to 8 GB of RAM and 4 CPUs (you can push KVM a bit, with the total allocation exceeding the host machine, and also without swap), but ended up with the same issue.

Indeed, I shall retry; I will use 3 beefed-up workers to see if I can figure it out.

... and it's a bit strange, because from a LINSTOR perspective I have it up and running via another implementation (https://github.com/kvaps/kube-linstor) on the same hardware and the same-spec'ed VMs.

@mpepping

Using 8-CPU / 16 GB nodes worked out OK for me, until I ran into #20. Thanks for pointing out kube-linstor, @mariusrugan; I will have a look at that.

@mariusrugan
Author

@mpepping
I'm coming to check out Piraeus from this article: https://vitobotta.com/2020/01/04/linstor-storage-the-kubernetes-way/

@alemonmk

alemonmk commented Mar 25, 2020

I believe there is something in Kubernetes that reserves resources even when it was not instructed to (i.e. there is no resources.requests in the *Set). I retrieved the pod spec from Kubernetes and found that resources.requests appears out of thin air.
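For what it's worth, this matches documented Kubernetes defaulting behavior rather than a bug: when a container specifies a resource limit but no request, the request defaults to the limit, so a 1-CPU limit implicitly becomes a 1-CPU request that the scheduler must satisfy. A minimal sketch of that defaulting rule (illustrative only, not the actual apiserver code):

```python
def apply_resource_defaults(resources: dict) -> dict:
    """If limits are set but requests are not, Kubernetes defaults
    each missing request to the corresponding limit."""
    limits = resources.get("limits", {})
    requests = dict(resources.get("requests", {}))
    for name, quantity in limits.items():
        requests.setdefault(name, quantity)
    return {"limits": limits, "requests": requests}

# A spec with only limits gains matching requests "out of thin air":
spec = apply_resource_defaults({"limits": {"cpu": "1", "memory": "1Gi"}})
print(spec["requests"])  # {'cpu': '1', 'memory': '1Gi'}
```

An explicitly set request is kept, which is why adding a lower resources.requests.cpu value would have stopped the scheduler from demanding a full CPU.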

@alexzhc
Member

alexzhc commented Mar 27, 2020

I have just updated all.yaml with resource limits set to the minimum requirements. Please try the latest https://raw.githubusercontent.com/piraeusdatastore/piraeus/master/deploy/all.yaml

@alexzhc
Member

alexzhc commented Mar 27, 2020

Pod/container                 Limit (cpu/mem)
etcd                          100m/100Mi
init                          100m/100Mi
controller                    500m/500Mi
satellite                     300m/300Mi
csi-provisioner               100m/100Mi
csi-attacher                  100m/100Mi
csi-snapshotter               100m/100Mi
csi-cluster-driver-registrar  100m/100Mi
csi-plugin                    100m/100Mi
csi-node-driver-registrar     100m/100Mi
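Since requests default to these limits, the values above also bound the total CPU the deployment asks the scheduler for. A quick sketch tallying them (helper name is illustrative):

```python
def to_millicores(q: str) -> int:
    """Parse a CPU quantity like "500m" or "1" into millicores."""
    return int(q[:-1]) if q.endswith("m") else int(float(q) * 1000)

# CPU limits from the list above, in Kubernetes quantity notation:
cpu_limits = {
    "etcd": "100m", "init": "100m", "controller": "500m",
    "satellite": "300m", "csi-provisioner": "100m",
    "csi-attacher": "100m", "csi-snapshotter": "100m",
    "csi-cluster-driver-registrar": "100m", "csi-plugin": "100m",
    "csi-node-driver-registrar": "100m",
}
total = sum(to_millicores(v) for v in cpu_limits.values())
print(total)  # 1600 millicores across all listed containers
```

At 1600m total (and no single container above 500m) this fits comfortably even on 2-CPU nodes, unlike the earlier 1-CPU controller limit.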

@alexzhc alexzhc closed this as completed Mar 31, 2020