
Kurtosis schedules pods to the same node, even if multiple nodes are available #953

Closed
barnabasbusa opened this issue Jul 20, 2023 · 2 comments
Labels
bug Something isn't working painful Painful bug

Comments

@barnabasbusa
Collaborator

What's your CLI version?

0.80.12

Description & steps to reproduce

I use DigitalOcean as my Kubernetes cluster provider. The cluster currently has 2 nodes, with the maximum node count set to 6.

NAME                       STATUS   ROLES    AGE   VERSION
chaos-ams3-default-f1158   Ready    <none>   2d    v1.27.2
chaos-ams3-default-fzppo   Ready    <none>   92m   v1.27.2

I used the config below to deploy a workload on this cluster:

{
  "participants": [
    {
      "el_client_type": "geth",
      "el_client_image": "ethpandaops/geth:4844-devnet-6-e03b5ad",
      "cl_client_type": "lighthouse",
      "cl_client_image": "ethpandaops/lighthouse:boxed-blobs-d534ac0",
      "count": 8
    }
  ],
  "network_params": {
    "deneb_fork_epoch": 3
  },
  "launch_additional_services": true,
  "wait_for_finalization": false,
  "wait_for_verifications": false,
  "global_client_log_level": "info"
}

It's expected to spin up 8 pairs of Ethereum nodes.

However, most of these pods are getting killed because they run out of resources.

When inspecting the cluster, I can see that all the pods were scheduled onto the same node, which does not have enough resources to run all of these containers:

k -n kurtosis-enclave--5f9291d1b4414340a936ee2717bbfd2c get pods -owide
NAME                             READY   STATUS                   RESTARTS   AGE     IP             NODE                       NOMINATED NODE   READINESS GATES
cl-1-lighthouse-geth             1/1     Running                  0          2m22s   10.244.0.72    chaos-ams3-default-fzppo   <none>           <none>
cl-1-lighthouse-geth-validator   1/1     Running                  0          2m10s   10.244.0.6     chaos-ams3-default-fzppo   <none>           <none>
cl-2-lighthouse-geth             1/1     Running                  0          2m7s    10.244.0.85    chaos-ams3-default-fzppo   <none>           <none>
cl-2-lighthouse-geth-validator   1/1     Running                  0          2m3s    10.244.0.25    chaos-ams3-default-fzppo   <none>           <none>
cl-3-lighthouse-geth             1/1     Running                  0          118s    10.244.0.83    chaos-ams3-default-fzppo   <none>           <none>
cl-3-lighthouse-geth-validator   1/1     Running                  0          114s    10.244.0.93    chaos-ams3-default-fzppo   <none>           <none>
cl-4-lighthouse-geth             1/1     Running                  0          110s    10.244.0.26    chaos-ams3-default-fzppo   <none>           <none>
cl-4-lighthouse-geth-validator   1/1     Running                  0          104s    10.244.0.16    chaos-ams3-default-fzppo   <none>           <none>
cl-5-lighthouse-geth             1/1     Running                  0          100s    10.244.0.84    chaos-ams3-default-fzppo   <none>           <none>
cl-5-lighthouse-geth-validator   1/1     Running                  0          96s     10.244.0.43    chaos-ams3-default-fzppo   <none>           <none>
cl-6-lighthouse-geth             1/1     Running                  0          92s     10.244.0.47    chaos-ams3-default-fzppo   <none>           <none>
cl-6-lighthouse-geth-validator   1/1     Running                  0          87s     10.244.0.21    chaos-ams3-default-fzppo   <none>           <none>
el-1-geth-lighthouse             0/1     OOMKilled                0          3m59s   10.244.0.26    chaos-ams3-default-fzppo   <none>           <none>
el-2-geth-lighthouse             0/1     OOMKilled                0          3m51s   10.244.0.120   chaos-ams3-default-fzppo   <none>           <none>
el-3-geth-lighthouse             0/1     OOMKilled                0          3m47s   10.244.0.37    chaos-ams3-default-fzppo   <none>           <none>
el-4-geth-lighthouse             1/1     Running                  0          3m42s   10.244.0.3     chaos-ams3-default-fzppo   <none>           <none>
el-5-geth-lighthouse             0/1     ContainerStatusUnknown   1          3m36s   10.244.0.27    chaos-ams3-default-fzppo   <none>           <none>
el-6-geth-lighthouse             0/1     OOMKilled                0          3m28s   10.244.0.30    chaos-ams3-default-fzppo   <none>           <none>
el-7-geth-lighthouse             0/1     OOMKilled                0          3m17s   10.244.0.13    chaos-ams3-default-fzppo   <none>           <none>
el-8-geth-lighthouse             1/1     Running                  0          3m10s   10.244.0.48    chaos-ams3-default-fzppo   <none>           <none>
kurtosis-api                     1/1     Running                  0          5m21s   10.244.0.81    chaos-ams3-default-fzppo   <none>           <none>

I have a feeling that Kurtosis is somehow trying to handle pod scheduling itself instead of letting the Kubernetes scheduler do it.

Desired behavior

Inspect how many nodes are available and, based on that, distribute the node pairs round-robin across the different machines (roughly as sketched below).
Working some magic with an autoscaler would be icing on top.
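
A hypothetical sketch of what that round-robin distribution could look like (not existing Kurtosis code; the function name and pair labels are made up purely for illustration):

def assign_pairs_round_robin(node_names, pair_count):
    # Spread each EL/CL pair across the available nodes in round-robin order:
    # pair 1 -> node 1, pair 2 -> node 2, pair 3 -> node 1, ...
    assignments = {}
    for i in range(pair_count):
        assignments["pair-%d" % (i + 1)] = node_names[i % len(node_names)]
    return assignments

# With the two nodes above and the 8 pairs from the config:
# assign_pairs_round_robin(["chaos-ams3-default-f1158", "chaos-ams3-default-fzppo"], 8)
# puts pairs 1, 3, 5, 7 on f1158 and pairs 2, 4, 6, 8 on fzppo.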

What is the severity of this bug?

Painful; this is causing significant friction in my workflow.

@barnabasbusa barnabasbusa added the bug Something isn't working label Jul 20, 2023
@github-actions github-actions bot added the painful Painful bug label Jul 20, 2023
@mieubrisse
Collaborator

mieubrisse commented Jul 20, 2023

Oh that's super weird; we don't touch the scheduling algorithm at all - just throw Pods at Kubernetes and let it do its thing. I suspect that it's related to your discussion on #952, where - because the resource limits aren't getting set - Kubernetes thinks "oh these are very light Pods" and just throws them all on the same node, but in reality they're very heavy. If you were to hack in a min_memory requirement to the ServiceConfig that you're using, does that temporarily solve the issue?
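
For reference, a minimal sketch of what that workaround might look like in a package's Starlark, assuming ServiceConfig accepts min_memory / min_cpu fields that are translated into Kubernetes resource requests (the service name and values below are placeholders, not the ethereum-package's actual config):

def run(plan):
    geth_config = ServiceConfig(
        image = "ethpandaops/geth:4844-devnet-6-e03b5ad",
        min_memory = 2048,  # assumed to be megabytes requested per pod
        min_cpu = 1000,     # assumed to be millicores requested per pod
    )
    plan.add_service(name = "el-geth", config = geth_config)

With resource requests set, the Kubernetes scheduler has real numbers to bin-pack against, so it should stop piling every pod onto a single node once that node's requested capacity is used up.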

@mieubrisse
Collaborator

And re.

Working some magic with an autoscaler would be icing on top.

Coming in the next 1-2 months ;)
