bootstrapper in Kubernetes not able to get local containers #53

Open
o0n1x opened this issue Aug 2, 2023 · 5 comments

@o0n1x

o0n1x commented Aug 2, 2023

Hello again,

I have moved to another orchestrator, Kubernetes, but I have run into another problem. It seems the bootstrapper does not find the other containers using self.low_level_client.containers() in KubernetesBootstrapper.py, although it is able to get the pods using self.high_level_client.list_namespaced_pod('default') in the same file.
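
For reference, this is roughly what the two lookups look like in isolation (a sketch; it assumes low_level_client is a docker.APIClient and high_level_client a kubernetes CoreV1Api, which is how we read the Kollaps source). On our cluster only the second call returns anything:

import docker
from kubernetes import client, config

low_level_client = docker.APIClient(base_url='unix://var/run/docker.sock')
config.load_incluster_config()  # we run this inside the bootstrapper pod
high_level_client = client.CoreV1Api()

# Returns [] on our cluster: the local Docker daemon has no view of the pods.
print(low_level_client.containers())

# Works: the Kubernetes API server knows about all the pods.
for pod in high_level_client.list_namespaced_pod('default').items:
    print(pod.metadata.name, pod.status.pod_ip)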

I followed the same steps to start Kubernetes as shown in the orchestrators.md file and built the YAML file needed from the iperf3 example topology file. The problem persists after multiple attempts and on different devices. The experiment is done on a single device every time, so the network shouldn't be an issue.

The most visible consequence of this problem is that the dashboard does not start and can't be accessed on the local device.

Logs in the bootstrapper pod:

[Py (Bootstrapper)] Kubernetes bootstrapping started...
[Py (Bootstrapper)] bootstrapping all containers with label 85a6c7fd-a109-4699-ab9f-69cef33f8c00.
[Py (god)] found god_id.
[Py (god)] ip: (not yet known), nr. of gods: 1
[Py (god1)] 10.68.186.109 :: ['HELLO', '73240190880512984500490729979116919482']
[Py (god2)] 10.68.186.109 :: READY
[Py (god)] ip: 10.68.186.109, nr. of gods: 1
[Py (god)] local IP: 10.68.186.109,
           remote IPs:
[Py (god)] resolved all IPs
[Py (god)] started rust handler.

and stops at that point.

Logs in the bootstrapper pod with debug statements I placed in KubernetesBootstrapper.py:

[Py (god)] test - entered main while loop                #( placed right after the main while loop in KubernetesBootstrapper.py)
[Py (god test)] []                                      #( used to display the result from self.low_level_client.containers())
[Py (god)] test - entered for loop 2.      #( placed right after the second for loop in bootstrapper.py)
[Py (god)] d://3c8aea61a5d763fdf2e8ea0c8f66a341e78b480817d38959f24e61b642ce4bd5.      #( used to display container_id)
[Py (god)] []                                             #(used to display local_container list)
[Py (dgod)] {}                                           #(used to display self.already_bootstrapped)
[Py (god)] {'started_at': datetime.datetime(2023, 8, 2, 7, 44, 14, tzinfo=tzlocal())}                           #( used to display pod.status.container_statuses[0].state.running)
[Py (god)] test - entered for loop 2
[Py (god)] d://c813876ba2820541aa1c04e5e3759f87b61fbd57b154a06c64f9966019c53bec
[Py (god)] []
[Py (dgod)] {}
[Py (god)] {'started_at': datetime.datetime(2023, 8, 2, 7, 44, 14, tzinfo=tzlocal())}
ERROR: 'NoneType' object is not subscriptable
Found error
[Py (god)] test - entered main while loop
[Py (god test)] []
[Py (god)] test - entered for loop 2
[Py (god)] d://3c8aea61a5d763fdf2e8ea0c8f66a341e78b480817d38959f24e61b642ce4bd5
[Py (god)] []
[Py (dgod)] {}
[Py (god)] {'started_at': datetime.datetime(2023, 8, 2, 7, 44, 14, tzinfo=tzlocal())}
.
looped 2nd for loop 6 times
.
[Py (god)] test - entered for loop 2
[Py (god)] d://9e284a618aecb72fc52c9d2e3a4bb1f007f13aad0d761d0705e49f8773f03fd5
[Py (god)] []
[Py (dgod)] {}
[Py (god)] {'started_at': datetime.datetime(2023, 8, 2, 7, 44, 16, tzinfo=tzlocal())}
[Py (god)] test - entered main while loop
[Py (god test)] []
[Py (god)] test - entered for loop 2
[Py (god)] d://3c8aea61a5d763fdf2e8ea0c8f66a341e78b480817d38959f24e61b642ce4bd5
[Py (god)] []
[Py (dgod)] {}
[Py (god)] {'started_at': datetime.datetime(2023, 8, 2, 7, 44, 14, tzinfo=tzlocal())}
.
looped 2nd for loop 6 times
.
[Py (god)] test - entered for loop 2
[Py (god)] d://9e284a618aecb72fc52c9d2e3a4bb1f007f13aad0d761d0705e49f8773f03fd5
[Py (god)] []
[Py (dgod)] {}
[Py (god)] {'started_at': datetime.datetime(2023, 8, 2, 7, 44, 16, tzinfo=tzlocal())}

One note: it never enters the first for loop for some reason; I placed a debug log there and it is never displayed.

@Nandinski

Nandinski commented Aug 3, 2023

Hello, I'm working with @o0n1x trying to understand the issue.

After looking at the source code, it seems to depend on finding local containers through the Docker Python API. I saw evidence of this both in the bootstrapper and in the dashboard. However, because we're using Kubernetes, containers deployed by Kubernetes are not visible to the Docker API, at least not by default, and we can't find a way to make them visible. Can you please help us make them visible, or propose an alternative?

@sebastiaoamaro
Collaborator

Hi @o0n1x @Nandinski, sorry for the late response, we are currently on holiday.
Do the other pods start? Can I see the logs of a pod (kubectl describe pods my-pod)?
Did you use Minikube, as described in the docs? This is needed for accessing local Docker images.

@Nandinski

We are not running the experiment with Minikube; we are trying to use a regular Kubernetes cluster by following the other documented suggestion with kubeadm. We picked this one because we plan to deploy Kollaps across a distributed multi-node cluster. But after reading your question asking whether we are using Minikube, we tried to launch the experiment with Minikube again and noticed an interesting difference from the kubeadm deployment. We believe this difference is what makes the Minikube deployment work while the kubeadm deployment does not.

The difference is that Minikube exposes the Kubernetes containers through the Docker API, but this does not happen with a normal Kubernetes cluster. By default, Docker has no access to Kubernetes-deployed containers; from my understanding, Minikube makes them available through Docker as a convenience. And unfortunately, from what I can tell, Kollaps' Kubernetes deployment depends on Docker to get access to Kubernetes-deployed containers.

An example in the code where this is needed is in the bootstrapper, when it bootstraps the dashboard. Before doing the dashboard bootstrap, this call:

for container in self.low_level_client.containers():

(self.low_level_client.containers()) tries to get all local containers from Docker in order to find the dashboard container. While this works in Minikube, returning all the local containers, it does not work with a normal cluster: because Docker has no view into the Kubernetes containers, that call always returns an empty list, as the issue begins by saying.
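
To illustrate why this blocks the dashboard, here is a rough sketch of the kind of lookup we believe the bootstrapper performs (the name matching is our guess, not the exact Kollaps code). With an empty list from containers(), the dashboard is never found, so it is never started:

import docker

low_level_client = docker.APIClient(base_url='unix://var/run/docker.sock')

def find_dashboard_container():
    # On Minikube this list contains all the k8s_* containers; on kubeadm it is [].
    for container in low_level_client.containers():
        names = container.get('Names', [])
        if any('dashboard' in name for name in names):
            return container['Id']
    return None  # always the result on a kubeadm cluster, so the dashboard is never nsenter'ed

print(find_dashboard_container())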

When connected to Minikube's docker in a Minikube deployment, docker ps outputs:

docker ps
CONTAINER ID   IMAGE                       COMMAND                  CREATED          STATUS          PORTS     NAMES
4109b9750a17   660f45b845fa                "/bin/sh -c 'mkfifo …"   53 seconds ago   Up 53 seconds             k8s_client3-ddad5428-6be6-465f-b2cb-8e9191787bd7_client3-ddad5428-6be6-465f-b2cb-8e9191787bd7_default_26d1d5f6-da4a-4cb2-8d14-a10a73e45ee3_0
347d866890cf   e131dd9acaf1                "/bin/sh -c 'mkfifo …"   53 seconds ago   Up 53 seconds             k8s_server-ddad5428-6be6-465f-b2cb-8e9191787bd7_server-ddad5428-6be6-465f-b2cb-8e9191787bd7-cbfbd9bfb-9hv9v_default_b18833ff-abc1-43ed-abbc-3ff637107a9d_0
9637df6af153   e131dd9acaf1                "/bin/sh -c 'mkfifo …"   53 seconds ago   Up 53 seconds             k8s_server-ddad5428-6be6-465f-b2cb-8e9191787bd7_server-ddad5428-6be6-465f-b2cb-8e9191787bd7-cbfbd9bfb-6x248_default_4ac5d81f-aac9-4e82-8eca-b71475e8e3af_0
175928e96e47   e131dd9acaf1                "/bin/sh -c 'mkfifo …"   53 seconds ago   Up 53 seconds             k8s_server-ddad5428-6be6-465f-b2cb-8e9191787bd7_server-ddad5428-6be6-465f-b2cb-8e9191787bd7-cbfbd9bfb-6pjzk_default_566f483d-56d1-4f94-a7d7-a4b51b245630_0
844a075e8d63   660f45b845fa                "/bin/sh -c 'mkfifo …"   53 seconds ago   Up 53 seconds             k8s_client1-ddad5428-6be6-465f-b2cb-8e9191787bd7_client1-ddad5428-6be6-465f-b2cb-8e9191787bd7_default_7e5a06f1-6805-41cb-b437-eefa7a83dd4b_0
4744a90ea2cd   a6f9492edeb7                "/bin/bash /dashboar…"   54 seconds ago   Up 53 seconds             k8s_dashboard-ddad5428-6be6-465f-b2cb-8e9191787bd7_dashboard-ddad5428-6be6-465f-b2cb-8e9191787bd7_default_d810b7da-e50b-4377-86f7-bd4675d6a068_0
e302005e0be5   660f45b845fa                "/bin/sh -c 'mkfifo …"   54 seconds ago   Up 53 seconds             k8s_client2-ddad5428-6be6-465f-b2cb-8e9191787bd7_client2-ddad5428-6be6-465f-b2cb-8e9191787bd7_default_12a5e18c-6550-4888-8388-e3bd2ce48570_0
83a2fb8f630b   registry.k8s.io/pause:3.9   "/pause"                 54 seconds ago   Up 53 seconds             k8s_POD_client3-ddad5428-6be6-465f-b2cb-8e9191787bd7_default_26d1d5f6-da4a-4cb2-8d14-a10a73e45ee3_0
f9c87a1b47e4   0ac8df3aed9a                "/usr/bin/pid1 /usr/…"   54 seconds ago   Up 53 seconds             k8s_bootstrapper_bootstrapper-cdncb_default_4692387e-387c-4896-abd8-f8a9ecb3b586_0
61e5435e1eb0   registry.k8s.io/pause:3.9   "/pause"                 54 seconds ago   Up 53 seconds             k8s_POD_server-ddad5428-6be6-465f-b2cb-8e9191787bd7-cbfbd9bfb-6x248_default_4ac5d81f-aac9-4e82-8eca-b71475e8e3af_0
14b69c364345   registry.k8s.io/pause:3.9   "/pause"                 54 seconds ago   Up 53 seconds             k8s_POD_server-ddad5428-6be6-465f-b2cb-8e9191787bd7-cbfbd9bfb-9hv9v_default_b18833ff-abc1-43ed-abbc-3ff637107a9d_0
5eb286afa052   registry.k8s.io/pause:3.9   "/pause"                 54 seconds ago   Up 53 seconds             k8s_POD_server-ddad5428-6be6-465f-b2cb-8e9191787bd7-cbfbd9bfb-6pjzk_default_566f483d-56d1-4f94-a7d7-a4b51b245630_0
511363b83c81   registry.k8s.io/pause:3.9   "/pause"                 54 seconds ago   Up 53 seconds             k8s_POD_client2-ddad5428-6be6-465f-b2cb-8e9191787bd7_default_12a5e18c-6550-4888-8388-e3bd2ce48570_0
b70221ba6fbe   registry.k8s.io/pause:3.9   "/pause"                 54 seconds ago   Up 53 seconds             k8s_POD_client1-ddad5428-6be6-465f-b2cb-8e9191787bd7_default_7e5a06f1-6805-41cb-b437-eefa7a83dd4b_0
17fb3e466a31   registry.k8s.io/pause:3.9   "/pause"                 54 seconds ago   Up 53 seconds             k8s_POD_dashboard-ddad5428-6be6-465f-b2cb-8e9191787bd7_default_d810b7da-e50b-4377-86f7-bd4675d6a068_0
634643f02a66   registry.k8s.io/pause:3.9   "/pause"                 54 seconds ago   Up 54 seconds             k8s_POD_bootstrapper-cdncb_default_4692387e-387c-4896-abd8-f8a9ecb3b586_0
4e59daf9f57f   6e38f40d628d                "/storage-provisioner"   7 minutes ago    Up 7 minutes              k8s_storage-provisioner_storage-provisioner_kube-system_01c918e2-2bcb-4e64-af8e-fa0933e1288e_2
ab0ebc56b4b1   ead0a4a53df8                "/coredns -conf /etc…"   8 minutes ago    Up 8 minutes              k8s_coredns_coredns-5d78c9869d-ps8zs_kube-system_3ba73883-13e6-47ee-8bbc-7f818745cc8f_0
bd0636fd3d04   5780543258cf                "/usr/local/bin/kube…"   8 minutes ago    Up 8 minutes              k8s_kube-proxy_kube-proxy-5c9tk_kube-system_347398f4-9ed3-474a-8b13-114ba6b6f4b7_0
1907120cb2f4   registry.k8s.io/pause:3.9   "/pause"                 8 minutes ago    Up 8 minutes              k8s_POD_coredns-5d78c9869d-ps8zs_kube-system_3ba73883-13e6-47ee-8bbc-7f818745cc8f_0
1bf2190c2f47   registry.k8s.io/pause:3.9   "/pause"                 8 minutes ago    Up 8 minutes              k8s_POD_kube-proxy-5c9tk_kube-system_347398f4-9ed3-474a-8b13-114ba6b6f4b7_0
a8cd2c14f3a2   registry.k8s.io/pause:3.9   "/pause"                 8 minutes ago    Up 8 minutes              k8s_POD_storage-provisioner_kube-system_01c918e2-2bcb-4e64-af8e-fa0933e1288e_0
5622eb422678   08a0c939e61b                "kube-apiserver --ad…"   8 minutes ago    Up 8 minutes              k8s_kube-apiserver_kube-apiserver-minikube_kube-system_4e275e35949ad3fdfeb753c1099308e7_0
1859377018f9   86b6af7dd652                "etcd --advertise-cl…"   8 minutes ago    Up 8 minutes              k8s_etcd_etcd-minikube_kube-system_8af0e85a28544808d52bb7c47ad824ed_0
f54f91877ee1   41697ceeb70b                "kube-scheduler --au…"   8 minutes ago    Up 8 minutes              k8s_kube-scheduler_kube-scheduler-minikube_kube-system_e14e2f92c469337ac62a252dad99dcc5_0
01d93194983b   7cffc01dba0e                "kube-controller-man…"   8 minutes ago    Up 8 minutes              k8s_kube-controller-manager_kube-controller-manager-minikube_kube-system_e33f7a2a0d6aad5df18c7258d3116e25_0
0ee06e889ccb   registry.k8s.io/pause:3.9   "/pause"                 8 minutes ago    Up 8 minutes              k8s_POD_kube-apiserver-minikube_kube-system_4e275e35949ad3fdfeb753c1099308e7_0
843d0b44fb65   registry.k8s.io/pause:3.9   "/pause"                 8 minutes ago    Up 8 minutes              k8s_POD_etcd-minikube_kube-system_8af0e85a28544808d52bb7c47ad824ed_0
1cad515ab7c7   registry.k8s.io/pause:3.9   "/pause"                 8 minutes ago    Up 8 minutes              k8s_POD_kube-scheduler-minikube_kube-system_e14e2f92c469337ac62a252dad99dcc5_0
1758c83a725b   registry.k8s.io/pause:3.9   "/pause"                 8 minutes ago    Up 8 minutes              k8s_POD_kube-controller-manager-minikube_kube-system_e33f7a2a0d6aad5df18c7258d3116e25_0

When connected to a master node's docker in a kubeadm deployment, with the pods running, docker ps is empty:

docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

I'm not sure if this container visibility is a setting we can configure in Docker, but from what I could find it does not seem possible to have Docker show the Kubernetes containers the way Minikube does. Are you using a Docker setting that allows it to see the Kubernetes pods?
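
One related check (a guess on our side about what might matter here): whether Docker can ever see the pods depends on which container runtime the kubelet uses, and the pod status already exposes that through the container ID prefix. For example, with the official kubernetes Python client:

from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a pod
v1 = client.CoreV1Api()

for pod in v1.list_namespaced_pod('default').items:
    for cs in (pod.status.container_statuses or []):
        # container_id looks like "docker://<id>" when the Docker runtime backs the
        # node (as with Minikube here), and typically "containerd://<id>" on kubeadm.
        print(pod.metadata.name, cs.container_id)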

To give more details on our setup: we are running Docker version 24.0.5 and Kubernetes client/server version 1.27. We are trying to deploy the iperf3 example in a Kubernetes cluster started with kubeadm.
Images used:
- kollaps:2.0
- dashboard:1.0
- iper3-client:1.0
- iper3-server:1.0
We pushed the images to Dockerhub so that Kubernetes could pull them.
Using the previous setup, we have all the pods in a running state, waiting for the experiment to start.

kubectl get pods
NAME                                                           READY   STATUS    RESTARTS   AGE
bootstrapper-wfnnv                                             1/1     Running   0          14s
client1-ddad5428-6be6-465f-b2cb-8e9191787bd7                   1/1     Running   0          14s
client2-ddad5428-6be6-465f-b2cb-8e9191787bd7                   1/1     Running   0          14s
client3-ddad5428-6be6-465f-b2cb-8e9191787bd7                   1/1     Running   0          14s
dashboard-ddad5428-6be6-465f-b2cb-8e9191787bd7                 1/1     Running   0          14s
server-ddad5428-6be6-465f-b2cb-8e9191787bd7-6d7c6f88c4-9d8vc   1/1     Running   0          14s
server-ddad5428-6be6-465f-b2cb-8e9191787bd7-6d7c6f88c4-wvhwk   1/1     Running   0          14s
server-ddad5428-6be6-465f-b2cb-8e9191787bd7-6d7c6f88c4-zb5v9   1/1     Running   0          14s

At this point we try to reach the dashboard but can't, because it is waiting for the bootstrapper to start it using nsenter.
The bootstrapper does not start the dashboard because it is waiting to find the local dashboard container through the Docker API.
It never finds it, because the Docker API cannot see the Kubernetes containers in a normal Kubernetes cluster (at least by default).
Here are the logs of the bootstrapper (none of the other pods have logs):

kubectl logs bootstrapper-wfnnv
[Py (Bootstrapper)] Kubernetes bootstrapping started...
[Py (Bootstrapper)] bootstrapping all containers with label ddad5428-6be6-465f-b2cb-8e9191787bd7.
[Py (god)] found god_id.
[Py (god)] ip: (not yet known), nr. of gods: 1
[Py (god1)] 10.85.169.0 :: ['HELLO', '244899788002549520994656052955328513698']
[Py (god2)] 10.85.169.0 :: READY
[Py (god)] ip: 10.85.169.0, nr. of gods: 1
[Py (god)] local IP: 10.85.169.0,
           remote IPs:
[Py (god)] resolved all IPs
[Py (god)] started rust handler.
ERROR: 'NoneType' object is not subscriptable

We're not sure what the error at the end of the log is, but it only happens once and the bootstrapper resumes normal behavior in the while-true loop, as seen in the logs in the initial message.

To summarize our issue, does the current Kubernetes support only work with Minikube? If not, how can we make the Kubernetes deployment work with kubeadm? Please let us know if you need extra information. Thank you for the help.

@sebastiaoamaro
Collaborator

Thanks for the detailed description! As it stands we need Minikube, for the reasons you mentioned.
To make it work with kubeadm we need an alternative way of finding the PIDs of the containers associated with the experiment (and of matching each container with its role in the experiment: iperf3 clients/servers/dashboard, etc.).

I will look into this and try to find a solution.

@Nandinski

Thank you for the reply.
We appreciate you looking into how to make Kollaps work with Kubernetes outside a Minikube deployment. Please let us know once it is possible.
Perhaps you're already looking into this, but one possible solution to get the PID without Docker is to get it directly from crictl; from what I understand, it is the standard CLI for the CRI, the interface Kubernetes uses to talk to container runtimes. We can use it to get the container name and the PID:
https://serverfault.com/questions/1055159/how-to-find-out-pid-of-the-container-using-crictl
Sadly, there doesn't seem to be a Python API for interacting with crictl; it's only a CLI tool.
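
As a rough idea of what that could look like (a sketch; it assumes crictl is installed on the node and reachable from wherever Kollaps runs, e.g. via host mounts and sufficient privileges), crictl can be driven from Python through subprocess and its JSON output parsed for the PID:

import json
import subprocess

def pid_from_crictl(container_id: str) -> int:
    # Strip a "docker://" / "containerd://" prefix if the id came from the k8s API.
    bare_id = container_id.split('://', 1)[-1]
    out = subprocess.check_output(['crictl', 'inspect', bare_id])
    info = json.loads(out)
    # crictl inspect exposes the container's PID under the "info" section of its JSON output.
    return info['info']['pid']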
