kube-monkey is a tool to test the resiliency of the system. It deletes random pods repeatedly at specific intervals.
The tool is inspired by Chaos Monkey. It simply kills random pods in a Kubernetes cluster. There are a few ways to control which pods can be killed and at what intervals; these are described below.
The tool is written in Go and uses the official client-go library. To expose the health check API, it uses the mux project.
Once you have cloned the repo, run the command below at the root of the repo.
make dep
This installs all the dependencies of the repo. Note that this project does not use dependency management or vendoring yet, so the behaviour might be different for you. Dependency management will be added in the future.
To build a static binary for Linux systems, run the command below at the root of the repository.
make build
This creates a binary called kube-monkey at the root of the repository.
This tool is built on the assumption that it runs inside the Kubernetes cluster as a pod. It is therefore important that the container runs inside a Kubernetes pod so that it can authenticate with the Kubernetes API server.
Since it discovers and deletes other pods, it needs to run with a proper service account that has the required permissions. To create the required service account with those permissions, run the command below. Note that the command assumes that a Kubernetes cluster is running and that kubectl is configured to communicate with it.
kubectl create -f k8s-deploy/rbac.yaml
The above command creates the following resources in the Kubernetes cluster:
- A ClusterRole with the name kube-monkey
- A ServiceAccount with the name kube-monkey
- A ClusterRoleBinding binding these in the namespace default
Then, to deploy kube-monkey as a Kubernetes deployment, run the command below. Note that the image is pulled from the Docker repo msvbhat/kube-monkey. If you have built another Docker image, probably with a custom-built binary, please update the image in the file.
kubectl create -f k8s-deploy/kube-monkey.yaml
By default, 50% of the pods are killed every 2 minutes. Pods running in the kube-system namespace are whitelisted by default. To control this behaviour, use the environment variables below in the deployment manifest.
- NAMESPACE_WHITELIST - This is a space-separated list of namespaces that are exempt from pod killing. Pods running in these namespaces are not considered for deletion. The namespace kube-system is always whitelisted, even if it is not specified in the list.
- DELETE_PERCENTAGE - This is the percentage of pods that should be deleted. To not delete any pods, specify 0; to delete all pods, specify 100. Note that this percentage is applied only to the pods that are eligible for deletion, i.e. those not running in whitelisted namespaces.
- KM_SCHEDULE - This is the schedule on which kube-monkey deletes pods. It follows the cron syntax. To understand more about the cron syntax that is allowed, please check the docs.
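How the whitelist and the percentage combine can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's actual code: the pod type and the helpers parseWhitelist, eligiblePods, and deleteCount are hypothetical, and rounding down when the percentage does not divide evenly is an assumption (the real tool may round differently). Scheduling via KM_SCHEDULE is left out.

```go
package main

import (
	"fmt"
	"math/rand"
	"os"
	"strconv"
	"strings"
)

// pod is a hypothetical stand-in for a Kubernetes pod object.
type pod struct {
	Namespace, Name string
}

// parseWhitelist splits a space-separated namespace list and always
// includes kube-system, matching the documented behaviour.
func parseWhitelist(raw string) map[string]bool {
	whitelist := map[string]bool{"kube-system": true}
	for _, ns := range strings.Fields(raw) {
		whitelist[ns] = true
	}
	return whitelist
}

// eligiblePods drops pods that run in whitelisted namespaces.
func eligiblePods(pods []pod, whitelist map[string]bool) []pod {
	var out []pod
	for _, p := range pods {
		if !whitelist[p.Namespace] {
			out = append(out, p)
		}
	}
	return out
}

// deleteCount applies the percentage to the eligible pods only.
// Clamping to 0..100 and rounding down are assumptions of this sketch.
func deleteCount(eligible, percentage int) int {
	if percentage < 0 {
		percentage = 0
	}
	if percentage > 100 {
		percentage = 100
	}
	return eligible * percentage / 100
}

func main() {
	whitelist := parseWhitelist(os.Getenv("NAMESPACE_WHITELIST"))
	percentage, err := strconv.Atoi(os.Getenv("DELETE_PERCENTAGE"))
	if err != nil {
		percentage = 50 // documented default
	}
	pods := []pod{
		{"kube-system", "kube-dns"},
		{"default", "web-1"},
		{"default", "web-2"},
		{"staging", "api-1"},
		{"staging", "api-2"},
	}
	eligible := eligiblePods(pods, whitelist)
	// Shuffle so that the pods to delete are picked at random.
	rand.Shuffle(len(eligible), func(i, j int) {
		eligible[i], eligible[j] = eligible[j], eligible[i]
	})
	for _, p := range eligible[:deleteCount(len(eligible), percentage)] {
		fmt.Printf("would delete %s/%s\n", p.Namespace, p.Name)
	}
}
```

With both variables unset, four of the five sample pods are eligible and the default of 50% selects two of them; the kube-system pod is never a candidate.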
This has only been tested with minikube, but it is expected to run in any Kubernetes cluster.
The project doesn't have unit tests yet. Unit tests will be added soon.
Currently the /metrics endpoint is a dummy endpoint. It doesn't return any metrics but only returns 200 OK.
- Define metrics and export them through the metrics endpoint
- Add a more sophisticated method of selecting pods, e.g. by specific labels
- Also add blacklisting of namespaces
- Send events to pods for visibility
- Use CLI args instead of env variables
- Add more options, e.g. deleting pods on specific nodes only