Gamified Chaos Engineering Tool for K8s
This project is part of the Cloud Native Computing Foundation landscape, in the Observability and Analysis - Chaos Engineering section.
- Launch the demo at this link https://kubeinvaders.platformengineering.it
- Monitor the pod status here https://kubeopsview.platformengineering.it
Backed by the teams at platformengineering.it and devopstribe.it, which provide enterprise-grade features and certified resilience services for your Kubernetes infrastructure.
We have embedded a demo on the DevOpsTRibe blog for you to try out the tool.
Here are the slides from the Chaos Engineering speech I prepared for FOSDEM 2023. Unfortunately, I could not be present at my talk, but I would still like to share them with the community.
- Description
- Installation
- Usage
- Architecture
- Persistence
- Generic Troubleshooting & Known Problems
- Troubleshooting Unknown Namespace
- Metrics
- Security
- Community
- Community blogs and videos
- License
With k-inv, you can stress a K8s cluster in a fun way and check how resilient it is.
Before you start, you need a token from a service account that has this clusterrole.
Create the required components (assumes k8s v1.24+):
cat << 'EOF' | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: kubeinvaders
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kinv-cr
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - pods/log
    verbs:
      - delete
  - apiGroups:
      - batch
      - extensions
    resources:
      - jobs
    verbs:
      - get
      - list
      - watch
      - create
      - update
      - patch
      - delete
  - apiGroups:
      - "*"
    resources:
      - "*"
    verbs:
      - get
      - watch
      - list
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kinv-sa
  namespace: kubeinvaders
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kinv-crb
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kinv-cr
subjects:
  - kind: ServiceAccount
    name: kinv-sa
    namespace: kubeinvaders
---
apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: kinv-sa-token
  namespace: kubeinvaders
  annotations:
    kubernetes.io/service-account.name: kinv-sa
EOF
Extract the token:
TOKEN=$(kubectl get secret -n kubeinvaders -o go-template='{{.data.token | base64decode}}' kinv-sa-token)
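Before passing the token to the container, it can help to confirm that the extraction produced something plausible. Service account tokens are JWTs, so they consist of three dot-separated base64url segments. The helper below is a minimal sketch of such a check; the function name `looks_like_jwt` is hypothetical, and the test is purely structural (it does not verify the signature or the expiry).

```shell
# Hypothetical helper: succeeds if the argument has the three-segment
# shape of a JWT (header.payload.signature). Structural check only.
looks_like_jwt() {
  case "$1" in
    *.*.*.*) return 1 ;;   # more than three segments
    ?*.?*.?*) ;;           # exactly three non-empty segments
    *) return 1 ;;         # empty, or fewer than three segments
  esac
  # Reject any character outside the base64url alphabet and the dots.
  case "$1" in
    *[!A-Za-z0-9._-]*) return 1 ;;
  esac
  return 0
}

# Example use after extracting the token:
#   looks_like_jwt "$TOKEN" || echo "TOKEN is empty or malformed" >&2
```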
Run the container:
podman run -p 3131:8080 \
--env K8S_TOKEN=$TOKEN \
--env ENDPOINT=localhost:3131 \
--env INSECURE_ENDPOINT=true \
--env KUBERNETES_SERVICE_HOST=10.10.10.4 \
--env KUBERNETES_SERVICE_PORT_HTTPS=6443 \
--env NAMESPACE=namespace1,namespace2 \
luckysideburn/kubeinvaders:latest
Given this example, you can access k-inv at the following address: http://localhost:3131
- Please pay attention to the command "podman run -p 3131:8080". Forwarding port 8080 is important.
- We suggest using INSECURE_ENDPOINT=true for local development environments.
- Follow the instructions above to create the token for K8S_TOKEN.
- In the example, we use the image tag latest; use latest_debug for debugging.
These are the permissions your service account must have. You can take an example from this clusterrole.
- apiGroups: [""] resources: ["pods", "pods/log"] verbs: ["delete"]
- apiGroups: ["batch", "extensions"] resources: ["jobs"] verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["*"] resources: ["*"] verbs: ["get", "watch", "list"]
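The rules above can be checked against a live cluster with `kubectl auth can-i` and impersonation. This is a minimal sketch, assuming the service account is `kinv-sa` in the `kubeinvaders` namespace (as in the manifest earlier in this document) and that your own user is allowed to impersonate service accounts. The function name `check_kinv_rbac` and the `KUBECTL` override variable are hypothetical.

```shell
# KUBECTL can be overridden, e.g. KUBECTL=echo for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Ask the API server whether the service account may perform a sample of
# the actions granted by the rules above.
# Arguments: <target-namespace> [sa-namespace] [sa-name]
check_kinv_rbac() {
  ns="$1"
  sa="system:serviceaccount:${2:-kubeinvaders}:${3:-kinv-sa}"
  rc=0
  for action in "delete pods" "list pods" "watch pods" "create jobs" "delete jobs"; do
    # $action is intentionally unquoted so it splits into verb and resource.
    if $KUBECTL auth can-i $action --as="$sa" -n "$ns" >/dev/null; then
      echo "ok:      $action"
    else
      echo "MISSING: $action"
      rc=1
    fi
  done
  return "$rc"
}

# Example: check_kinv_rbac namespace1
```

`kubectl auth can-i` exits with a non-zero status when the answer is "no", which is what the `if` relies on.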
ENDPOINT: Host and port of the web console.
INSECURE_ENDPOINT: Select HTTP or HTTPS for the web console.
KUBERNETES_SERVICE_HOST: IP address or DNS name of your control plane.
KUBERNETES_SERVICE_PORT_HTTPS: TCP port of the target control plane.
NAMESPACE: List the namespaces you want to stress or whose logs you want to see (logs are a beta feature; they might not work or could slow down the browser).
docker run -p 8080:8080 \
--env K8S_TOKEN=<k8s_service_account_token> \
--env ENDPOINT=localhost:8080 \
--env INSECURE_ENDPOINT=true \
--env KUBERNETES_SERVICE_HOST=<k8s_controlplane_host> \
--env KUBERNETES_SERVICE_PORT_HTTPS=<k8s_controlplane_port> \
--env NAMESPACE=<comma_separated_namespaces_to_stress> \
luckysideburn/kubeinvaders:develop
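A forgotten variable is an easy mistake to make with a run command this long. The sketch below lists any of the variables passed with `--env` above that are unset or empty in the current shell. It treats all six as required, which is an assumption rather than something this document states, and the function name `missing_kinv_env` is hypothetical.

```shell
# Print the name of every variable that is unset or empty and return
# non-zero if any is missing. The list mirrors the --env flags above.
missing_kinv_env() {
  missing=0
  for name in K8S_TOKEN ENDPOINT INSECURE_ENDPOINT \
              KUBERNETES_SERVICE_HOST KUBERNETES_SERVICE_PORT_HTTPS NAMESPACE; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Example: missing_kinv_env || echo "set the variables listed above first" >&2
```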
If you need a lab Kubernetes cluster, you can use this setup via Make and Minikube. Follow this readme.
helm repo add kubeinvaders https://lucky-sideburn.github.io/helm-charts/
helm repo update
kubectl create namespace kubeinvaders
helm install kubeinvaders --set-string config.target_namespace="namespace1\,namespace2" \
-n kubeinvaders kubeinvaders/kubeinvaders --set ingress.enabled=true --set ingress.hostName=kubeinvaders.io --set deployment.image.tag=latest
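The backslash in `namespace1\,namespace2` is needed because Helm uses commas to separate multiple key=value pairs in `--set` and `--set-string`, so a comma inside a value has to be escaped. A small sketch that does the escaping for an arbitrary list; the function name `helm_escape_commas` is hypothetical.

```shell
# Escape every comma so Helm's --set/--set-string parser keeps the
# value in one piece.
helm_escape_commas() {
  printf '%s' "$1" | sed 's/,/\\,/g'
}

# Example:
#   helm install kubeinvaders kubeinvaders/kubeinvaders -n kubeinvaders \
#     --set-string config.target_namespace="$(helm_escape_commas 'namespace1,namespace2')"
```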
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--disable traefik" sh -s -
cat >/tmp/ingress-nginx.yaml <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: ingress-nginx
---
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: ingress-nginx
  namespace: kube-system
spec:
  chart: ingress-nginx
  repo: https://kubernetes.github.io/ingress-nginx
  targetNamespace: ingress-nginx
  version: v4.9.0
  set:
  valuesContent: |-
    fullnameOverride: ingress-nginx
    controller:
      kind: DaemonSet
      hostNetwork: true
      hostPort:
        enabled: true
      service:
        enabled: false
      publishService:
        enabled: false
      metrics:
        enabled: false
        serviceMonitor:
          enabled: false
      config:
        use-forwarded-headers: "true"
EOF
kubectl create -f /tmp/ingress-nginx.yaml
kubectl create ns namespace1
kubectl create ns namespace2
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
helm install kubeinvaders --set-string config.target_namespace="namespace1\,namespace2" \
-n kubeinvaders kubeinvaders/kubeinvaders --set ingress.enabled=true --set ingress.hostName=kubeinvaders.io --set deployment.image.tag=latest
helm install kubeinvaders --set-string config.target_namespace="namespace1\,namespace2" -n kubeinvaders kubeinvaders/kubeinvaders --set ingress.enabled=true --set ingress.hostName=kubeinvaders.local --set deployment.image.tag=latest --set service.type=LoadBalancer --set service.port=80
kubectl set env deployment/kubeinvaders INSECURE_ENDPOINT=true -n kubeinvaders
oc adm policy add-scc-to-user anyuid -z kubeinvaders
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: kubeinvaders
  namespace: "kubeinvaders"
spec:
  host: "kubeinvaders.io"
  to:
    name: kubeinvaders
  tls:
    termination: Edge
cat >deployment.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 20 # tells deployment to run 20 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.24.0
          ports:
            - containerPort: 80 # nginx listens on port 80 by default
EOF
Apply Nginx Deployment in namespace1 and namespace2
sudo kubectl apply -f deployment.yaml -n namespace1
sudo kubectl apply -f deployment.yaml -n namespace2
At the top you will find some metrics as described below:
Current Replicas State Delay is a metric that shows how long the cluster takes to return to the desired number of pod replicas.
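To get a feel for what this metric captures, a similar delay can be approximated from the command line by polling a Deployment until its ready replicas match the desired count. This is only a rough sketch and not how KubeInvaders computes its own metric. The function name `replicas_recovery_seconds` and the `KUBECTL` override variable are hypothetical, and the timer starts when the function is called, so it should be started right after pods are deleted.

```shell
# KUBECTL can be overridden, e.g. with a stub for testing.
KUBECTL="${KUBECTL:-kubectl}"

# Print the number of seconds until status.readyReplicas equals
# spec.replicas for a Deployment.
# Arguments: <namespace> <deployment> [poll-interval-seconds]
replicas_recovery_seconds() {
  ns="$1"; deploy="$2"; interval="${3:-1}"
  start=$(date +%s)
  while :; do
    desired=$($KUBECTL get deployment "$deploy" -n "$ns" -o jsonpath='{.spec.replicas}')
    if [ -z "$desired" ]; then
      echo "could not read spec.replicas for $deploy in $ns" >&2
      return 1
    fi
    ready=$($KUBECTL get deployment "$deploy" -n "$ns" -o jsonpath='{.status.readyReplicas}')
    # readyReplicas is omitted from the status while it is zero.
    if [ "${ready:-0}" = "$desired" ]; then
      break
    fi
    sleep "$interval"
  done
  echo $(( $(date +%s) - start ))
}

# Example: replicas_recovery_seconds namespace1 nginx-deployment
```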
This is a control panel you can use to switch various features on and off.
Press the "Start" button to initiate the automatic pilot (the button changes to "Stop" to disable this feature).
Press the "Enable Shuffle" button to randomly rearrange the positions of pods or K8s nodes (the button changes to "Disable Shuffle" to deactivate this feature).
Press the "Auto NS Switch" button to randomly switch between namespaces (the button changes to "Disable Auto NS Switch" to deactivate this feature).
Press the "Hide Pods Name" button to conceal the names of the pods beneath the aliens (the button changes to "Show Pods Name" to deactivate this feature).
As described below, on the game screen near the spaceship, there are details about the current cluster, namespace, and some configurations.
Under the + and - buttons, a bar appears with the latest game events.
Press 'h' or select 'Show Special Keys' from the menu.
Press the + or - buttons to increase or decrease the size of the game screen.
- Select "Show Current Chaos Container for Nodes" from the menu to see which container starts when you attack a worker node (not an alien, they are pods).
- Select "Set Custom Chaos Container for Nodes" from the menu to use your preferred image or configuration against nodes.
K-inv uses Redis to save and manage data. Redis is configured with "appendonly".
Currently, the Helm chart does not support PersistentVolumes, but this task is on the to-do list...
- It seems that KubeInvaders does not work with EKS due to problems with ServiceAccount.
- Currently, the installation of KubeInvaders into a namespace that is not named "kubeinvaders" is not supported.
- I have only tested KubeInvaders with a Kubernetes cluster installed through KubeSpray.
- If you don't see aliens, please follow these steps:
- Open a terminal and run "kubectl logs <pod_of_kubeinvader> -n kubeinvaders -f"
- Execute the following command from another terminal:
curl "https://<your_kubeinvaders_url>/kube/pods?action=list&namespace=namespace1" -k
- Open an issue with attached logs.
- Check if the namespaces declared with helm config.target_namespace (e.g., config.target_namespace="namespace1,namespace2") exist and contain some pods.
- Check your browser's developer console for any failed HTTP requests (send them to luckysideburn[at]gmail[dot]com or open an issue on this repo).
- Try using latest_debug and send logs to luckysideburn[at]gmail[dot]com or open an issue on this repo.
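The namespace check from the list above can be scripted. The following is a minimal sketch that takes the same comma-separated list given to config.target_namespace and reports, for each namespace, whether it exists and how many pods it contains. The function name `check_target_namespaces` and the `KUBECTL` override variable are hypothetical.

```shell
# KUBECTL can be overridden, e.g. with a stub for testing.
KUBECTL="${KUBECTL:-kubectl}"

# Argument: comma-separated namespaces, e.g. "namespace1,namespace2".
# Returns non-zero if a namespace is missing or contains no pods.
check_target_namespaces() {
  rc=0
  for ns in $(printf '%s' "$1" | tr ',' ' '); do
    if ! $KUBECTL get namespace "$ns" >/dev/null 2>&1; then
      echo "$ns: namespace not found"
      rc=1
      continue
    fi
    # With --no-headers, each output line is one pod.
    count=$($KUBECTL get pods -n "$ns" --no-headers 2>/dev/null | wc -l | tr -d ' ')
    echo "$ns: $count pod(s)"
    [ "$count" -gt 0 ] || rc=1
  done
  return "$rc"
}

# Example: check_target_namespaces "namespace1,namespace2"
```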
KubeInvaders exposes metrics for Prometheus through the standard endpoint /metrics.
Here is an example of Prometheus configuration:
scrape_configs:
  - job_name: kubeinvaders
    static_configs:
      - targets:
          - kubeinvaders.kubeinvaders.svc.cluster.local:8080
Example of metrics:
| Metric | Description |
|---|---|
| chaos_jobs_node_count{node=workernode01} | Total number of chaos jobs executed per node |
| chaos_node_jobs_total | Total number of chaos jobs executed against all worker nodes |
| deleted_pods_total 16 | Total number of deleted pods |
| deleted_namespace_pods_count{namespace=myawesomenamespace} | Total number of deleted pods per namespace |
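Without a Prometheus server at hand, a single value can be read straight from the /metrics endpoint. The sketch below assumes the endpoint returns the usual Prometheus text format, one `name value` sample per line; the function name `metric_value` is hypothetical. For labelled metrics, the name passed in must include the label set exactly as it appears in the output.

```shell
# Read Prometheus text-format metrics on stdin and print the value of the
# first sample whose name (first field) matches the argument exactly.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example, using the service address from the scrape configuration above:
#   curl -s http://kubeinvaders.kubeinvaders.svc.cluster.local:8080/metrics \
#     | metric_value deleted_pods_total
```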
To restrict access to the KubeInvaders endpoint, add this annotation to the Ingress.
nginx.ingress.kubernetes.io/whitelist-source-range: <your_ip>/32
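The annotation can also be applied to an existing Ingress from the command line with `kubectl annotate`. This is a minimal sketch, assuming the Ingress resource is named `kubeinvaders` and lives in the `kubeinvaders` namespace; both defaults are guesses and can be overridden. The function name `restrict_kinv_ingress` and the `KUBECTL` override variable are hypothetical.

```shell
# KUBECTL can be overridden, e.g. KUBECTL=echo for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Allow only a single source IP to reach the Ingress.
# Arguments: <ip-address> [ingress-name] [namespace]
restrict_kinv_ingress() {
  $KUBECTL annotate ingress "${2:-kubeinvaders}" -n "${3:-kubeinvaders}" \
    "nginx.ingress.kubernetes.io/whitelist-source-range=$1/32" --overwrite
}

# Example: restrict_kinv_ingress 203.0.113.7
```

The `--overwrite` flag replaces an existing value of the annotation instead of failing.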
Please reach out for news, bugs, feature requests, and other issues via:
- On Twitter: @kubeinvaders & @luckysideburn
- New features are published on YouTube too in this channel
- AdaCon Norway Live Stream
- LILiS - Linux Day 2023 Benevento
- Kubernetes.io blog: KubeInvaders - Gamified Chaos Engineering Tool for Kubernetes
- acloudguru: cncf-state-of-the-union
- DevNation RedHat Developer: Twitter
- Flant: Open Source solutions for chaos engineering in Kubernetes
- Reeinvent: KubeInvaders - gamified chaos engineering
- Adrian Goins: K8s Chaos Engineering with KubeInvaders
- dbafromthecold: Chaos engineering for SQL Server running on AKS using KubeInvaders
- Pklinker: Gamification of Kubernetes Chaos Testing
- Openshift Commons Briefings: OpenShift Commons Briefing KubeInvaders: Chaos Engineering Tool for Kubernetes
- GitHub: awesome-kubernetes repo
- William Lam: Interesting Kubernetes application demos
- The Chief I/O: 5 Fun Ways to Use Kubernetes
- LuCkySideburn: Talk @ Codemotion
- Chaos Carnival: Chaos Engineering is fun!
- Kubeinvaders (old version) + OpenShift 4 Demo: YouTube_Video
- KubeInvaders (old version) Vs Openshift 4.1: YouTube_Video
- Chaos Engineering for SQL Server | Andrew Pruski | Conf42: Chaos Engineering: YouTube_Video
- nicholaschangblog: Introducing Azure Chaos Studio
- bugbug: Chaos Testing: Everything You Need To Know
KubeInvaders is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.