A Cassandra cluster running on DOKS (DigitalOcean K8s), built as part of the DO (DigitalOcean) K8s challenge (https://www.digitalocean.com/community/pages/kubernetes-challenge).
Project Name: doks-cassandra
Repo: https://github.com/marchesir/doks-cassandra
To provide a scalable, HA (High Availability) Cassandra NoSQL cluster on K8s, the following components are required:
- DOKS Cluster:
  a. Default cluster with 3 nodes;
  b. Minimum of 4GB RAM and 2 CPUs per node, otherwise Cassandra fails to start up due to its limit settings;
- K8s StatefulSet (https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/), required so that state is maintained;
- K8s Service with clusterIP set to None (a headless service), required for DNS lookup of Cassandra within the cluster;
- K8s StorageClass with volumeBindingMode set to WaitForFirstConsumer, needed for the dynamic persistent storage that holds the Cassandra data, so that data is preserved when scaling up/down.
1. The following default environment variables are set and can be overridden if needed:
export DO_REGION=ams3
export DO_SIZE=s-2vcpu-4gb
Do not set DO_SIZE to anything smaller.
2. Set the name of the DOKS cluster and the access token:
export DOKS_NAME=myk8s
export DO_ACCESS_TOKEN=mytoken
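For illustration, overridable defaults like the variables above are commonly written in shell with parameter expansion. The sketch below is an assumption about how such a script could be structured, not a copy of doks_create.sh, and the helper name missing_vars is hypothetical.

```shell
# Keep any value the caller already exported; otherwise fall back to the
# defaults documented above.
: "${DO_REGION:=ams3}"
: "${DO_SIZE:=s-2vcpu-4gb}"
export DO_REGION DO_SIZE

# Hypothetical helper: print the name of every required variable that is
# still unset or empty, one per line.
missing_vars() {
  for name in DOKS_NAME DO_ACCESS_TOKEN; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}
```

Calling missing_vars before creating the cluster prints nothing when both DOKS_NAME and DO_ACCESS_TOKEN are set, which makes it easy to stop early with a clear message.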
3. Make sure all .sh scripts are executable:
chmod +x *.sh
Run the create script (this can take up to 10 minutes):
./doks_create.sh
4. Verify the cluster with kubectl get nodes -o wide; the nodes and their properties will be displayed, e.g.:
k8scassandra-default-pool-u6zte v1.21.5 10.110.0.3 164.92.220.139 containerd://1.4.11
5. First, install the K8s Cassandra service with kubectl apply -f cassandra-service.yml and verify with kubectl get service cassandra:
cassandra ClusterIP None 9042/TCP 43s
Note: CLUSTER-IP is None and EXTERNAL-IP is empty because this is a headless service that exists only for DNS lookup by Cassandra.
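As a rough sketch of what a headless service manifest for this setup could look like, the snippet below writes one to a scratch file. The name cassandra and port 9042 come from the output above; the file name and the app: cassandra label and selector are assumptions, and the repo's cassandra-service.yml may differ.

```shell
# Write an illustrative headless Service manifest to a scratch file.
# The file name and the "app: cassandra" label/selector are assumptions.
manifest="${TMPDIR:-/tmp}/cassandra-service.example.yml"
cat > "$manifest" <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  clusterIP: None        # headless: cluster DNS resolves to the pod IPs
  ports:
    - port: 9042         # CQL port, matching the 9042/TCP shown above
  selector:
    app: cassandra       # must match the labels on the Cassandra pods
EOF
echo "wrote $manifest"
# To try it: kubectl apply -f "$manifest"
```

Because no cluster IP is allocated, each StatefulSet pod gets its own stable DNS record, which is what lets the Cassandra nodes find each other by name.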
6. This step is very important: create a new StorageClass and patch it so that it becomes the default.
- Create the StorageClass fast with kubectl apply -f st.yml and list all StorageClasses with kubectl get sc:
do-block-storage (default) dobs.csi.digitalocean.com Delete Immediate true 16m
fast dobs.csi.digitalocean.com Delete WaitForFirstConsumer true 23s
As can be seen, WaitForFirstConsumer is set on the fast StorageClass.
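A StorageClass with the properties listed in the kubectl get sc output above could be declared roughly as follows. The field values are taken from that output; the scratch file name is hypothetical and the repo's st.yml may be written differently.

```shell
# Write an illustrative StorageClass manifest to a scratch file.
# The file name is an assumption; the values mirror the `kubectl get sc` output.
sc_manifest="${TMPDIR:-/tmp}/storageclass-fast.example.yml"
cat > "$sc_manifest" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: dobs.csi.digitalocean.com
reclaimPolicy: Delete
allowVolumeExpansion: true
# Delay volume creation until a pod that uses the claim is scheduled.
volumeBindingMode: WaitForFirstConsumer
EOF
echo "wrote $sc_manifest"
# To try it: kubectl apply -f "$sc_manifest"
```

With WaitForFirstConsumer, the volume is provisioned only once the scheduler has placed the pod, rather than as soon as the claim is created.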
- To make fast the default, run the following 2 commands:
kubectl patch storageclass do-block-storage -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
kubectl patch storageclass fast -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
- Verify that the fast StorageClass is now the default with kubectl get sc:
do-block-storage dobs.csi.digitalocean.com Delete Immediate true 20m
fast (default) dobs.csi.digitalocean.com Delete WaitForFirstConsumer true 5m2s
7. All components are now installed except the Cassandra StatefulSet, which includes the following:
- Google Docker image with Cassandra and the required tools;
- CPU/memory limits of 0.25 CPU and 1Gi RAM;
- Dynamic persistent storage to hold the Cassandra data;
- Configuration of the Cassandra datacenter/ring with 2 initial pods.
- Install with kubectl apply -f cassandra-statefulset.yml; this part can take some time to spin up:
  - Pods may report a "readiness probe failed" error even though Cassandra is actually OK;
  - Verify that the K8s events are all OK with kubectl get events --sort-by=.metadata.creationTimestamp;
  - Next, check the StatefulSet with kubectl get statefulset cassandra:
cassandra 2/2 7m
  - Now verify the pods with kubectl get pods:
cassandra-0 1/1 Running 0 8m11s
cassandra-1 1/1 Running 0 7m14s
  - If desired, the raw Cassandra logs can be tailed per pod:
kubectl logs -f cassandra-0
8. Verify that the Cassandra cluster (DC/ring) is running with the following command on any Cassandra node/pod:
kubectl exec -it cassandra-0 -- nodetool status
Datacenter: DC1-Cassandra1
==========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.244.0.85 65.81 KiB 32 100.0% 0d8fedd1-ee5a-49b4-9ffa-0a0efcd291ac Rack1-Cassandra1
UN 10.244.1.235 104.55 KiB 32 100.0% b7b22f5b-f549-4f05-9c44-db9df44cd52c Rack1-Cassandra1
This shows all is up and running.
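The same check can be scripted by counting the rows whose first column is UN (Up/Normal). The helper below is a minimal sketch; the function name count_up_normal is hypothetical and it only looks at the first column of each line.

```shell
# Hypothetical helper: read `nodetool status` output on stdin and print how
# many nodes are in state UN (Up/Normal).
count_up_normal() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Example with the two data rows shown above; prints 2.
printf '%s\n' \
  'UN 10.244.0.85 65.81 KiB 32 100.0% 0d8fedd1-ee5a-49b4-9ffa-0a0efcd291ac Rack1-Cassandra1' \
  'UN 10.244.1.235 104.55 KiB 32 100.0% b7b22f5b-f549-4f05-9c44-db9df44cd52c Rack1-Cassandra1' \
  | count_up_normal
```

Against a live cluster the input would come from kubectl exec cassandra-0 -- nodetool status (without -it, so the output can be piped), and the result can be compared with the expected number of replicas.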
9. Verify that each pod has its persistent storage created with kubectl get pv:
pvc-6f1047eb-485d-4390-8b65-b54382a248bc 1Gi RWO Delete Bound default/cassandra-data-cassandra-0 fast
pvc-d9f6b30a-261c-46ac-b4d1-9953e043e4d0 1Gi RWO Delete Bound default/cassandra-data-cassandra-1 fast
As can be seen, two 1Gi persistent volumes have been created, one per pod.
10. Test scaling from 2 pods to 3 with the following command:
kubectl scale statefulsets cassandra --replicas=3
Verify with kubectl get pods or kubectl get statefulset cassandra, then verify the Cassandra cluster state with kubectl exec -it cassandra-0 -- nodetool status, as shown below:
cassandra 3/3 77m
Datacenter: DC1-Cassandra1
==========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.244.0.85 98.49 KiB 32 60.7% 0d8fedd1-ee5a-49b4-9ffa-0a0efcd291ac Rack1-Cassandra1
UN 10.244.1.163 84.81 KiB 32 66.1% ae8b8cee-e403-4f90-be58-2ad52f221997 Rack1-Cassandra1
UN 10.244.1.235 135 KiB 32 73.2% b7b22f5b-f549-4f05-9c44-db9df44cd52c Rack1-Cassandra1
Finally, rerun the scale command with --replicas=2 to reduce the pods back to 2. In this case Cassandra will drain the data from the dying pod and push it to the remaining live pods:
cassandra-0 1/1 Running 0 81m
cassandra-1 1/1 Running 0 80m
As can be seen, we are back to 2 pods and cassandra-2 has been deleted. Running kubectl get pv shows that the persistent volume of the deleted pod, cassandra-2, is still present:
pvc-6f1047eb-485d-4390-8b65-b54382a248bc 1Gi RWO Delete Bound default/cassandra-data-cassandra-0 fast
pvc-cc6578f7-b09e-4db3-aabb-d7a8a1919ebd 1Gi RWO Delete Bound default/cassandra-data-cassandra-2 fast
pvc-d9f6b30a-261c-46ac-b4d1-9953e043e4d0 1Gi RWO Delete Bound default/cassandra-data-cassandra-1 fast
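By default, scaling a StatefulSet down does not delete the claims created for its pods, which is why the cassandra-2 volume remains. The sketch below shows one way to spot such leftover claims; the helper name orphaned_claims is hypothetical, and it assumes claim names end in the pod ordinal, as they do in the listing above.

```shell
# Hypothetical helper: given the current replica count as $1 and claim names
# on stdin (one per line), print the claims whose trailing pod ordinal is no
# longer in use.
orphaned_claims() {
  awk -v replicas="$1" '{
    count = split($1, parts, "-")
    if (parts[count] + 0 >= replicas + 0) print $1
  }'
}

# Example with the three claims shown above and 2 replicas.
printf '%s\n' \
  default/cassandra-data-cassandra-0 \
  default/cassandra-data-cassandra-2 \
  default/cassandra-data-cassandra-1 \
  | orphaned_claims 2
# prints: default/cassandra-data-cassandra-2
```

A leftover claim can then be removed with kubectl delete pvc cassandra-data-cassandra-2. Since the reclaim policy shown above is Delete, this also deletes the underlying volume and its data, so it should only be done once the data is no longer needed.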
11. To clean up, run ./doks_delete.sh
There are many improvements required to make this more "production ready":
- Add a dedicated namespace combined with RBAC for better management/security;
- Add an HPA (Horizontal Pod Autoscaler) to automatically size the cluster based on resource usage;
- Tweak the node size to best fit Cassandra's needs;
- Create a custom Dockerfile to better fit needs;
- Fix the "readiness probe failed" health check error by understanding why Cassandra causes the K8s probe to fail, possibly adding a CRD or similar;
- Package everything via Helm or Kustomize, and use Terraform/Pulumi for better infra automation;
- Add routing via ingress or similar so that Cassandra can be accessed externally, with sufficient firewall rules and/or ingress/egress rules.