Backup
K8ssandra includes Medusa for Apache Cassandra™ to handle backup and restore for your Cassandra nodes. Medusa was recently upgraded to support all S3-compatible backends, including MinIO, the popular k8s-native object storage suite. Let’s see how to set up K8ssandra and MinIO to back up Cassandra in just a few steps.
Like K8ssandra, MinIO can easily be deployed with Helm. Start by adding the MinIO chart repository:
helm repo add minio https://helm.min.io/
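If you had already added this repository earlier, refresh the chart index so the latest version is available (standard Helm usage):
helm repo update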
The MinIO Helm chart lets you do several things at once at install time:
- Set the credentials to access MinIO
- Create a default bucket for your backups
You can create a k8ssandra-medusa bucket, use minio_key/minio_secret as the credentials, and deploy MinIO in a new namespace called minio by running the following command:
helm install --set accessKey=minio_key,secretKey=minio_secret,defaultBucket.enabled=true,defaultBucket.name=k8ssandra-medusa minio minio/minio -n minio --create-namespace
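Equivalently, if you’d rather not pass credentials on the command line, the same settings can go into a values file. A minimal sketch using the chart values from the command above (minio-values.yaml is just a name chosen here):
# minio-values.yaml
accessKey: minio_key
secretKey: minio_secret
defaultBucket:
  enabled: true
  name: k8ssandra-medusa
Then install with:
helm install minio minio/minio -f minio-values.yaml -n minio --create-namespace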
After the helm install command completes, you should see something similar to this in the minio namespace:
$ kubectl get all -n minio
NAME                        READY   STATUS    RESTARTS   AGE
pod/minio-5fd4dd687-gzr8j   1/1     Running   0          109s

NAME            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/minio   ClusterIP   10.96.144.61   <none>        9000/TCP   109s

NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/minio   1/1     1            1           109s

NAME                              DESIRED   CURRENT   READY   AGE
replicaset.apps/minio-5fd4dd687   1         1         1       109s
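Optionally, you can inspect the bucket through the MinIO web UI by port-forwarding the service (standard kubectl usage; in this chart version the UI is served on the same port 9000, and you log in with the access/secret keys set above):
kubectl port-forward svc/minio 9000:9000 -n minio
Then open http://localhost:9000 in a browser.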
Now that MinIO is up and running, you can move on to Medusa. First, create a secret that gives Medusa access to the bucket. Create a medusa_secret.yaml file with the following content:
apiVersion: v1
kind: Secret
metadata:
  name: medusa-bucket-key
type: Opaque
stringData:
  # Note that this currently has to be set to medusa_s3_credentials!
  medusa_s3_credentials: |-
    [default]
    aws_access_key_id = minio_key
    aws_secret_access_key = minio_secret
Now apply the file in the k8ssandra namespace:
kubectl apply -f medusa_secret.yaml -n k8ssandra
You should now see the medusa-bucket-key secret:
$ kubectl get secrets -n k8ssandra
NAME                  TYPE                                  DATA   AGE
default-token-twk5w   kubernetes.io/service-account-token   3      4m49s
medusa-bucket-key     Opaque                                1      45s
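To double-check what Medusa will read, you can decode the secret’s contents (standard kubectl/base64 usage):
kubectl get secret medusa-bucket-key -n k8ssandra -o jsonpath='{.data.medusa_s3_credentials}' | base64 --decode
This should print back the [default] credentials block from the file above.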
You can then deploy Medusa. Add this block to your k8ssandra.yaml file and upgrade the deployment:
medusa:
  enabled: true
  storage: s3_compatible
  storage_properties:
    host: minio.minio.svc.cluster.local
    port: 9000
    secure: "False"
  bucketName: k8ssandra-medusa
  storageSecret: medusa-bucket-key
helm upgrade k8ssandra k8ssandra/k8ssandra -f k8ssandra.yaml
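Once the upgrade has rolled out, each Cassandra pod should include a Medusa container. As a quick sanity check, you can list the container names on one of the pods (standard kubectl usage) and look for medusa alongside cassandra:
kubectl get pod k8ssandra-dc1-default-sts-0 -n k8ssandra -o jsonpath='{.spec.containers[*].name}'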
Extract the username and password to access Cassandra into variables:
username=$(kubectl get secret k8ssandra-superuser -n k8ssandra -o jsonpath="{.data.username}" | base64 --decode)
password=$(kubectl get secret k8ssandra-superuser -n k8ssandra -o jsonpath="{.data.password}" | base64 --decode)
Connect through CQLSH on one of the nodes:
kubectl exec -it k8ssandra-dc1-default-sts-0 -n k8ssandra -c cassandra -- cqlsh -u $username -p $password
Copy and paste the following statements into the CQLSH prompt and press Enter:
CREATE KEYSPACE medusa_test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
USE medusa_test;
CREATE TABLE users (email TEXT PRIMARY KEY, name TEXT, state TEXT);
INSERT INTO users (email, name, state) VALUES ('alice@example.com', 'Alice Smith', 'TX');
INSERT INTO users (email, name, state) VALUES ('bob@example.com', 'Bob Jones', 'VA');
INSERT INTO users (email, name, state) VALUES ('carol@example.com', 'Carol Jackson', 'CA');
INSERT INTO users (email, name, state) VALUES ('david@example.com', 'David Yang', 'NV');
Check that the rows were properly inserted:
SELECT * FROM medusa_test.users;
 email             | name          | state
-------------------+---------------+-------
 alice@example.com |   Alice Smith |    TX
   bob@example.com |     Bob Jones |    VA
 david@example.com |    David Yang |    NV
 carol@example.com | Carol Jackson |    CA

(4 rows)
Now back up this data with the following command:
helm install my-backup k8ssandra/backup -n k8ssandra --set name=backup1,cassandraDatacenter.name=dc1
Since the backup operation is asynchronous, you can monitor its completion by running the following command:
kubectl get cassandrabackup backup1 -n k8ssandra -o jsonpath={.status.finishTime}
As long as this doesn’t output a date and time, the backup is still running. Given the small amount of data and the locally accessible backend, it should complete quickly.
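If you prefer to poll until it finishes, you can wrap the same command in watch, as done later for the restore:
watch -d "kubectl get cassandrabackup backup1 -n k8ssandra -o jsonpath={.status.finishTime}"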
TRUNCATE the table and verify it is empty:
kubectl exec -it k8ssandra-dc1-default-sts-0 -n k8ssandra -c cassandra -- cqlsh -u $username -p $password
TRUNCATE medusa_test.users;
SELECT * FROM medusa_test.users;
email | name | state
-------+------+-------
(0 rows)
Now restore the backup taken previously:
helm install restore-test k8ssandra/restore --set name=restore-backup1,backup.name=backup1,cassandraDatacenter.name=dc1
This operation takes a little longer, as it requires stopping the StatefulSet pods and performing the restore in init containers before the Cassandra containers can start. You can monitor progress with this command:
watch -d kubectl get cassandrarestore restore-backup1 -o jsonpath={.status} -n k8ssandra
The restore operation is fully completed once the finishTime value appears in the output:
{"finishTime":"2021-03-23T13:58:36Z","restoreKey":"83977399-44dd-4752-b4c4-407273f0339e","startTime":"2021-03-23T13:55:35Z"}
Check that you can read the data from the previously truncated table:
kubectl exec -it k8ssandra-dc1-default-sts-0 -n k8ssandra -c cassandra -- cqlsh -u $username -p $password
SELECT * FROM medusa_test.users;
 email             | name          | state
-------------------+---------------+-------
 alice@example.com |   Alice Smith |    TX
   bob@example.com |     Bob Jones |    VA
 david@example.com |    David Yang |    NV
 carol@example.com | Carol Jackson |    CA

(4 rows)
You’ve successfully restored your lost data in just a few commands!
Proceed to Step VIII.
Got questions? Ask us on the Discord chat or the community forum!