INCIDENT-001: postgres doesn't have a point of recoverability, so it might fail when there are no replicas left #72

Merged · 4 commits · Aug 16, 2024
10 changes: 10 additions & 0 deletions services/databases/postgresql/cnpg-backup-secrets.yaml
@@ -0,0 +1,10 @@
---
apiVersion: v1
kind: Secret
metadata:
name: cnpg-backup-secret
namespace: pg
type: Opaque
stringData:
ACCESS_KEY_ID: <FILL-IN>
ACCESS_SECRET_KEY: <FILL-IN>
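The `<FILL-IN>` placeholders are meant to be replaced at deploy time rather than committed. A minimal sketch (the variable names and the sed-based rendering step are assumptions, not part of this PR) that substitutes them from environment variables so real R2 keys never land in git:

```shell
#!/bin/sh
# Hypothetical render step: substitute the <FILL-IN> placeholders from
# environment variables before applying the manifest.
template='ACCESS_KEY_ID: <FILL-IN>
ACCESS_SECRET_KEY: <FILL-IN>'
ACCESS_KEY_ID=example-key        # in practice: exported from your shell, not hardcoded
ACCESS_SECRET_KEY=example-secret
rendered=$(printf '%s\n' "$template" \
  | sed -e "s|ACCESS_KEY_ID: <FILL-IN>|ACCESS_KEY_ID: $ACCESS_KEY_ID|" \
        -e "s|ACCESS_SECRET_KEY: <FILL-IN>|ACCESS_SECRET_KEY: $ACCESS_SECRET_KEY|")
printf '%s\n' "$rendered"
```

The rendered output could then be piped to `kubectl apply -f -` instead of editing the file in place.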
39 changes: 37 additions & 2 deletions services/databases/postgresql/cnpg-cluster.yaml
@@ -3,6 +3,7 @@ apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: cnpg-cluster
namespace: pg
spec:
instances: 3

@@ -35,5 +36,39 @@ spec:
name: sinf-website-2023-secret

storage:
-    size: 10Gi
-    storageClass: longhorn-strict-local-retain
+    size: 20Gi
+    # Backups are handled by CloudNativePG, not by Longhorn
+    storageClass: longhorn-strict-local-no-backup

postgresql:
parameters:
max_slot_wal_keep_size: "10GB"

backup:
barmanObjectStore:
destinationPath: s3://niployments-postgres-backup/
endpointURL: https://52d22ed664e31a094229250acd87ccfb.eu.r2.cloudflarestorage.com
s3Credentials:
accessKeyId:
name: cnpg-backup-secret
key: ACCESS_KEY_ID
secretAccessKey:
name: cnpg-backup-secret
key: ACCESS_SECRET_KEY
wal:
compression: gzip
retentionPolicy: "15d"
---
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
name: cluster-backup-object-store
namespace: pg
spec:
cluster:
name: cnpg-cluster
method: barmanObjectStore
# Run at midnight on Sundays, Tuesdays, and Thursdays (seconds-first cron)
schedule: '0 0 0 * * 0,2,4'
backupOwnerReference: cluster
immediate: true
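CNPG schedules use a six-field cron expression with a leading seconds field, so `0 0 0 * * 0,2,4` fires at 00:00:00 on those weekdays. If an out-of-band backup is ever needed against the same object store, a one-off `Backup` resource can be applied by hand (the resource name here is illustrative, not part of this PR):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: manual-backup-example   # illustrative name
  namespace: pg
spec:
  method: barmanObjectStore
  cluster:
    name: cnpg-cluster
```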
4 changes: 4 additions & 0 deletions services/databases/postgresql/cnpg-secrets.yaml
@@ -6,6 +6,7 @@ stringData:
kind: Secret
metadata:
name: tts-secret
namespace: pg
type: kubernetes.io/basic-auth
---
apiVersion: v1
@@ -15,6 +16,7 @@ stringData:
kind: Secret
metadata:
name: ni-secret
namespace: pg
type: kubernetes.io/basic-auth
---
apiVersion: v1
@@ -24,6 +26,7 @@ stringData:
kind: Secret
metadata:
name: plausible-secret
namespace: pg
type: kubernetes.io/basic-auth
---
apiVersion: v1
@@ -33,4 +36,5 @@ stringData:
kind: Secret
metadata:
name: sinf-website-2023-secret
namespace: pg
type: kubernetes.io/basic-auth
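The diff only shows the added `namespace` lines; each of these secrets follows the `kubernetes.io/basic-auth` Secret type, which requires `username` and `password` keys. A complete entry would look like the following sketch (name and credential values are made up for illustration):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: example-app-secret   # illustrative; the real names are tts-secret, ni-secret, etc.
  namespace: pg
type: kubernetes.io/basic-auth
stringData:
  username: example-app      # hypothetical values
  password: change-me
```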
3 changes: 2 additions & 1 deletion services/databases/postgresql/deploy-cnpg-dev.sh
@@ -7,10 +7,11 @@ port=5432 # Define the desired port here
cnpg_dir='./services/databases/postgresql'
pods=$(cat $cnpg_dir/cnpg-cluster.yaml | awk '{if ($1 == "instances:") print $2}')

-kubectl apply --server-side -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.22/releases/cnpg-1.22.2.yaml
+kubectl apply --server-side --force-conflicts -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.23/releases/cnpg-1.23.2.yaml
kubectl wait --for=condition=available=true -n cnpg-system deployment/cnpg-controller-manager --timeout=120s

kubectl create namespace pg
kubectl apply -f $(dirname $0)/cnpg-backup-secrets.yaml -n pg
kubectl apply -f $(dirname $0)/cnpg-secrets.yaml -n pg
kubectl apply -f $(dirname $0)/cnpg-cluster.yaml -n pg
sleep 5 # Wait a little bit for first pod to be created
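The `pods=$(... awk ...)` line above pulls the replica count straight out of the cluster manifest by printing the token after `instances:`. A self-contained sketch of just that extraction (the inline manifest is a stand-in for `cnpg-cluster.yaml`):

```shell
#!/bin/sh
# Stand-in for cnpg-cluster.yaml, trimmed to the relevant lines.
manifest='apiVersion: postgresql.cnpg.io/v1
kind: Cluster
spec:
  instances: 3'
# Same awk program as the deploy scripts: print the value after "instances:".
pods=$(printf '%s\n' "$manifest" | awk '{if ($1 == "instances:") print $2}')
echo "$pods"   # prints 3
```

Note this is a plain-text match, not a YAML parse: it would also match an `instances:` key anywhere else in the file.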
5 changes: 4 additions & 1 deletion services/databases/postgresql/deploy-cnpg-prod.sh
@@ -4,10 +4,13 @@

pods=$(cat $(dirname $0)/cnpg-cluster.yaml | awk '{if ($1 == "instances:") print $2}')

-kubectl apply --server-side -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.22/releases/cnpg-1.22.2.yaml
+# NOTE(luisd): https://cloudnative-pg.io/documentation/1.23/installation_upgrade/#server-side-apply-of-manifests
+# They recommend --force-conflicts because such conflicts might happen when upgrading the controller.
+kubectl apply --server-side --force-conflicts -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.23/releases/cnpg-1.23.2.yaml
kubectl wait --for=condition=available=true -n cnpg-system deployment/cnpg-controller-manager --timeout=120s

kubectl create namespace pg
kubectl apply -f $(dirname $0)/cnpg-backup-secrets.yaml -n pg
kubectl apply -f $(dirname $0)/cnpg-secrets.yaml -n pg
kubectl apply -f $(dirname $0)/cnpg-cluster.yaml -n pg
sleep 5 # Wait a little bit for first pod to be created
@@ -0,0 +1,17 @@
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: longhorn-strict-local-no-backup
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: "Delete"
volumeBindingMode: Immediate
parameters:
numberOfReplicas: "1"
staleReplicaTimeout: "720"
fromBackup: ""
fsType: "ext4"
dataLocality: "strict-local"
replicaAutoBalance: "ignored"
# diskSelector: "ssd,fast"
# nodeSelector: "storage,fast"
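With `numberOfReplicas: "1"` and `dataLocality: "strict-local"`, Longhorn keeps a single replica pinned to the node running the pod; redundancy comes from the three CNPG instances rather than from Longhorn. A PersistentVolumeClaim would opt into this class as sketched below (the claim name is illustrative; in practice CNPG provisions the cluster's PVCs itself via the `storageClass` setting in `cnpg-cluster.yaml`):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc          # illustrative name
  namespace: pg
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: longhorn-strict-local-no-backup
  resources:
    requests:
      storage: 20Gi
```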