Recovering from node failures #10745
Hello, my team and I were simulating a node failure by cordoning a node and force-killing the Kafka broker pod on it, but Kubernetes was unable to schedule the pod on other nodes because of the PVC. The KafkaNodePool is configured to use a storage class that relies on local persistent volumes, which have a node affinity. Snippet of
Workaround:
In the event of an actual node failure, we would not want to delete the PVC until the cluster has recovered and all partitions are fully replicated, so that we can still recover data from the disk if necessary. However, we weren't able to reschedule the broker pod on another node without first deleting the PVC. What is the recommended action in this situation? *I saw a similar thread (though the brokers there used ephemeral storage)
Replies: 1 comment 2 replies
You have to delete the PVC. However, the PVC only provides a link between the Pod and the PV, so deleting the PVC does not delete your PV or your data. If you configure your storage class to retain PVs when their PVCs are deleted, you keep the PV and the data and can use them if needed.
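As a rough illustration of the retain setting mentioned above, a storage class could look something like the sketch below. The name `local-retain` and the provisioner `example.com/local-provisioner` are hypothetical placeholders, not values from this thread; substitute whatever your local-volume setup actually uses.

```yaml
# Minimal sketch of a StorageClass that keeps PVs after their PVCs are deleted.
# Assumption: the name and provisioner are placeholders for illustration only.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-retain                          # hypothetical name
provisioner: example.com/local-provisioner    # hypothetical provisioner
# Retain: when the PVC is deleted, the PV moves to "Released" and its data stays on disk.
reclaimPolicy: Retain
# Delay binding until a pod is scheduled, which is the usual choice for node-local volumes.
volumeBindingMode: WaitForFirstConsumer
```

Note that the storage class `reclaimPolicy` only applies to PVs that are dynamically provisioned through that class. For PVs that already exist, or that were created manually, the policy lives on the PV itself in `spec.persistentVolumeReclaimPolicy`, and it is worth checking with `kubectl get pv` that it says `Retain` before deleting any PVC.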