During node decommission, the operator detected that a pod in the StatefulSet doesn't have enough free space to absorb data from the node being decommissioned.
Operator log:
2024-04-22T12:59:54.887Z ERROR Reconciler error {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"data","namespace":"NAMESPACE"}, "namespace": "NAMESPACE", "name": "data", "reconcileID": "25b7c6a7-a9dc-4fb3-9e7f-64a1c858c3ab", "error": "datacenter data is not in a valid state: Not enough free space available to decommission. k8ssandra-data-default-sts-5 has 1414103935512 free space, but 2202792832087 is needed."}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:326
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
Cluster status:
Status:
Cassandra Operator Progress: Updating
Conditions:
Last Transition Time: 2023-11-21T07:18:34Z
Message:
Reason:
Status: True
Type: Healthy
Last Transition Time: 2023-08-10T13:19:30Z
Message:
Reason:
Status: False
Type: Stopped
Last Transition Time: 2023-08-10T13:19:30Z
Message:
Reason:
Status: False
Type: ReplacingNodes
Last Transition Time: 2023-08-10T13:19:30Z
Message:
Reason:
Status: False
Type: Updating
Last Transition Time: 2023-08-10T13:19:30Z
Message:
Reason:
Status: False
Type: RollingRestart
Last Transition Time: 2023-08-10T13:19:30Z
Message:
Reason:
Status: False
Type: Resuming
Last Transition Time: 2024-03-28T13:42:51Z
Message:
Reason:
Status: True
Type: ScalingDown
Last Transition Time: 2024-03-28T13:42:52Z
Message: Not enough free space available to decommission. k8ssandra-data-default-sts-5 has 1414103935512 free space, but 2202792832087 is needed.
Reason: notEnoughSpaceToScaleDown
Status: False
Type: Valid
Last Transition Time: 2023-08-10T13:19:30Z
Message:
Reason:
Status: True
Type: Initialized
Last Transition Time: 2023-08-10T13:19:30Z
Message:
Reason:
Status: True
Type: Ready
We've increased the PVC size for all pods in the StatefulSet, but the operator doesn't revalidate the cluster:
kubectl -n NAMESPACE exec k8ssandra-data-default-sts-5 -c cassandra -- df -B1 /var/lib/cassandra
Filesystem 1B-blocks Used Available Use% Mounted on
/dev/nvme1n1 4755807707136 1840917585920 2914873344000 39% /var/lib/cassandra
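To make the mismatch concrete, here is a minimal sketch that compares the numbers in the operator's message with the `df` output above. The helper name `parse_space_message` is hypothetical and is not part of cass-operator; it only parses the message text quoted in this report.

```python
import re
from typing import Tuple


def parse_space_message(message: str) -> Tuple[str, int, int]:
    """Extract (pod, free_bytes, needed_bytes) from the operator's error text."""
    match = re.search(r"(\S+) has (\d+) free space, but (\d+) is needed", message)
    if match is None:
        raise ValueError("message does not look like a free-space validation error")
    return match.group(1), int(match.group(2)), int(match.group(3))


# Message copied from the Valid condition in the cluster status.
message = (
    "Not enough free space available to decommission. "
    "k8ssandra-data-default-sts-5 has 1414103935512 free space, "
    "but 2202792832087 is needed."
)
pod, free_before, needed = parse_space_message(message)

# "Available" column from the df -B1 output taken after the PVC resize.
available_after = 2914873344000

shortfall_before = needed - free_before    # bytes missing when the check failed
headroom_after = available_after - needed  # bytes to spare after the resize

print(f"{pod}: shortfall before resize = {shortfall_before} bytes")
print(f"{pod}: headroom after resize   = {headroom_after} bytes")
print("enough space now:", available_after >= needed)
```

With the values from this report, the volume now has more free space than the message says is needed, so a fresh validation should pass if it were run again.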
What happened?
Currently this cluster is effectively locked, as we can neither add a node to it nor remove one. It looks like the operator always stops at this step https://github.com/k8ssandra/cass-operator/blob/v1.14.0/pkg/reconciliation/reconcile_racks.go#L2293 and never proceeds to the step that updates the cluster status.
What did you expect to happen?
The operator revalidates the status of the cluster and decommissions the node.
How can we reproduce it (as minimally and precisely as possible)?
/var/lib/cassandra, e.g. fallocate -l SIZEG file_name
size in corresponding CassandraDatacenter object
cass-operator version
v1.14.0
Kubernetes version
Server Version: v1.23.2
Method of installation
Argo
Anything else we need to know?
No response