
Secondary storage class reports 'did not have enough free storage' #489

Closed
DeprecatedLuke opened this issue Nov 16, 2023 · 5 comments


@DeprecatedLuke

What steps did you take and what happened:
Creating a secondary storage class 'large' seems to be very broken. The node clearly shows that the capacity is available, but the scheduler reports 'did not have enough free storage'. This works just fine when using the 'standard' storage class.

Failing line:

  Warning  FailedScheduling  27m                 default-scheduler  0/5 nodes are available: 1 node(s) did not have enough free storage, 1 node(s) had untolerated taint {node.kubernetes.io/unreachable: }, 3 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling..

What did you expect to happen:
Pod to schedule normally.

The output of the following commands will help us better understand what's going on:

I1116 06:17:48.581747       1 grpc.go:72] GRPC call: /csi.v1.Controller/GetCapacity requests {"accessible_topology":{"segments":{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"s8","kubernetes.io/os":"linux","openebs.io/nodeid":"s8","openebs.io/nodename":"s8"}},"parameters":{"compression":"lz4","dedup":"off","fstype":"zfs","poolname":"k8s-pvs-sd1","recordsize":"128k","shared":"yes","thinprovision":"yes"},"volume_capabilities":[{"AccessType":{"Mount":{}},"access_mode":{}}]}
I1116 06:17:48.581794       1 grpc.go:81] GRPC response: {"available_capacity":7541961883648}
scale:0} d:{Dec:<nil>} s:1334457180Ki Format:BinarySI}} {Name:k8s-pvs-sd1 UUID:1404655791832034090 Free:{i:{value:7541961883648 scale:0} d:{Dec:<nil>} s: Format:BinarySI}}], required=[{Name:k8s-pvs UUID:3563441934539523479 Free:{i:{value:1366482853888 scale:0} d:{Dec:<nil>} s: Format:BinarySI}} {Name:k8s-pvs-sd1 UUID:1404655791832034090 Free:{i:{value:7541961883648 scale:0} d:{Dec:<nil>} s: Format:BinarySI}}]
I1116 06:21:44.988151       1 zfsnode.go:110] zfs node controller: updating node object with &{TypeMeta:{Kind: APIVersion:} ObjectMeta:{Name:s8 GenerateName: Namespace:openebs SelfLink: UID:bbbd5b66-052f-4eb0-9a1f-596a21ad3a06 ResourceVersion:74568961 Generation:18253 CreationTimestamp:2023-11-03 14:17:30 +0000 UTC DeletionTimestamp:<nil> DeletionGracePeriodSeconds:<nil> Labels:map[] Annotations:map[] OwnerReferences:[{APIVersion:v1 Kind:Node Name:s8 UID:68e775a0-ba1a-4766-954b-1c215656b119 Controller:0xc000220198 BlockOwnerDeletion:<nil>}] Finalizers:[] ManagedFields:[{Manager:zfs-driver Operation:Update APIVersion:zfs.openebs.io/v1 Time:2023-11-16 06:20:44 +0000 UTC FieldsType:FieldsV1 FieldsV1:{"f:metadata":{"f:ownerReferences":{".":{},"k:{\"uid\":\"68e775a0-ba1a-4766-954b-1c215656b119\"}":{}}},"f:pools":{}} Subresource:}]} Pools:[{Name:k8s-pvs UUID:3563441934539523479 Free:{i:{value:1366482853888 scale:0} d:{Dec:<nil>} s: Format:BinarySI}} {Name:k8s-pvs-sd1 UUID:1404655791832034090 Free:{i:{value:7541961883648 scale:0} d:{Dec:<nil>} s: Format:BinarySI}}]}
I1116 06:21:45.012038       1 zfsnode.go:114] zfs node controller: updated node object openebs/s8
I1116 06:21:45.013236       1 zfsnode.go:139] Got update event for zfs node openebs/s8
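
For what it's worth, the available_capacity of 7541961883648 bytes reported by GetCapacity for k8s-pvs-sd1 works out to roughly 6.86 TiB (7541961883648 / 1024^4), which is in the same ballpark as the 6.98T FREE shown by zpool list below, so the driver is clearly seeing the pool's free space.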

Anything else you would like to add:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    "storageclass.kubernetes.io/is-default-class": "true"
parameters:
  recordsize: "4k"
  compression: "lz4"
  dedup: "off"
  fstype: "zfs"
  thinprovision: "yes"
  poolname: "k8s-pvs"
  shared: "yes"
provisioner: zfs.csi.openebs.io
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: large
parameters:
  recordsize: "128k"
  compression: "lz4"
  dedup: "off"
  fstype: "zfs"
  thinprovision: "yes"
  poolname: "k8s-pvs-sd1"
  shared: "yes"
provisioner: zfs.csi.openebs.io
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
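
For reference, a minimal PVC and consuming Pod of the kind that triggers the event above; the names, image, and 10Gi size are illustrative assumptions, not taken from the original report:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: large-test-pvc          # illustrative name
spec:
  storageClassName: large       # the failing secondary StorageClass
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi             # illustrative size, far below the pool's free space
---
apiVersion: v1
kind: Pod
metadata:
  name: large-test-pod          # illustrative name
spec:
  containers:
    - name: app
      image: busybox            # illustrative image
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: large-test-pvc

With volumeBindingMode: WaitForFirstConsumer, the PVC stays Pending until the Pod is scheduled, which is when the FailedScheduling event above appears.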

Environment:

  • zfs-localpv helm chart 2.3.1
  • Kubernetes v1.28.2
  • kubeadm
  • Debian 12

zpool list:
NAME          SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
k8s-pvs      1.49T   208G  1.29T        -         -    11%    13%  1.00x    ONLINE  -
k8s-pvs-sd1  6.98T   672K  6.98T        -         -     0%     0%  1.00x    ONLINE  -
@DeprecatedLuke (Author)

Confirmed that this issue only occurs when using a custom nodeid value.

@hrudaya21 (Contributor)

> Confirmed that this issue only occurs when using a custom nodeid value.

@DeprecatedLuke Can you please share the YAML file where you are using the custom nodeid?

@DeprecatedLuke (Author)

apiVersion: v1
kind: Node
metadata:
  annotations:
    csi.volume.kubernetes.io/nodeid: '{"csi.tigera.io":"s11","zfs.csi.openebs.io":"s11"}'
    kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/cri-dockerd.sock
    node.alpha.kubernetes.io/ttl: "0"
    projectcalico.org/IPv4Address: 10.1.1.3/24
    projectcalico.org/IPv4VXLANTunnelAddr: 192.168.219.192
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  creationTimestamp: "2023-11-02T09:34:24Z"
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: s11
    kubernetes.io/os: linux
    ngxdev.com/virtualization-supported: "true"
    #
    # used to be s8, migrated to s11 via manifest magic since nodeid is broken.
    #
    openebs.io/nodeid: s11
    openebs.io/nodename: s11
    #
    #
  name: s11
  resourceVersion: "76744448"
  uid: cfb127c5-863b-4d98-bfa0-3124e4f87f0b
spec:
  podCIDR: 192.168.3.0/24
  podCIDRs:
  - 192.168.3.0/24
status:
  addresses:
  - address: 10.1.1.3
    type: InternalIP
  - address: s11
    type: Hostname
  allocatable:
    cpu: "32"
    ephemeral-storage: "242364885410"
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 131475896Ki
    pods: "110"
  capacity:
    cpu: "32"
    ephemeral-storage: 262982732Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 131578296Ki
    pods: "110"
  conditions:
  - lastHeartbeatTime: "2023-11-16T01:21:43Z"
    lastTransitionTime: "2023-11-16T01:21:43Z"
    message: Calico is running on this node
    reason: CalicoIsUp
    status: "False"
    type: NetworkUnavailable
  - lastHeartbeatTime: "2023-11-23T01:43:04Z"
    lastTransitionTime: "2023-11-16T01:21:36Z"
    message: kubelet has sufficient memory available
    reason: KubeletHasSufficientMemory
    status: "False"
    type: MemoryPressure
  - lastHeartbeatTime: "2023-11-23T01:43:04Z"
    lastTransitionTime: "2023-11-16T01:21:36Z"
    message: kubelet has no disk pressure
    reason: KubeletHasNoDiskPressure
    status: "False"
    type: DiskPressure
  - lastHeartbeatTime: "2023-11-23T01:43:04Z"
    lastTransitionTime: "2023-11-16T01:21:36Z"
    message: kubelet has sufficient PID available
    reason: KubeletHasSufficientPID
    status: "False"
    type: PIDPressure
  - lastHeartbeatTime: "2023-11-23T01:43:04Z"
    lastTransitionTime: "2023-11-16T01:21:36Z"
    message: kubelet is posting ready status. AppArmor enabled
    reason: KubeletReady
    status: "True"
    type: Ready
  daemonEndpoints:
    kubeletEndpoint:
      Port: 10250
  nodeInfo:
    architecture: amd64
    bootID: cce44334-2a88-456a-839c-7e8aec3bcd41
    containerRuntimeVersion: docker://24.0.7
    kernelVersion: 6.1.0-13-amd64
    kubeProxyVersion: v1.28.2
    kubeletVersion: v1.28.2
    machineID: ebb07cd77321441c8ac57ec146518fd0
    operatingSystem: linux
    osImage: Debian GNU/Linux 12 (bookworm)
    systemUUID: d7f6388e-5ec1-11ee-8742-aec0b28c3900
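
The openebs.io/nodeid: s11 label above is the custom node id in question; it is the same key that shows up in the GetCapacity topology segments earlier. Assuming it was applied as an ordinary node label, setting it would look something like:

kubectl label node s11 openebs.io/nodeid=s11 --overwrite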

@sinhaashish (Member)

Can you try with the latest master now that PR #451 is merged?

@DeprecatedLuke (Author)

As I mentioned before, I don't have this configuration set up anymore, as I migrated the PV node ids by modifying the ZFSVolume resources and recreating the PVCs. But I can assume this is fixed, since it was most likely mapping to a node which was no longer online.
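
For anyone following along, that migration is roughly the following kind of edit; a sketch, assuming the ZFSVolume's spec.ownerNodeID field is what pins a volume to a node (the volume name is an illustrative placeholder):

# list the ZFSVolume resources backing the PVs
kubectl -n openebs get zfsvolumes
# repoint a volume from the old node (s8) to the new one (s11)
kubectl -n openebs patch zfsvolume pvc-example --type merge -p '{"spec":{"ownerNodeID":"s11"}}'

The PVCs were then recreated, as mentioned above.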
