Smart clone fails with No PVC found (zfs-localpv CSI) #2524

Closed
mattlqx opened this issue Dec 31, 2022 · 5 comments

mattlqx commented Dec 31, 2022

What happened:

A DataVolume created with a PVC source (the PVC itself was created by the importer pod) that fits all the criteria for smart clone is failing with a "No PVC found" event. I'm using the zfs-localpv CSI driver.

  Type    Reason                           Age    From                   Message
  ----    ------                           ----   ----                   -------
  Normal  SnapshotForSmartCloneInProgress  9m13s  datavolume-controller  Creating snapshot for smart-clone is in progress (for pvc kubevirt/win11)
  Normal  SnapshotForSmartCloneInProgress  9m13s  datavolume-controller  No PVC found

The PVC in question does exist and is in the same namespace as the DV.

The ZFS operator does successfully create a snapshot, which shows as Ready:

# kubectl get -A zfssnapshots                   
NAMESPACE   NAME                                            AGE
openebs     snapshot-056e89a3-de8b-4dfa-81ab-058b303d129c   90s

But it's in a different namespace than the PVC to be cloned. Not sure if that's an issue here.

The DataVolume looks like this:

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  annotations:
    cdi.kubevirt.io/cloneType: snapshot
    cdi.kubevirt.io/storage.clone.token: XXXX
    cdi.kubevirt.io/storage.deleteAfterCompletion: "true"
  creationTimestamp: "2022-12-31T19:26:14Z"
  generation: 2
  labels:
    kubevirt.io/created-by: 5e514c25-f29e-47c4-b700-56958f13733b
    kubevirt.io/vm: vm-win11
  name: win11-bootvolume
  namespace: kubevirt
  ownerReferences:
  - apiVersion: kubevirt.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: VirtualMachine
    name: vm-win11
    uid: 5e514c25-f29e-47c4-b700-56958f13733b
  resourceVersion: "31951013"
  uid: 2922b650-1c07-45e7-8707-20b9f252d21c
spec:
  pvc:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 200Gi
    storageClassName: zfs-kubevirt
  source:
    pvc:
      name: win11
      namespace: kubevirt
status:
  conditions:
  - lastHeartbeatTime: "2022-12-31T19:26:14Z"
    lastTransitionTime: "2022-12-31T19:26:14Z"
    message: No PVC found
    reason: SnapshotForSmartCloneInProgress
    status: Unknown
    type: Bound
  - lastHeartbeatTime: "2022-12-31T19:26:14Z"
    lastTransitionTime: "2022-12-31T19:26:14Z"
    reason: SnapshotForSmartCloneInProgress
    status: "False"
    type: Ready
  - lastHeartbeatTime: "2022-12-31T19:26:14Z"
    lastTransitionTime: "2022-12-31T19:26:14Z"
    status: "False"
    type: Running
  phase: SnapshotForSmartCloneInProgress

What you expected to happen:
Smart clone to succeed when creating a VM with a DataVolumeTemplate that includes a DataVolume with a PVC source.

Environment:
CDI version (use kubectl get deployments cdi-deployment -o yaml): v1.55.0
Kubernetes version (use kubectl version): v1.24.7-eks-fb459a0
DV specification: N/A
Cloud provider or hardware configuration: AWS
OS (e.g. from /etc/os-release): Ubuntu 20.04
Kernel (e.g. uname -a): 5.15.0-1023-aws
Install tools: N/A
Others: N/A


akalenyu commented Jan 1, 2023

Hi, thanks for opening the issue!

Could you please also attach the volumesnapshot resource yaml in the kubevirt namespace, and the cdi-deployment-* pod logs?

I think "No PVC found" refers to the target PVC, which will not get created until the volumesnapshot is ready to be restored from.
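
Something like this should grab both (assuming CDI is installed in its default cdi namespace):

kubectl get volumesnapshot -n kubevirt -o yaml
kubectl logs -n cdi deployment/cdi-deployment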


mattlqx commented Jan 1, 2023

Thanks for the reply, sure!

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  annotations:
    cdi.kubevirt.io/ownedByDataVolume: kubevirt/win11-bootvolume
    k8s.io/SmartCloneRequest: "true"
  creationTimestamp: "2022-12-31T19:26:14Z"
  finalizers:
  - snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
  - snapshot.storage.kubernetes.io/volumesnapshot-bound-protection
  generation: 1
  labels:
    app: containerized-data-importer
    app.kubernetes.io/component: storage
    app.kubernetes.io/managed-by: cdi-controller
    app.kubernetes.io/part-of: hyperconverged-cluster
    app.kubernetes.io/version: 1.9.0
    cdi.kubevirt.io: cdi-smart-clone
  name: win11-bootvolume
  namespace: kubevirt
  ownerReferences:
  - apiVersion: cdi.kubevirt.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: DataVolume
    name: win11-bootvolume
    uid: 2922b650-1c07-45e7-8707-20b9f252d21c
  resourceVersion: "31951055"
  uid: 056e89a3-de8b-4dfa-81ab-058b303d129c
spec:
  source:
    persistentVolumeClaimName: win11
  volumeSnapshotClassName: zfspv-snapclass
status:
  boundVolumeSnapshotContentName: snapcontent-056e89a3-de8b-4dfa-81ab-058b303d129c
  creationTime: "2022-12-31T19:26:14Z"
  readyToUse: true
  restoreSize: "0"

cdi-deployment has some interesting lines (the other pods are silent):

{"level":"info","ts":1672616087.5180178,"logger":"controller.smartclone-controller","msg":"reconciling smart clone","VolumeSnapshot/PersistentVolumeClaim":"kubevirt/win11-bootvolume"}
{"level":"info","ts":1672616087.5181966,"logger":"controller.smartclone-controller","msg":"Reconciling snapshot","VolumeSnapshot/PersistentVolumeClaim":"kubevirt/win11-bootvolume","snapshot.Name":"win11-bootvolume","snapshot.Namespace":"kubevirt"}
{"level":"debug","ts":1672616087.5184731,"logger":"events","msg":"Normal","object":{"kind":"VolumeSnapshot","namespace":"kubevirt","name":"win11-bootvolume","uid":"056e89a3-de8b-4dfa-81ab-058b303d129c","apiVersion":"snapshot.storage.k8s.io/v1","resourceVersion":"31951055"},"reason":"SmartClonePVCInProgress","message":"Creating PVC for smart-clone is in progress (for pvc kubevirt/win11)"}
{"level":"error","ts":1672616087.5269384,"logger":"controller.smartclone-controller","msg":"error creating pvc from snapshot","VolumeSnapshot/PersistentVolumeClaim":"kubevirt/win11-bootvolume","error":"PersistentVolumeClaim \"win11-bootvolume\" is invalid: spec.resources[storage]: Invalid value: \"0\": must be greater than zero","stacktrace":"kubevirt.io/containerized-data-importer/pkg/controller.(*SmartCloneReconciler).Reconcile\n\tpkg/controller/smart-clone-controller.go:144\nkubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tvendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114\nkubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tvendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311\nkubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tvendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nkubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tvendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
{"level":"error","ts":1672616087.527086,"logger":"controller.smartclone-controller","msg":"Reconciler error","name":"win11-bootvolume","namespace":"kubevirt","error":"PersistentVolumeClaim \"win11-bootvolume\" is invalid: spec.resources[storage]: Invalid value: \"0\": must be greater than zero","stacktrace":"kubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tvendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nkubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tvendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}

Seems to be an issue with the size of the PVC it's trying to create. Looks like restoreSize should be the capacity of the snapshot? The underlying ZFSSnapshot does report one:

apiVersion: zfs.openebs.io/v1
kind: ZFSSnapshot
metadata:
  creationTimestamp: "2022-12-31T19:26:14Z"
  finalizers:
  - zfs.openebs.io/finalizer
  generation: 2
  labels:
    kubernetes.io/nodename: ip-XXXX.us-west-2.compute.internal
    openebs.io/persistent-volume: pvc-58a4a6a8-260a-4623-aa54-41ee39151658
  name: snapshot-056e89a3-de8b-4dfa-81ab-058b303d129c
  namespace: openebs
  resourceVersion: "31951034"
  uid: 6663c402-8505-444b-a375-212bb12c90fd
spec:
  capacity: "214748364800"
  compression: "on"
  dedup: "on"
  fsType: zfs
  ownerNodeID: ip-XXXX.us-west-2.compute.internal
  poolName: kubevirt
  recordsize: 256k
  volumeType: DATASET
status:
  state: Ready
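
For what it's worth, the zero restore size is visible directly on the VolumeSnapshot, even though the ZFSSnapshot above reports the full 200Gi (214748364800 bytes) capacity:

# kubectl get volumesnapshot -n kubevirt win11-bootvolume -o jsonpath='{.status.restoreSize}'
0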


akalenyu commented Jan 2, 2023

Yeah, apparently this specific CSI driver does not populate restoreSize, and hence CDI tries to create a PVC of size "0" and gets rejected.

There is an open PR tackling this on the zfs-localpv repo:
openebs/zfs-localpv#419

There might be something we can do on the CDI side if the restore size is 0; one option is using the target size that was specified in the target DV request. The magic happens around the PVC creation in pkg/controller/smart-clone-controller.go.
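
For illustration, a minimal Go sketch of that option (a hypothetical helper, not CDI's actual code; the snapshot client import path is indicative): if the driver-reported restoreSize is missing or zero, fall back to the storage request from the target DataVolume spec.

package clone

import (
	"k8s.io/apimachinery/pkg/api/resource"

	snapshotv1 "github.com/kubernetes-csi/external-snapshotter/client/v6/apis/volumesnapshot/v1"
)

// targetPVCSize picks the storage request for the PVC restored from a
// snapshot. dvRequested is spec.pvc.resources.requests[storage] from the
// target DataVolume (200Gi in this issue).
func targetPVCSize(snap *snapshotv1.VolumeSnapshot, dvRequested resource.Quantity) resource.Quantity {
	if snap.Status == nil || snap.Status.RestoreSize == nil || snap.Status.RestoreSize.IsZero() {
		// zfs-localpv (before openebs/zfs-localpv#419) reports restoreSize "0",
		// which the API server rejects; fall back to the DV's own request.
		return dvRequested
	}
	if snap.Status.RestoreSize.Cmp(dvRequested) > 0 {
		// Never request less than the snapshot says it needs for a restore.
		return *snap.Status.RestoreSize
	}
	return dvRequested
}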


aglitke commented Mar 27, 2023

Since this is an identified bug in a specific provisioner and not in CDI, we are closing this issue.

aglitke closed this as completed Mar 27, 2023
@akalenyu

We will probably play nicer with restoreSize missing/0 following #2679.
