
VolumeGroupSnapshot deletion intermediate failures #1035

Closed
Madhu-1 opened this issue Mar 14, 2024 · 14 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

Madhu-1 (Contributor) commented Mar 14, 2024

What happened:

The VolumeGroupSnapshot deletion gets stuck because the VolumeSnapshotContent objects it references have already been deleted, so the lookup below fails:

if groupSnapshotContent.Status != nil && len(groupSnapshotContent.Status.VolumeSnapshotContentRefList) != 0 {
    for _, contentRef := range groupSnapshotContent.Status.VolumeSnapshotContentRefList {
        snapshotContent, err := ctrl.contentLister.Get(contentRef.Name)
        if err != nil {
            return fmt.Errorf("failed to get snapshot content %s from snapshot content store: %v", contentRef.Name, err)
        }
        snapshotIDs = append(snapshotIDs, *snapshotContent.Status.SnapshotHandle)
    }
}
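
One possible direction, sketched here purely as an illustration and not as the upstream fix: tolerate members whose VolumeSnapshotContent is already gone instead of failing the whole group deletion. This assumes the lister returns a standard NotFound from k8s.io/apimachinery/pkg/api/errors (referenced below as apierrors):

for _, contentRef := range groupSnapshotContent.Status.VolumeSnapshotContentRefList {
    snapshotContent, err := ctrl.contentLister.Get(contentRef.Name)
    if apierrors.IsNotFound(err) { // apierrors = "k8s.io/apimachinery/pkg/api/errors"
        // The member content has already been deleted; skip it so the
        // group snapshot deletion can still make progress.
        continue
    }
    if err != nil {
        return fmt.Errorf("failed to get snapshot content %s from snapshot content store: %v", contentRef.Name, err)
    }
    snapshotIDs = append(snapshotIDs, *snapshotContent.Status.SnapshotHandle)
}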

What you expected to happen:

The VolumeGroupSnapshot deletion should complete.

How to reproduce it:

It happens sometimes, not always:

  • Create a VolumeGroupSnapshot
  • Delete the VolumeGroupSnapshot (an illustrative manifest is sketched below)
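
For reference, an illustrative VolumeGroupSnapshot manifest; the name, namespace, and label selector are placeholders, and the class name is taken from the CephFS example that appears later in this thread:

apiVersion: groupsnapshot.storage.k8s.io/v1alpha1
kind: VolumeGroupSnapshot
metadata:
  name: new-groupsnapshot-demo-1
  namespace: default
spec:
  volumeGroupSnapshotClassName: csi-cephfsplugin-groupsnapclass
  source:
    selector:
      matchLabels:
        group: snapshot-test

Deleting it is then:

kubectl delete volumegroupsnapshot new-groupsnapshot-demo-1 -n default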

Anything else we need to know?:

Environment:

  • Driver version:
  • Kubernetes version (use kubectl version):
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

Logs

I0314 11:31:02.878590       1 connection.go:244] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0314 11:31:02.878604       1 connection.go:245] GRPC request: {}
I0314 11:31:02.881202       1 connection.go:251] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":5}}},{"Type":{"Rpc":{"type":9}}},{"Type":{"Rpc":{"type":7}}},{"Type":{"Rpc":{"type":13}}}]}
I0314 11:31:02.881426       1 connection.go:252] GRPC error: <nil>
I0314 11:31:02.881516       1 snapshot_controller.go:291] checkandUpdateContentStatusOperation: driver rook-ceph.cephfs.csi.ceph.com, snapshotId 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81, creationTime 0001-01-01 00:00:00 +0000 UTC, size 0, readyToUse true, groupSnapshotID 
I0314 11:31:02.881595       1 snapshot_controller.go:436] updateSnapshotContentStatus: updating VolumeSnapshotContent [snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2], snapshotHandle 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81, readyToUse true, createdAt 1710415862881587581, size 0, groupSnapshotID 
I0314 11:31:03.061944       1 request.go:629] Waited for 183.142202ms due to client-side throttling, not priority and fairness, request: POST:https://10.96.0.1:443/apis/snapshot.storage.k8s.io/v1/namespaces/default/volumesnapshots
I0314 11:31:03.077873       1 groupsnapshot_helper.go:631] updateSnapshotContentStatus: updating VolumeGroupSnapshotContent [groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4], groupSnapshotHandle 0001-0009-rook-ceph-0000000000000001-bddb800e-12ad-4138-914d-6b46974e41e7, readyToUse true, createdAt 1710415862818578349
I0314 11:31:03.262020       1 request.go:629] Waited for 380.321987ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/snapshot.storage.k8s.io/v1/volumesnapshotcontents/snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2
I0314 11:31:03.461552       1 request.go:629] Waited for 383.527063ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/groupsnapshot.storage.k8s.io/v1alpha1/volumegroupsnapshotcontents/groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4
I0314 11:31:03.662096       1 request.go:629] Waited for 396.451068ms due to client-side throttling, not priority and fairness, request: PUT:https://10.96.0.1:443/apis/snapshot.storage.k8s.io/v1/volumesnapshotcontents/snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2/status
I0314 11:31:03.673135       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15298
I0314 11:31:03.673577       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:03.673726       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:03.673739       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15298
I0314 11:31:03.673751       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:03.673784       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:03.861726       1 request.go:629] Waited for 395.824892ms due to client-side throttling, not priority and fairness, request: PUT:https://10.96.0.1:443/apis/groupsnapshot.storage.k8s.io/v1alpha1/volumegroupsnapshotcontents/groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4/status
I0314 11:31:03.863301       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:03.863408       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:03.863423       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15299
I0314 11:31:03.863434       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:03.863458       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:03.879845       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:04.061161       1 request.go:629] Waited for 181.449744ms due to client-side throttling, not priority and fairness, request: PATCH:https://10.96.0.1:443/apis/groupsnapshot.storage.k8s.io/v1alpha1/volumegroupsnapshotcontents/groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4
I0314 11:31:04.071738       1 groupsnapshot_helper.go:617] Removed VolumeGroupSnapshotBeingCreated annotation from volume group snapshot content groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4
I0314 11:31:04.071862       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15301
I0314 11:31:04.071895       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15301
I0314 11:31:04.071950       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:04.071984       1 util.go:246] storeObjectUpdate: ignoring groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" version 15300
I0314 11:31:04.072117       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:04.072207       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:04.072249       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15301
I0314 11:31:04.072267       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:04.072286       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] should be deleted.
I0314 11:31:04.261941       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:04.261994       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:04.262006       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15302
I0314 11:31:04.263501       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:04.263557       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:14.430929       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:14.430977       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.431004       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15316
I0314 11:31:14.431014       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.431037       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] should be deleted.
I0314 11:31:14.449621       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:14.449677       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.449696       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15317
I0314 11:31:14.449705       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.449752       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] should be deleted.
I0314 11:31:14.449768       1 groupsnapshot_helper.go:168] VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]: the policy is Delete
I0314 11:31:14.449778       1 groupsnapshot_helper.go:233] deleteCSISnapshotOperation [groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] started
I0314 11:31:14.454424       1 connection.go:244] GRPC call: /csi.v1.GroupController/DeleteVolumeGroupSnapshot
I0314 11:31:14.454446       1 connection.go:245] GRPC request: {"group_snapshot_id":"0001-0009-rook-ceph-0000000000000001-bddb800e-12ad-4138-914d-6b46974e41e7","secrets":"***stripped***","snapshot_ids":["0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81"]}
I0314 11:31:14.477950       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:14.477994       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:14.478012       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15319
I0314 11:31:14.478123       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:14.478180       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:14.493275       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:14.493330       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:14.493454       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15320
I0314 11:31:14.493506       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:14.493519       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:14.493553       1 snapshot_controller.go:62] VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]: the policy is Delete
I0314 11:31:14.493600       1 snapshot_controller.go:107] Deleting snapshot for content: snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2
I0314 11:31:14.493610       1 snapshot_controller.go:379] deleteCSISnapshotOperation [snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] started
I0314 11:31:14.499437       1 connection.go:244] GRPC call: /csi.v1.Controller/DeleteSnapshot
I0314 11:31:14.499514       1 connection.go:245] GRPC request: {"secrets":"***stripped***","snapshot_id":"0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81"}
I0314 11:31:14.500512       1 connection.go:251] GRPC response: {}
I0314 11:31:14.500534       1 connection.go:252] GRPC error: rpc error: code = Aborted desc = an operation with the given Snapshot ID 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81 already exists
E0314 11:31:14.500630       1 snapshot_controller_base.go:359] could not sync content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2": failed to delete snapshot "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2", err: failed to delete snapshot content snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2: "rpc error: code = Aborted desc = an operation with the given Snapshot ID 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81 already exists"
I0314 11:31:14.500646       1 snapshot_controller_base.go:230] Failed to sync content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2", will retry again: failed to delete snapshot "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2", err: failed to delete snapshot content snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2: "rpc error: code = Aborted desc = an operation with the given Snapshot ID 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81 already exists"
I0314 11:31:14.500733       1 event.go:364] Event(v1.ObjectReference{Kind:"VolumeSnapshotContent", Namespace:"", Name:"snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2", UID:"09c23a0d-206c-4256-99bc-500fe70514df", APIVersion:"snapshot.storage.k8s.io/v1", ResourceVersion:"15320", FieldPath:""}): type: 'Warning' reason: 'SnapshotDeleteError' Failed to delete snapshot
I0314 11:31:14.644781       1 connection.go:251] GRPC response: {}
I0314 11:31:14.644803       1 connection.go:252] GRPC error: <nil>
I0314 11:31:14.644831       1 groupsnapshot_helper.go:274] clearGroupSnapshotContentStatus content [groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.657400       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:14.658219       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15323
I0314 11:31:14.658273       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.658294       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] should be deleted.
I0314 11:31:14.658303       1 groupsnapshot_helper.go:168] VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]: the policy is Delete
I0314 11:31:14.667918       1 groupsnapshot_helper.go:223] Removed protection finalizer from volume group snapshot content groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4
I0314 11:31:14.667947       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15323
I0314 11:31:14.667975       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.667989       1 groupsnapshot_helper.go:160] group snapshot content "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" deleted
I0314 11:31:14.668023       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:14.668064       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.668161       1 groupsnapshot_helper.go:117] deletion of group snapshot content "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" was already processed
I0314 11:31:15.500744       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.500787       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15320
I0314 11:31:15.500800       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.500822       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:15.500830       1 snapshot_controller.go:62] VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]: the policy is Delete
I0314 11:31:15.500841       1 snapshot_controller.go:107] Deleting snapshot for content: snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2
I0314 11:31:15.500847       1 snapshot_controller.go:379] deleteCSISnapshotOperation [snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] started
I0314 11:31:15.505177       1 connection.go:244] GRPC call: /csi.v1.Controller/DeleteSnapshot
I0314 11:31:15.505201       1 connection.go:245] GRPC request: {"secrets":"***stripped***","snapshot_id":"0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81"}
I0314 11:31:15.514540       1 connection.go:251] GRPC response: {}
I0314 11:31:15.514573       1 connection.go:252] GRPC error: <nil>
I0314 11:31:15.514589       1 snapshot_controller.go:410] cleanVolumeSnapshotStatus content [snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.539718       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15327
I0314 11:31:15.539829       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:15.539874       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.539894       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:15.539909       1 snapshot_controller.go:62] VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]: the policy is Delete
I0314 11:31:15.558697       1 snapshot_controller.go:615] Removed protection finalizer from volume snapshot content snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2
I0314 11:31:15.558741       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15327
I0314 11:31:15.558776       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.558780       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:15.558798       1 snapshot_controller_base.go:369] content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" deleted
I0314 11:31:15.558816       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.558836       1 snapshot_controller_base.go:284] deletion of content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" was already processed
I0314 11:31:16.597420       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:31:42.612012       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:32:08.623907       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:32:34.648338       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:33:00.664440       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:33:26.673596       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:33:52.693577       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:34:02.522756       1 reflector.go:800] github.com/kubernetes-csi/external-snapshotter/client/v7/informers/externalversions/factory.go:142: Watch close - *v1.VolumeSnapshotContent total 15 items received
I0314 11:34:18.706758       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:34:44.723835       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:34:55.520666       1 reflector.go:800] github.com/kubernetes-csi/external-snapshotter/client/v7/informers/externalversions/factory.go:142: Watch close - *v1.VolumeSnapshotClass total 9 items received
I0314 11:35:10.733656       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:35:14.522928       1 reflector.go:800] github.com/kubernetes-csi/external-snapshotter/client/v7/informers/externalversions/factory.go:142: Watch close - *v1alpha1.VolumeGroupSnapshotContent total 175 items received
I0314 11:35:36.700702       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304]
I0314 11:35:36.700742       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304" with version 14478
I0314 11:35:36.700752       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304]
I0314 11:35:36.700772       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304] should be deleted.
I0314 11:35:36.700778       1 groupsnapshot_helper.go:168] VolumeGroupSnapshotContent[groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304]: the policy is Delete
I0314 11:35:36.700784       1 groupsnapshot_helper.go:233] deleteCSISnapshotOperation [groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304] started
E0314 11:35:36.704116       1 groupsnapshot_helper.go:149] could not sync group snapshot content "groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304": failed to get snapshot content snapcontent-dbb73839488e549fc8e7e6b61a908f3c29d31cc61598fd9b626ea64b4dbf365d-2024-03-14-11.13.17 from snapshot content store: volumesnapshotcontent.snapshot.storage.k8s.io "snapcontent-dbb73839488e549fc8e7e6b61a908f3c29d31cc61598fd9b626ea64b4dbf365d-2024-03-14-11.13.17" not found
I0314 11:35:36.704155       1 groupsnapshot_helper.go:71] Failed to sync group snapshot content "groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304", will retry again: failed to get snapshot content snapcontent-dbb73839488e549fc8e7e6b61a908f3c29d31cc61598fd9b626ea64b4dbf365d-2024-03-14-11.13.17 from snapshot content store: volumesnapshotcontent.snapshot.storage.k8s.io "snapcontent-dbb73839488e549fc8e7e6b61a908f3c29d31cc61598fd9b626ea64b4dbf365d-2024-03-14-11.13.17" not found
Madhu-1 (Contributor, Author) commented Mar 28, 2024

This should also be fixed by #1011, which records the VolumeGroupSnapshot name in the VolumeSnapshotContent status.

Madhu-1 closed this as completed Mar 28, 2024
Madhu-1 (Contributor, Author) commented Apr 10, 2024

It looks like this is not fixed yet; reopening.

Madhu-1 reopened this Apr 10, 2024
jedops commented May 15, 2024

Hello, I'm having some difficulty understanding the details of this issue. Would this issue present itself as a failure to delete a snapshot, or could it somehow accidentally delete a snapshot? Would somebody be kind enough to explain?

Thanks!

Madhu-1 (Contributor, Author) commented May 17, 2024

> Hello, I'm having some difficulty understanding the details of this issue. Would this issue present itself as a failure to delete a snapshot, or could it somehow accidentally delete a snapshot? Would somebody be kind enough to explain?
>
> Thanks!

@jedops the snapshots are deleted internally when the VolumeGroupSnapshot is deleted; I have provided steps to reproduce and some logs as well. As you can see, some checks are missing to skip already-deleted snapshots, or we need to reorder the steps for deleting the snapshots that were created as part of a VolumeGroupSnapshot.
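
For illustration only (this is not the controller's actual code): the Aborted error in the logs above arrives as a gRPC status, and a delete path could classify it as retryable so the per-snapshot delete is simply retried while the in-flight group delete finishes. The package and helper names here are hypothetical:

package snapshotutil

import (
    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/status"
)

// isRetryableDeleteError reports whether a DeleteSnapshot failure should be
// retried rather than treated as terminal. codes.Aborted is what the driver
// returns while another operation (here, the group delete) already holds the
// lock for the same snapshot ID.
func isRetryableDeleteError(err error) bool {
    st, ok := status.FromError(err)
    if !ok {
        return false // not a gRPC status error; treat as terminal
    }
    switch st.Code() {
    case codes.Aborted, codes.Unavailable:
        return true
    default:
        return false
    }
}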

k8s-triage-robot commented

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label Aug 15, 2024
yati1998 (Contributor) commented

Just an update: I tested the creation and deletion of a VolumeGroupSnapshot with the CephFS driver, and it seems to work fine. To re-confirm, I tried it again:

yatipadia:ceph-csi$ kubectl get volumegroupsnapshot
NAME                       READYTOUSE   VOLUMEGROUPSNAPSHOTCLASS          VOLUMEGROUPSNAPSHOTCONTENT                              CREATIONTIME   AGE
new-groupsnapshot-demo-1   true         csi-cephfsplugin-groupsnapclass   groupsnapcontent-b8b1c10d-5c07-47c3-bc36-42d4294628e4   5h47m          5h47m
yatipadia:ceph-csi$ kubectl delete volumegroupsnapshot new-groupsnapshot-demo-1
volumegroupsnapshot.groupsnapshot.storage.k8s.io "new-groupsnapshot-demo-1" deleted

yati1998 (Contributor) commented

Just an update: I tried the same with 10-11 PVCs, and the VolumeGroupSnapshot was successfully deleted.

yatipadia:Documents$ kubectl get volumesnapshotcontent
NAME                                                                                              READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER                          VOLUMESNAPSHOTCLASS   VOLUMESNAPSHOT                                                                                 VOLUMESNAPSHOTNAMESPACE   AGE
snapcontent-114d4ee02d9142894694e5f0d923333c1c840ec22baccf06bdef58d2d66bc1e1-2024-08-30-5.33.1    true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-114d4ee02d9142894694e5f0d923333c1c840ec22baccf06bdef58d2d66bc1e1-2024-08-30-5.33.1    default                   4m58s
snapcontent-1aacbcb8f80a5c913552ca17c2119725081584242e63e3e6a981a5e96ba95e94-2024-08-30-5.33.5    true         2147483648    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-1aacbcb8f80a5c913552ca17c2119725081584242e63e3e6a981a5e96ba95e94-2024-08-30-5.33.5    default                   4m53s
snapcontent-4a7d3f27fc5655a210c9dd2228c6d9f6db722446334ce9e0108f6f11640ffd75-2024-08-30-5.33.9    true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-4a7d3f27fc5655a210c9dd2228c6d9f6db722446334ce9e0108f6f11640ffd75-2024-08-30-5.33.9    default                   4m50s
snapcontent-55c4157c9dfeb50d176631205bd87afaa7d60059d64ac26096361f009197063b-2024-08-30-5.33.1    true         1073741824    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-55c4157c9dfeb50d176631205bd87afaa7d60059d64ac26096361f009197063b-2024-08-30-5.33.1    default                   4m58s
snapcontent-7d10cafc30b34e5adf253ed8b57da6d0b4718fda80a4e4a048d7e584a31e1e2b-2024-08-30-5.33.2    true         2147483648    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-7d10cafc30b34e5adf253ed8b57da6d0b4718fda80a4e4a048d7e584a31e1e2b-2024-08-30-5.33.2    default                   4m57s
snapcontent-817932568eb28072640eb89ecfca12ab4c4c7503589ba729e91dfc9efa1d50b6-2024-08-30-5.33.10   true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-817932568eb28072640eb89ecfca12ab4c4c7503589ba729e91dfc9efa1d50b6-2024-08-30-5.33.10   default                   4m49s
snapcontent-8dd24fbc3ffd96b1f879c2e8a92ed15edc364792186dd2740739cec1d7887365-2024-08-30-5.33.8    true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-8dd24fbc3ffd96b1f879c2e8a92ed15edc364792186dd2740739cec1d7887365-2024-08-30-5.33.8    default                   4m51s
snapcontent-96fa767049e8bc9e62e60af70e373323f5c23fb5aa971224894c67ad98c60e09-2024-08-30-5.33.11   true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-96fa767049e8bc9e62e60af70e373323f5c23fb5aa971224894c67ad98c60e09-2024-08-30-5.33.11   default                   4m47s
snapcontent-ac67c44cb87ea6968bd6075cc6a48981692fb38cf462c80794073db84a2a590b-2024-08-30-5.33.7    true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-ac67c44cb87ea6968bd6075cc6a48981692fb38cf462c80794073db84a2a590b-2024-08-30-5.33.7    default                   4m52s
snapcontent-d3fc2e9cdf6662234b820ddeabb8b32792ec17223020ca73da808d7487851779-2024-08-30-5.33.4    true         2147483648    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-d3fc2e9cdf6662234b820ddeabb8b32792ec17223020ca73da808d7487851779-2024-08-30-5.33.4    default                   4m55s
snapcontent-ef5edb9c4d1603d3a8688d52492cad28c5a6fec4bdb57dffc57f8f6c0e6dfae4-2024-08-30-5.33.3    true         1073741824    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-ef5edb9c4d1603d3a8688d52492cad28c5a6fec4bdb57dffc57f8f6c0e6dfae4-2024-08-30-5.33.3    default                   4m56s
yatipadia:Documents$ 
yatipadia:Documents$ kubectl get volumegroupsnapshot
NAME                       READYTOUSE   VOLUMEGROUPSNAPSHOTCLASS          VOLUMEGROUPSNAPSHOTCONTENT                              CREATIONTIME   AGE
new-groupsnapshot-demo-1   true         csi-cephfsplugin-groupsnapclass   groupsnapcontent-a767e7e9-46df-407e-b282-d0263d13e45e   5m8s           5m10s
yatipadia:Documents$ kubectl delete volumegroupsnapshot new-groupsnapshot-demo-1
volumegroupsnapshot.groupsnapshot.storage.k8s.io "new-groupsnapshot-demo-1" deleted
yatipadia:Documents$ kubectl get volumesnapshotcontent
No resources found
yatipadia:Documents$ kubectl get volumegroupsnapshot
No resources found in default namespace.
yatipadia:Documents$ kubectl get volumesnapshot
No resources found in default namespace.
yatipadia:Documents$ 

cc @Madhu-1

Madhu-1 (Contributor, Author) commented Sep 2, 2024

Good to hear we don't have this bug anymore; in that case we can close it.

yati1998 (Contributor) commented Sep 4, 2024

@Madhu-1 can you please close this issue as well?

Madhu-1 closed this as completed Sep 4, 2024
yati1998 (Contributor) commented

@Madhu-1 can you re-open this issue? We can use the same issue to track the bug.

k8s-triage-robot commented

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Oct 24, 2024
xing-yang (Collaborator) commented Oct 25, 2024

Can you still reproduce this problem? A snapshot content can't be deleted if the group snapshot handle is in its status. How did you clear the status?
If this problem is still reproducible, we should fix the problem so that this can't happen.
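
For context, a minimal sketch of the guard described above, assuming the v7 client API where VolumeSnapshotContentStatus carries an optional VolumeGroupSnapshotHandle; the package and helper names are hypothetical:

package snapshotutil

import (
    crdv1 "github.com/kubernetes-csi/external-snapshotter/client/v7/apis/volumesnapshot/v1"
)

// isMemberOfGroupSnapshot reports whether this content was created as part of
// a VolumeGroupSnapshot; such a content should not be deleted individually,
// since the group snapshot flow owns its deletion.
func isMemberOfGroupSnapshot(content *crdv1.VolumeSnapshotContent) bool {
    return content.Status != nil &&
        content.Status.VolumeGroupSnapshotHandle != nil &&
        *content.Status.VolumeGroupSnapshotHandle != ""
}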

yati1998 (Contributor) commented

> Can you still reproduce this problem? A snapshot content can't be deleted if the group snapshot handle is in its status. How did you clear the status? If this problem is still reproducible, we should fix the problem so that this can't happen.

We were able to reproduce this in one of our DR clusters and will share the output with you; currently that cluster is destroyed.

yati1998 (Contributor) commented Nov 4, 2024

> Can you still reproduce this problem? A snapshot content can't be deleted if the group snapshot handle is in its status. How did you clear the status? If this problem is still reproducible, we should fix the problem so that this can't happen.
>
> We were able to reproduce this in one of our DR clusters and will share the output with you; currently that cluster is destroyed.

After retesting with the correct version of the snapshotter, we were not able to reproduce this error. So I think we are good to close this issue.
