
VolumeGroupSnapshot deletion intermediate failures #1035

Closed
Madhu-1 opened this issue Mar 14, 2024 · 14 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

Madhu-1 (Contributor) commented Mar 14, 2024

What happened:

The VolumeGroupSnapshot deletion gets stuck because the VolumeSnapshotContent objects it references have already been deleted, so the lookup below fails:

if groupSnapshotContent.Status != nil && len(groupSnapshotContent.Status.VolumeSnapshotContentRefList) != 0 {
    for _, contentRef := range groupSnapshotContent.Status.VolumeSnapshotContentRefList {
        snapshotContent, err := ctrl.contentLister.Get(contentRef.Name)
        if err != nil {
            return fmt.Errorf("failed to get snapshot content %s from snapshot content store: %v", contentRef.Name, err)
        }
        snapshotIDs = append(snapshotIDs, *snapshotContent.Status.SnapshotHandle)
    }
}
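
One possible direction, sketched here purely as an illustration and not as the upstream fix: tolerate members whose VolumeSnapshotContent is already gone instead of failing the whole group deletion. This assumes the lister returns a standard NotFound from k8s.io/apimachinery/pkg/api/errors (referenced below as apierrors):

for _, contentRef := range groupSnapshotContent.Status.VolumeSnapshotContentRefList {
    snapshotContent, err := ctrl.contentLister.Get(contentRef.Name)
    if apierrors.IsNotFound(err) { // apierrors = "k8s.io/apimachinery/pkg/api/errors"
        // The member content has already been deleted; skip it so the
        // group snapshot deletion can still make progress.
        continue
    }
    if err != nil {
        return fmt.Errorf("failed to get snapshot content %s from snapshot content store: %v", contentRef.Name, err)
    }
    snapshotIDs = append(snapshotIDs, *snapshotContent.Status.SnapshotHandle)
}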

What you expected to happen:

The VolumeGroupSnapshot deletion should complete.

How to reproduce it:

It happens sometimes, not always:

  • Create a VolumeGroupSnapshot
  • Delete the VolumeGroupSnapshot (an illustrative manifest is sketched below)
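
For reference, an illustrative VolumeGroupSnapshot manifest; the name, namespace, and label selector are placeholders, and the class name is taken from the CephFS example that appears later in this thread:

apiVersion: groupsnapshot.storage.k8s.io/v1alpha1
kind: VolumeGroupSnapshot
metadata:
  name: new-groupsnapshot-demo-1
  namespace: default
spec:
  volumeGroupSnapshotClassName: csi-cephfsplugin-groupsnapclass
  source:
    selector:
      matchLabels:
        group: snapshot-test

Deleting it is then:

kubectl delete volumegroupsnapshot new-groupsnapshot-demo-1 -n default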

Anything else we need to know?:

Environment:

  • Driver version:
  • Kubernetes version (use kubectl version):
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

Logs

I0314 11:31:02.878590       1 connection.go:244] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0314 11:31:02.878604       1 connection.go:245] GRPC request: {}
I0314 11:31:02.881202       1 connection.go:251] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":5}}},{"Type":{"Rpc":{"type":9}}},{"Type":{"Rpc":{"type":7}}},{"Type":{"Rpc":{"type":13}}}]}
I0314 11:31:02.881426       1 connection.go:252] GRPC error: <nil>
I0314 11:31:02.881516       1 snapshot_controller.go:291] checkandUpdateContentStatusOperation: driver rook-ceph.cephfs.csi.ceph.com, snapshotId 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81, creationTime 0001-01-01 00:00:00 +0000 UTC, size 0, readyToUse true, groupSnapshotID 
I0314 11:31:02.881595       1 snapshot_controller.go:436] updateSnapshotContentStatus: updating VolumeSnapshotContent [snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2], snapshotHandle 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81, readyToUse true, createdAt 1710415862881587581, size 0, groupSnapshotID 
I0314 11:31:03.061944       1 request.go:629] Waited for 183.142202ms due to client-side throttling, not priority and fairness, request: POST:https://10.96.0.1:443/apis/snapshot.storage.k8s.io/v1/namespaces/default/volumesnapshots
I0314 11:31:03.077873       1 groupsnapshot_helper.go:631] updateSnapshotContentStatus: updating VolumeGroupSnapshotContent [groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4], groupSnapshotHandle 0001-0009-rook-ceph-0000000000000001-bddb800e-12ad-4138-914d-6b46974e41e7, readyToUse true, createdAt 1710415862818578349
I0314 11:31:03.262020       1 request.go:629] Waited for 380.321987ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/snapshot.storage.k8s.io/v1/volumesnapshotcontents/snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2
I0314 11:31:03.461552       1 request.go:629] Waited for 383.527063ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/groupsnapshot.storage.k8s.io/v1alpha1/volumegroupsnapshotcontents/groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4
I0314 11:31:03.662096       1 request.go:629] Waited for 396.451068ms due to client-side throttling, not priority and fairness, request: PUT:https://10.96.0.1:443/apis/snapshot.storage.k8s.io/v1/volumesnapshotcontents/snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2/status
I0314 11:31:03.673135       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15298
I0314 11:31:03.673577       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:03.673726       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:03.673739       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15298
I0314 11:31:03.673751       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:03.673784       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:03.861726       1 request.go:629] Waited for 395.824892ms due to client-side throttling, not priority and fairness, request: PUT:https://10.96.0.1:443/apis/groupsnapshot.storage.k8s.io/v1alpha1/volumegroupsnapshotcontents/groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4/status
I0314 11:31:03.863301       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:03.863408       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:03.863423       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15299
I0314 11:31:03.863434       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:03.863458       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:03.879845       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:04.061161       1 request.go:629] Waited for 181.449744ms due to client-side throttling, not priority and fairness, request: PATCH:https://10.96.0.1:443/apis/groupsnapshot.storage.k8s.io/v1alpha1/volumegroupsnapshotcontents/groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4
I0314 11:31:04.071738       1 groupsnapshot_helper.go:617] Removed VolumeGroupSnapshotBeingCreated annotation from volume group snapshot content groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4
I0314 11:31:04.071862       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15301
I0314 11:31:04.071895       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15301
I0314 11:31:04.071950       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:04.071984       1 util.go:246] storeObjectUpdate: ignoring groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" version 15300
I0314 11:31:04.072117       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:04.072207       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:04.072249       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15301
I0314 11:31:04.072267       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:04.072286       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] should be deleted.
I0314 11:31:04.261941       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:04.261994       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:04.262006       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15302
I0314 11:31:04.263501       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:04.263557       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:14.430929       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:14.430977       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.431004       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15316
I0314 11:31:14.431014       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.431037       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] should be deleted.
I0314 11:31:14.449621       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:14.449677       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.449696       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15317
I0314 11:31:14.449705       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.449752       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] should be deleted.
I0314 11:31:14.449768       1 groupsnapshot_helper.go:168] VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]: the policy is Delete
I0314 11:31:14.449778       1 groupsnapshot_helper.go:233] deleteCSISnapshotOperation [groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] started
I0314 11:31:14.454424       1 connection.go:244] GRPC call: /csi.v1.GroupController/DeleteVolumeGroupSnapshot
I0314 11:31:14.454446       1 connection.go:245] GRPC request: {"group_snapshot_id":"0001-0009-rook-ceph-0000000000000001-bddb800e-12ad-4138-914d-6b46974e41e7","secrets":"***stripped***","snapshot_ids":["0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81"]}
I0314 11:31:14.477950       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:14.477994       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:14.478012       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15319
I0314 11:31:14.478123       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:14.478180       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:14.493275       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:14.493330       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:14.493454       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15320
I0314 11:31:14.493506       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:14.493519       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:14.493553       1 snapshot_controller.go:62] VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]: the policy is Delete
I0314 11:31:14.493600       1 snapshot_controller.go:107] Deleting snapshot for content: snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2
I0314 11:31:14.493610       1 snapshot_controller.go:379] deleteCSISnapshotOperation [snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] started
I0314 11:31:14.499437       1 connection.go:244] GRPC call: /csi.v1.Controller/DeleteSnapshot
I0314 11:31:14.499514       1 connection.go:245] GRPC request: {"secrets":"***stripped***","snapshot_id":"0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81"}
I0314 11:31:14.500512       1 connection.go:251] GRPC response: {}
I0314 11:31:14.500534       1 connection.go:252] GRPC error: rpc error: code = Aborted desc = an operation with the given Snapshot ID 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81 already exists
E0314 11:31:14.500630       1 snapshot_controller_base.go:359] could not sync content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2": failed to delete snapshot "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2", err: failed to delete snapshot content snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2: "rpc error: code = Aborted desc = an operation with the given Snapshot ID 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81 already exists"
I0314 11:31:14.500646       1 snapshot_controller_base.go:230] Failed to sync content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2", will retry again: failed to delete snapshot "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2", err: failed to delete snapshot content snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2: "rpc error: code = Aborted desc = an operation with the given Snapshot ID 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81 already exists"
I0314 11:31:14.500733       1 event.go:364] Event(v1.ObjectReference{Kind:"VolumeSnapshotContent", Namespace:"", Name:"snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2", UID:"09c23a0d-206c-4256-99bc-500fe70514df", APIVersion:"snapshot.storage.k8s.io/v1", ResourceVersion:"15320", FieldPath:""}): type: 'Warning' reason: 'SnapshotDeleteError' Failed to delete snapshot
I0314 11:31:14.644781       1 connection.go:251] GRPC response: {}
I0314 11:31:14.644803       1 connection.go:252] GRPC error: <nil>
I0314 11:31:14.644831       1 groupsnapshot_helper.go:274] clearGroupSnapshotContentStatus content [groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.657400       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:14.658219       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15323
I0314 11:31:14.658273       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.658294       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] should be deleted.
I0314 11:31:14.658303       1 groupsnapshot_helper.go:168] VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]: the policy is Delete
I0314 11:31:14.667918       1 groupsnapshot_helper.go:223] Removed protection finalizer from volume group snapshot content groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4
I0314 11:31:14.667947       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15323
I0314 11:31:14.667975       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.667989       1 groupsnapshot_helper.go:160] group snapshot content "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" deleted
I0314 11:31:14.668023       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:14.668064       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.668161       1 groupsnapshot_helper.go:117] deletion of group snapshot content "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" was already processed
I0314 11:31:15.500744       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.500787       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15320
I0314 11:31:15.500800       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.500822       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:15.500830       1 snapshot_controller.go:62] VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]: the policy is Delete
I0314 11:31:15.500841       1 snapshot_controller.go:107] Deleting snapshot for content: snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2
I0314 11:31:15.500847       1 snapshot_controller.go:379] deleteCSISnapshotOperation [snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] started
I0314 11:31:15.505177       1 connection.go:244] GRPC call: /csi.v1.Controller/DeleteSnapshot
I0314 11:31:15.505201       1 connection.go:245] GRPC request: {"secrets":"***stripped***","snapshot_id":"0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81"}
I0314 11:31:15.514540       1 connection.go:251] GRPC response: {}
I0314 11:31:15.514573       1 connection.go:252] GRPC error: <nil>
I0314 11:31:15.514589       1 snapshot_controller.go:410] cleanVolumeSnapshotStatus content [snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.539718       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15327
I0314 11:31:15.539829       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:15.539874       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.539894       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:15.539909       1 snapshot_controller.go:62] VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]: the policy is Delete
I0314 11:31:15.558697       1 snapshot_controller.go:615] Removed protection finalizer from volume snapshot content snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2
I0314 11:31:15.558741       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15327
I0314 11:31:15.558776       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.558780       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:15.558798       1 snapshot_controller_base.go:369] content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" deleted
I0314 11:31:15.558816       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.558836       1 snapshot_controller_base.go:284] deletion of content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" was already processed
I0314 11:31:16.597420       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:31:42.612012       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:32:08.623907       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:32:34.648338       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:33:00.664440       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:33:26.673596       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:33:52.693577       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:34:02.522756       1 reflector.go:800] github.com/kubernetes-csi/external-snapshotter/client/v7/informers/externalversions/factory.go:142: Watch close - *v1.VolumeSnapshotContent total 15 items received
I0314 11:34:18.706758       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:34:44.723835       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:34:55.520666       1 reflector.go:800] github.com/kubernetes-csi/external-snapshotter/client/v7/informers/externalversions/factory.go:142: Watch close - *v1.VolumeSnapshotClass total 9 items received
I0314 11:35:10.733656       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:35:14.522928       1 reflector.go:800] github.com/kubernetes-csi/external-snapshotter/client/v7/informers/externalversions/factory.go:142: Watch close - *v1alpha1.VolumeGroupSnapshotContent total 175 items received
I0314 11:35:36.700702       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304]
I0314 11:35:36.700742       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304" with version 14478
I0314 11:35:36.700752       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304]
I0314 11:35:36.700772       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304] should be deleted.
I0314 11:35:36.700778       1 groupsnapshot_helper.go:168] VolumeGroupSnapshotContent[groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304]: the policy is Delete
I0314 11:35:36.700784       1 groupsnapshot_helper.go:233] deleteCSISnapshotOperation [groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304] started
E0314 11:35:36.704116       1 groupsnapshot_helper.go:149] could not sync group snapshot content "groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304": failed to get snapshot content snapcontent-dbb73839488e549fc8e7e6b61a908f3c29d31cc61598fd9b626ea64b4dbf365d-2024-03-14-11.13.17 from snapshot content store: volumesnapshotcontent.snapshot.storage.k8s.io "snapcontent-dbb73839488e549fc8e7e6b61a908f3c29d31cc61598fd9b626ea64b4dbf365d-2024-03-14-11.13.17" not found
I0314 11:35:36.704155       1 groupsnapshot_helper.go:71] Failed to sync group snapshot content "groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304", will retry again: failed to get snapshot content snapcontent-dbb73839488e549fc8e7e6b61a908f3c29d31cc61598fd9b626ea64b4dbf365d-2024-03-14-11.13.17 from snapshot content store: volumesnapshotcontent.snapshot.storage.k8s.io "snapcontent-dbb73839488e549fc8e7e6b61a908f3c29d31cc61598fd9b626ea64b4dbf365d-2024-03-14-11.13.17" not found
Madhu-1 (Contributor, Author) commented Mar 28, 2024

This should also be fixed by #1011, which records the VolumeGroupSnapshot name in the VolumeSnapshotContent status.

Madhu-1 closed this as completed Mar 28, 2024
Madhu-1 (Contributor, Author) commented Apr 10, 2024

It looks like this is not fixed yet; reopening.

Madhu-1 reopened this Apr 10, 2024
jedops commented May 15, 2024

Hello, I'm having some difficulty understanding the details of this issue. Would this issue present itself as a failure to delete a snapshot, or could it somehow accidentally delete a snapshot? Would somebody be kind enough to explain?

Thanks!

Madhu-1 (Contributor, Author) commented May 17, 2024

> Hello, I'm having some difficulty understanding the details of this issue. Would this issue present itself as a failure to delete a snapshot, or could it somehow accidentally delete a snapshot? Would somebody be kind enough to explain?
>
> Thanks!

@jedops the snapshots are deleted internally when the VolumeGroupSnapshot is deleted; I have provided steps to reproduce and some logs as well. As you can see, some checks are missing to skip already-deleted snapshots, or we need to reorder the steps for deleting the snapshots that were created as part of a VolumeGroupSnapshot.
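
For illustration only (this is not the controller's actual code): the Aborted error in the logs above arrives as a gRPC status, and a delete path could classify it as retryable so the per-snapshot delete is simply retried while the in-flight group delete finishes. The package and helper names here are hypothetical:

package snapshotutil

import (
    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/status"
)

// isRetryableDeleteError reports whether a DeleteSnapshot failure should be
// retried rather than treated as terminal. codes.Aborted is what the driver
// returns while another operation (here, the group delete) already holds the
// lock for the same snapshot ID.
func isRetryableDeleteError(err error) bool {
    st, ok := status.FromError(err)
    if !ok {
        return false // not a gRPC status error; treat as terminal
    }
    switch st.Code() {
    case codes.Aborted, codes.Unavailable:
        return true
    default:
        return false
    }
}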

k8s-triage-robot commented

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label Aug 15, 2024
yati1998 (Contributor) commented

Just an update: I tested the creation and deletion of a VolumeGroupSnapshot with the CephFS driver, and it seems to work fine. To re-confirm, I tried it again:

yatipadia:ceph-csi$ kubectl get volumegroupsnapshot
NAME                       READYTOUSE   VOLUMEGROUPSNAPSHOTCLASS          VOLUMEGROUPSNAPSHOTCONTENT                              CREATIONTIME   AGE
new-groupsnapshot-demo-1   true         csi-cephfsplugin-groupsnapclass   groupsnapcontent-b8b1c10d-5c07-47c3-bc36-42d4294628e4   5h47m          5h47m
yatipadia:ceph-csi$ kubectl delete volumegroupsnapshot new-groupsnapshot-demo-1
volumegroupsnapshot.groupsnapshot.storage.k8s.io "new-groupsnapshot-demo-1" deleted

yati1998 (Contributor) commented

Just an update: I tried the same with 10-11 PVCs, and the VolumeGroupSnapshot was successfully deleted.

yatipadia:Documents$ kubectl get volumesnapshotcontent
NAME                                                                                              READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER                          VOLUMESNAPSHOTCLASS   VOLUMESNAPSHOT                                                                                 VOLUMESNAPSHOTNAMESPACE   AGE
snapcontent-114d4ee02d9142894694e5f0d923333c1c840ec22baccf06bdef58d2d66bc1e1-2024-08-30-5.33.1    true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-114d4ee02d9142894694e5f0d923333c1c840ec22baccf06bdef58d2d66bc1e1-2024-08-30-5.33.1    default                   4m58s
snapcontent-1aacbcb8f80a5c913552ca17c2119725081584242e63e3e6a981a5e96ba95e94-2024-08-30-5.33.5    true         2147483648    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-1aacbcb8f80a5c913552ca17c2119725081584242e63e3e6a981a5e96ba95e94-2024-08-30-5.33.5    default                   4m53s
snapcontent-4a7d3f27fc5655a210c9dd2228c6d9f6db722446334ce9e0108f6f11640ffd75-2024-08-30-5.33.9    true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-4a7d3f27fc5655a210c9dd2228c6d9f6db722446334ce9e0108f6f11640ffd75-2024-08-30-5.33.9    default                   4m50s
snapcontent-55c4157c9dfeb50d176631205bd87afaa7d60059d64ac26096361f009197063b-2024-08-30-5.33.1    true         1073741824    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-55c4157c9dfeb50d176631205bd87afaa7d60059d64ac26096361f009197063b-2024-08-30-5.33.1    default                   4m58s
snapcontent-7d10cafc30b34e5adf253ed8b57da6d0b4718fda80a4e4a048d7e584a31e1e2b-2024-08-30-5.33.2    true         2147483648    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-7d10cafc30b34e5adf253ed8b57da6d0b4718fda80a4e4a048d7e584a31e1e2b-2024-08-30-5.33.2    default                   4m57s
snapcontent-817932568eb28072640eb89ecfca12ab4c4c7503589ba729e91dfc9efa1d50b6-2024-08-30-5.33.10   true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-817932568eb28072640eb89ecfca12ab4c4c7503589ba729e91dfc9efa1d50b6-2024-08-30-5.33.10   default                   4m49s
snapcontent-8dd24fbc3ffd96b1f879c2e8a92ed15edc364792186dd2740739cec1d7887365-2024-08-30-5.33.8    true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-8dd24fbc3ffd96b1f879c2e8a92ed15edc364792186dd2740739cec1d7887365-2024-08-30-5.33.8    default                   4m51s
snapcontent-96fa767049e8bc9e62e60af70e373323f5c23fb5aa971224894c67ad98c60e09-2024-08-30-5.33.11   true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-96fa767049e8bc9e62e60af70e373323f5c23fb5aa971224894c67ad98c60e09-2024-08-30-5.33.11   default                   4m47s
snapcontent-ac67c44cb87ea6968bd6075cc6a48981692fb38cf462c80794073db84a2a590b-2024-08-30-5.33.7    true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-ac67c44cb87ea6968bd6075cc6a48981692fb38cf462c80794073db84a2a590b-2024-08-30-5.33.7    default                   4m52s
snapcontent-d3fc2e9cdf6662234b820ddeabb8b32792ec17223020ca73da808d7487851779-2024-08-30-5.33.4    true         2147483648    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-d3fc2e9cdf6662234b820ddeabb8b32792ec17223020ca73da808d7487851779-2024-08-30-5.33.4    default                   4m55s
snapcontent-ef5edb9c4d1603d3a8688d52492cad28c5a6fec4bdb57dffc57f8f6c0e6dfae4-2024-08-30-5.33.3    true         1073741824    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-ef5edb9c4d1603d3a8688d52492cad28c5a6fec4bdb57dffc57f8f6c0e6dfae4-2024-08-30-5.33.3    default                   4m56s
yatipadia:Documents$ 
yatipadia:Documents$ kubectl get volumegroupsnapshot
NAME                       READYTOUSE   VOLUMEGROUPSNAPSHOTCLASS          VOLUMEGROUPSNAPSHOTCONTENT                              CREATIONTIME   AGE
new-groupsnapshot-demo-1   true         csi-cephfsplugin-groupsnapclass   groupsnapcontent-a767e7e9-46df-407e-b282-d0263d13e45e   5m8s           5m10s
yatipadia:Documents$ kubectl delete volumegroupsnapshot new-groupsnapshot-demo-1
volumegroupsnapshot.groupsnapshot.storage.k8s.io "new-groupsnapshot-demo-1" deleted
yatipadia:Documents$ kubectl get volumesnapshotcontent
No resources found
yatipadia:Documents$ kubectl get volumegroupsnapshot
No resources found in default namespace.
yatipadia:Documents$ kubectl get volumesnapshot
No resources found in default namespace.
yatipadia:Documents$ 

cc @Madhu-1

Madhu-1 (Contributor, Author) commented Sep 2, 2024

Good to hear we don't have this bug anymore; in that case we can close it.

yati1998 (Contributor) commented Sep 4, 2024

@Madhu-1 can you please close this issue as well?

Madhu-1 closed this as completed Sep 4, 2024
yati1998 (Contributor) commented

@Madhu-1 can you re-open this issue? We can use the same issue to track the bug.

k8s-triage-robot commented

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Oct 24, 2024
xing-yang (Collaborator) commented Oct 25, 2024

Can you still reproduce this problem? A snapshot content can't be deleted if the group snapshot handle is in its status. How did you clear the status?
If this problem is still reproducible, we should fix the problem so that this can't happen.
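
For context, a minimal sketch of the guard described above, assuming the v7 client API where VolumeSnapshotContentStatus carries an optional VolumeGroupSnapshotHandle; the package and helper names are hypothetical:

package snapshotutil

import (
    crdv1 "github.com/kubernetes-csi/external-snapshotter/client/v7/apis/volumesnapshot/v1"
)

// isMemberOfGroupSnapshot reports whether this content was created as part of
// a VolumeGroupSnapshot; such a content should not be deleted individually,
// since the group snapshot flow owns its deletion.
func isMemberOfGroupSnapshot(content *crdv1.VolumeSnapshotContent) bool {
    return content.Status != nil &&
        content.Status.VolumeGroupSnapshotHandle != nil &&
        *content.Status.VolumeGroupSnapshotHandle != ""
}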

yati1998 (Contributor) commented

> Can you still reproduce this problem? A snapshot content can't be deleted if the group snapshot handle is in its status. How did you clear the status? If this problem is still reproducible, we should fix the problem so that this can't happen.

We were able to reproduce this in one of our DR clusters and will share the output with you; currently that cluster is destroyed.

yati1998 (Contributor) commented Nov 4, 2024

> Can you still reproduce this problem? A snapshot content can't be deleted if the group snapshot handle is in its status. How did you clear the status? If this problem is still reproducible, we should fix the problem so that this can't happen.
>
> We were able to reproduce this in one of our DR clusters and will share the output with you; currently that cluster is destroyed.

After retesting with the correct version of the snapshotter, we were not able to reproduce this error. So I think we are good to close this issue.
