CSI: volume watcher shutdown fixes #12439

tgross · 2022-04-01T21:08:29Z

The volume watcher design was based on deploymentwatcher and drainer,
but has an important difference: we don't want to maintain a goroutine
for the lifetime of the volume. So we stop the volumewatcher goroutine
for a volume when that volume has no more claims to free. But the
shutdown races with updates on the parent goroutine, and it's possible
to drop updates. Fortunately these updates are picked up on the next
core GC job, but we're most likely to hit this race when we're
replacing an allocation and that's the time we least want to wait.

Wait until the volume has "settled" before stopping this goroutine, so
that the race between shutdown and the parent goroutine sending on
`<-updateCh` is pushed to after the window in which we most care about
freeing claims quickly.
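
To illustrate the "settled" idea, here is a minimal sketch of a per-volume goroutine that only exits after no update has arrived for a quiescent window. It is not the code from this PR: `volumeUpdate`, `watchUntilSettled`, `settleDemo`, and the 50ms window are hypothetical names and values chosen for the example.

```go
package main

import (
	"context"
	"time"
)

// volumeUpdate is a hypothetical stand-in for a volume update sent by
// the parent goroutine; it is not Nomad's type.
type volumeUpdate struct {
	ID         string
	PastClaims int
}

// watchUntilSettled processes updates until either the context is
// cancelled or no update has arrived for the quiescent window. It
// returns the number of updates it processed.
func watchUntilSettled(ctx context.Context, updateCh <-chan volumeUpdate,
	quiescent time.Duration, reap func(volumeUpdate)) int {

	timer := time.NewTimer(quiescent)
	defer timer.Stop()

	processed := 0
	for {
		select {
		case <-ctx.Done():
			return processed
		case vol := <-updateCh:
			reap(vol)
			processed++
			// Restart the quiescent window: stop the timer and drain
			// a pending tick (if any) before resetting it.
			if !timer.Stop() {
				select {
				case <-timer.C:
				default:
				}
			}
			timer.Reset(quiescent)
		case <-timer.C:
			// No updates for a full window: the volume has settled,
			// so it is now reasonable to stop this goroutine.
			return processed
		}
	}
}

// settleDemo queues two updates and runs the watcher until it settles.
func settleDemo() int {
	updateCh := make(chan volumeUpdate, 2)
	updateCh <- volumeUpdate{ID: "vol-1", PastClaims: 1}
	updateCh <- volumeUpdate{ID: "vol-1", PastClaims: 0}
	return watchUntilSettled(context.Background(), updateCh,
		50*time.Millisecond, func(volumeUpdate) {})
}
```

The sketch does not remove the race. An update sent just as the timer fires can still be dropped, but that can only happen after a full quiet window rather than in the middle of an allocation replacement.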

* Fixes a resource leak when volumewatchers are no longer needed. The
  volume is nil and can't ever be started again, so the volume's
  `watcher` should be removed from the top-level `Watcher`.
* De-flakes the GC job test: the test throws an error because the
  claimed node doesn't exist and is unreachable. This flaked instead of
  failing consistently because we didn't correctly wait for the first
  pass through the volumewatcher.

  Make the GC job wait for the volumewatcher to reach the quiescent
  timeout window state before running the GC eval under test, so that
  we're sure the GC job's work isn't being picked up by processing one
  of the earlier claims. Update the claims used so that we're sure the
  GC pass won't hit a node unpublish error.
* Adds trace logging to unpublish operations.
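
The resource-leak fix in the first bullet can be pictured with a small sketch: when an update reports that the volume is nil, the per-volume watcher is stopped and its entry is deleted from the parent's map so that it cannot accumulate. The types and methods below (`registry`, `onUpdate`, `count`) are hypothetical stand-ins for illustration, not Nomad's `Watcher` implementation.

```go
package main

import "sync"

// volume and volumeWatcher are hypothetical stand-ins, not Nomad's types.
type volume struct {
	Namespace string
	ID        string
}

type volumeWatcher struct {
	stopped bool
}

// registry plays the role of the top-level Watcher: it owns one
// per-volume watcher, keyed by namespace and volume ID.
type registry struct {
	mu       sync.Mutex
	watchers map[string]*volumeWatcher
}

func newRegistry() *registry {
	return &registry{watchers: make(map[string]*volumeWatcher)}
}

// onUpdate handles a state update for key. A nil volume means the
// volume no longer exists, so its watcher is stopped and removed
// instead of being left in the map forever.
func (r *registry) onUpdate(key string, vol *volume) {
	r.mu.Lock()
	defer r.mu.Unlock()

	if vol == nil {
		if vw, ok := r.watchers[key]; ok {
			vw.stopped = true
			delete(r.watchers, key)
		}
		return
	}
	if _, ok := r.watchers[key]; !ok {
		r.watchers[key] = &volumeWatcher{}
	}
}

// count reports how many per-volume watchers are currently tracked.
func (r *registry) count() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.watchers)
}
```

In this sketch the deletion happens under the same lock that guards creation, so a later update for a re-registered volume with the same key simply creates a fresh watcher.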

shoenig

LGTM

github-actions · 2022-10-16T02:47:22Z

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

tgross added 4 commits April 1, 2022 16:56

CSI: add trace logging to CSI unpublish operations

ec0bb64

CSI: delete volumewatcher when volume is nil

9662260

CSI: update volumes_watcher test for quiescence

bca7d93

tgross added this to the 1.3.0 milestone Apr 1, 2022

tgross force-pushed the b-csi-core-job-gc branch from d99b5bb to beaa779 Compare April 4, 2022 12:29

vercel bot deployed to Preview – nomad-storybook-and-ui April 4, 2022 12:29

vercel bot temporarily deployed to Preview – nomad April 4, 2022 12:29 Inactive

tgross marked this pull request as ready for review April 4, 2022 13:32

tgross requested review from shoenig and jrasell April 4, 2022 13:32

shoenig approved these changes Apr 4, 2022

tgross merged commit f718c13 into main Apr 4, 2022

tgross deleted the b-csi-core-job-gc branch April 4, 2022 14:46

tgross added backport/1.1 labels Apr 4, 2022

tgross mentioned this pull request Apr 5, 2022

raw_exec: make raw exec driver work with cgroups v2 #12419

Merged

lgfa29 added backport/1.1.x and backport/1.2.x labels Apr 19, 2022

This was referenced Apr 19, 2022

Backport of CSI: volume watcher shutdown fixes into release/1.2.x #12698

Closed

Backport of CSI: volume watcher shutdown fixes into release/1.1.x #12699

Closed

lgfa29 added a commit that referenced this pull request Apr 20, 2022

fix #12439 backport

05d53aa

lgfa29 removed stage/needs-backporting labels Apr 20, 2022

shoenig mentioned this pull request May 25, 2022

manual backport docker test fixes to 1.2.x #13117

Closed

shoenig mentioned this pull request Jun 2, 2022

backport 13205 to 1.2.x #13211

Closed

github-actions bot locked as resolved and limited conversation to collaborators Oct 16, 2022
