
fixing bug in chunk cache that can cause panic during shutdown #4398

Merged: 3 commits merged into cortexproject:master from rsteneteg/fix-ingester-panic on Aug 26, 2021

Conversation

rsteneteg (Contributor) opened this pull request:

Signed-off-by: Roger Steneteg <rsteneteg@ea.com>

What this PR does:
Fixes a bug in the chunk storage cache that can cause a panic in ingesters during shutdown

Which issue(s) this PR fixes:
Fixes #4397
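For context, the panic in question is Go's "send on closed channel" panic (per the commit message later in this PR); a minimal standalone illustration of that failure mode, not Cortex code:

```go
package main

func main() {
	ch := make(chan int)
	close(ch) // once a channel is closed...
	ch <- 1   // ...any send on it panics: "send on closed channel"
}
```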

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Roger Steneteg <rsteneteg@ea.com>
@rsteneteg force-pushed the rsteneteg/fix-ingester-panic branch from f546446 to 2e6b9a5 on August 3, 2021 21:36
@bboreham (Contributor) left a comment:

Good catch! Are there any similar issues in other places where we do bounded parallelism?

One question before I approve.

pkg/chunk/cache/memcached.go (thread resolved)
@bboreham (Contributor) left a comment:

lgtm

The pull-request-size bot added the size/L label and removed the size/M label on Aug 4, 2021
@rsteneteg force-pushed the rsteneteg/fix-ingester-panic branch 2 times, most recently from 1a722ab to 46a1213, on August 4, 2021 18:56
@bboreham (Contributor) left a comment:

Couple of thoughts

select {
case <-c.quit:
	return
default:
A reviewer (Contributor) commented on the diff:

Can this happen: quit is not closed, so we go to default and block on send.
Then quit is closed and all workers exit.
Now we are hung.
Should the chan send be brought up to the select, so either can proceed each time round the loop?
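Roughly, the scenario described above looks like this sketch (simplified; function and variable names are assumed, not the actual Cortex code):

```go
// The quit check and the send are separate steps: a goroutine can take the
// default branch while quit is still open, then block on the send. If quit
// is closed afterwards and the workers exit, nothing ever reads from
// inputCh and this sender hangs.
func enqueue(quit <-chan struct{}, inputCh chan<- []string, batch []string) {
	select {
	case <-quit:
		return
	default:
	}
	inputCh <- batch // can block forever once the workers are gone
}
```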

rsteneteg (Contributor, Author) replied:

I was a bit unsure whether we could use both send and receive cases in the same select, but it seems to be OK since the channels are read/closed from other goroutines, so I moved the send up into a case.
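The reworked shape (again a sketch with assumed names, mirroring the change described above) puts the send and the quit receive in the same select, so whichever case becomes ready first proceeds:

```go
func enqueue(quit <-chan struct{}, inputCh chan<- []string, batch []string) {
	select {
	case <-quit:
		return // shutting down: drop the work instead of blocking
	case inputCh <- batch: // succeeds only while a worker is still reading
	}
}
```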

pkg/chunk/chunk_store_utils.go (thread resolved)
@rsteneteg force-pushed the rsteneteg/fix-ingester-panic branch from 46a1213 to 8b96709 on August 6, 2021 13:18
@rsteneteg marked this pull request as draft on August 6, 2021 14:09
adding separate quit channel for stopping chunkfetcher to avoid send on closed channel during stop

Signed-off-by: Roger Steneteg <rsteneteg@ea.com>
@rsteneteg force-pushed the rsteneteg/fix-ingester-panic branch from 8b96709 to 5a86c1a on August 6, 2021 17:54
@rsteneteg marked this pull request as ready for review on August 6, 2021 18:28
@rsteneteg requested a review from bboreham on August 6, 2021 18:28
@bboreham (Contributor) left a comment:

Thanks!

@pracucci (Contributor) left a comment:

LGTM, thanks!

@@ -239,11 +257,15 @@ func (c *Memcached) Store(ctx context.Context, keys []string, bufs [][]byte) {

 // Stop does nothing.
 func (c *Memcached) Stop() {
-	if c.inputCh == nil {
+	if c.quit == nil {
A reviewer (Contributor) commented on the diff:

[nit] c.quit is set in the new function and I can't see where we ever set it to nil so I'm not sure we need this check.

The reviewer (Contributor) added:

Not a blocker, we can merge anyway.

rsteneteg (Contributor, Author) replied:

If BatchSize or Parallelism is zero then we return the cache before we set the quit/inputCh channels:

if cfg.BatchSize == 0 || cfg.Parallelism == 0 { return c }
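In other words, when batching is disabled the constructor returns before the channels are ever created, so Stop has to tolerate nil channels. A rough sketch of that shape (simplified types and names, not the actual Cortex source):

```go
// Simplified sketch; field and type names are assumed for illustration.
type memcachedSketch struct {
	inputCh chan []string
	quit    chan struct{}
}

func newMemcachedSketch(batchSize, parallelism int) *memcachedSketch {
	c := &memcachedSketch{}
	if batchSize == 0 || parallelism == 0 {
		return c // quit and inputCh remain nil
	}
	c.inputCh = make(chan []string, parallelism)
	c.quit = make(chan struct{})
	// ...start `parallelism` background worker goroutines here...
	return c
}

func (c *memcachedSketch) Stop() {
	if c.quit == nil { // the check discussed above: batching was never enabled
		return
	}
	close(c.quit)
	// (the actual Stop also has to coordinate with the workers; omitted here)
}
```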

The reviewer (Contributor) replied:

Oh thanks! Didn't notice it.

@pracucci pracucci merged commit 2d4d060 into cortexproject:master Aug 26, 2021
@rsteneteg rsteneteg deleted the rsteneteg/fix-ingester-panic branch August 26, 2021 16:15
@bboreham (Contributor) commented on Oct 1, 2021:

I encountered a race warning at #4508, which seems to be a similar case; could you take a look at #4511 please @rsteneteg ?

alvinlin123 pushed a commit to ac1214/cortex that referenced this pull request on Jan 14, 2022:
fixing bug in chunk cache that can cause panic during shutdown (cortexproject#4398)

* fixing bug in chunk cache that can cause panic during shutdown

Signed-off-by: Roger Steneteg <rsteneteg@ea.com>

* adding separate quit channel for stopping chunkfetcher to avoid send on closed channel during stop

Signed-off-by: Roger Steneteg <rsteneteg@ea.com>
Signed-off-by: Alvin Lin <alvinlin@amazon.com>
Successfully merging this pull request may close these issues.

Ingesters panic during shutdown causing data loss
3 participants