
Refactor the PusherConsumer interactions #9133

Conversation


@gotjosh gotjosh commented Aug 29, 2024

What this PR does

  • Introduced a new BatchingQueue to make the intention of a queue per shard explicit (see the sketch after this list).
  • Removed the need to call Close now that we push data into TSDB from the main loop. This was confusing, because we were using Close to mean "no more items are coming" and to ensure any incomplete batches were flushed.
  • Renamed and moved the noopPusherCloser, which is an alternative way to push data without any concurrency.
  • Inlined a few methods to remove a level of indirection that made the code harder to understand.
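
To illustrate the per-shard queue idea, here is a minimal, hypothetical sketch of what a BatchingQueue could look like. The type, its methods (AddToBatch, Done, Channel), and the string stand-in for a write request are assumptions made for this sketch and don't necessarily match the actual implementation in pkg/storage/ingest/pusher.go: each shard owns a queue that accumulates series into a batch, emits completed batches on a channel, and uses Done (rather than an overloaded Close) to flush any partial batch and signal that no more items are coming.

package main

import "fmt"

// writeRequest is a stand-in for a batch of timeseries destined for storage.
type writeRequest struct {
	series []string
}

// BatchingQueue accumulates items into a batch and emits completed batches
// on a channel, so a single consumer per shard can push them to storage.
type BatchingQueue struct {
	ch           chan writeRequest
	currentBatch writeRequest
	batchSize    int
}

func NewBatchingQueue(capacity, batchSize int) *BatchingQueue {
	return &BatchingQueue{
		ch:        make(chan writeRequest, capacity),
		batchSize: batchSize,
	}
}

// AddToBatch appends a series to the current batch and flushes the batch to
// the channel once it reaches batchSize.
func (q *BatchingQueue) AddToBatch(series string) {
	q.currentBatch.series = append(q.currentBatch.series, series)
	if len(q.currentBatch.series) >= q.batchSize {
		q.push()
	}
}

// Done flushes any incomplete batch and closes the channel, making "no more
// items are coming" explicit instead of overloading Close for it.
func (q *BatchingQueue) Done() {
	if len(q.currentBatch.series) > 0 {
		q.push()
	}
	close(q.ch)
}

// Channel exposes completed batches to the consuming goroutine.
func (q *BatchingQueue) Channel() <-chan writeRequest { return q.ch }

func (q *BatchingQueue) push() {
	q.ch <- q.currentBatch
	q.currentBatch = writeRequest{}
}

func main() {
	q := NewBatchingQueue(10, 2)
	go func() {
		for i := 0; i < 5; i++ {
			q.AddToBatch(fmt.Sprintf("series-%d", i))
		}
		q.Done()
	}()
	// The consumer (the "main loop" in the PR description) drains batches and
	// would call PushToStorage for each one.
	for batch := range q.Channel() {
		fmt.Println("pushing batch of", len(batch.series), "series to storage")
	}
}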

Which issue(s) this PR fixes or relates to

N/A

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@@ -24,21 +24,28 @@ import (
"github.com/grafana/mimir/pkg/util/spanlogger"
)

const shardForSeriesBuffer = 2000 // TODO dimitarvdimitrov 2000 is arbitrary; the idea is that we don't block the goroutine calling PushToStorage while we're flushing. A linked list with a sync.Cond or something different would also work

type Pusher interface {
PushToStorage(context.Context, *mimirpb.WriteRequest) error
}

type PusherCloser interface {
@gotjosh gotjosh (Contributor, author) left a comment

I also wanted to refactor this, but it turned out to be too big of a change, so I decided against it.
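
The TODO in the snippet above mentions a linked list with sync.Cond as a possible alternative to the fixed-size channel buffer behind shardForSeriesBuffer. Purely as an illustrative sketch of that alternative (none of these names exist in the codebase), an unbounded slice-backed queue guarded by a mutex and a sync.Cond would let the producer append without ever blocking while the consumer is flushing:

package main

import (
	"fmt"
	"sync"
)

// condQueue is an unbounded queue: producers never block on a fixed-size
// buffer; the consumer waits on the condition variable when it is empty.
type condQueue struct {
	mu     sync.Mutex
	cond   *sync.Cond
	items  []string // stand-in for shardable write requests
	closed bool
}

func newCondQueue() *condQueue {
	q := &condQueue{}
	q.cond = sync.NewCond(&q.mu)
	return q
}

// push appends an item and wakes the consumer; it never blocks.
func (q *condQueue) push(item string) {
	q.mu.Lock()
	q.items = append(q.items, item)
	q.mu.Unlock()
	q.cond.Signal()
}

// pop blocks until an item is available, or returns false once the queue is
// closed and drained.
func (q *condQueue) pop() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == 0 && !q.closed {
		q.cond.Wait()
	}
	if len(q.items) == 0 {
		return "", false
	}
	item := q.items[0]
	q.items = q.items[1:]
	return item, true
}

// close marks the queue as finished and wakes any waiting consumer.
func (q *condQueue) close() {
	q.mu.Lock()
	q.closed = true
	q.mu.Unlock()
	q.cond.Broadcast()
}

func main() {
	q := newCondQueue()
	go func() {
		for i := 0; i < 3; i++ {
			q.push(fmt.Sprintf("request-%d", i))
		}
		q.close()
	}()
	for {
		item, ok := q.pop()
		if !ok {
			break
		}
		fmt.Println("flushing", item)
	}
}

The trade-off of this alternative is unbounded memory growth if the consumer falls behind, which is presumably why a bounded buffered channel was chosen instead.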

@dimitarvdimitrov dimitarvdimitrov (Contributor) left a comment

Nice one! I think there's a place where we can have a deadlock and a place where checking the channel every time is too expensive. Otherwise, this is definitely an improvement.

You mentioned a race condition, but I couldn't spot any. Isn't the test failure caused by different series sharding on different CPU architectures?

@gotjosh gotjosh marked this pull request as ready for review September 18, 2024 14:39
@gotjosh gotjosh requested a review from a team as a code owner September 18, 2024 14:39
@dimitarvdimitrov dimitarvdimitrov (Contributor) left a comment

I'll take another look at the whole PR in about 30m; going into a meeting now.

@dimitarvdimitrov dimitarvdimitrov (Contributor) left a comment

OK, I've reviewed the whole PR. I think there's another place where we might lose data.

@gotjosh gotjosh force-pushed the gotjosh/ingester/refactor-test branch from a55e417 to 08a724a on September 19, 2024 15:59
@gotjosh gotjosh force-pushed the gotjosh/ingester/refactor-test branch from 08a724a to 8a8a7c2 on September 19, 2024 16:03
@gotjosh gotjosh merged commit 16f8a12 into dimitar/ingester/consume-latency-push-sharding Sep 19, 2024
27 of 29 checks passed
@gotjosh gotjosh deleted the gotjosh/ingester/refactor-test branch September 19, 2024 17:24