kafka replay speed: add bytes limit for inflight fetch requests #9892
Conversation
Thanks so much, @dimitarvdimitrov !
@@ -1043,6 +1084,12 @@ func newReaderMetrics(partitionID int32, reg prometheus.Registerer, bufferedReco
		lastConsumedOffset: lastConsumedOffset,
		kprom:              NewKafkaReaderClientMetrics(component, reg),
	}

	m.Service = services.NewTimerService(100*time.Millisecond, nil, func(context.Context) error {
I can't see anyone starting/stopping this service.
it's started here
mimir/pkg/storage/ingest/reader.go
Lines 219 to 228 in 314647e
r.dependencies, err = services.NewManager(r.committer, r.offsetReader, r.consumedOffsetWatcher, startOffsetReader, r.metrics)
if err != nil {
	return errors.Wrap(err, "creating service manager")
}
// Use context.Background() because we want to stop all dependencies when the PartitionReader stops
// instead of stopping them when ctx is cancelled and while the PartitionReader is still running.
err = services.StartManagerAndAwaitHealthy(context.Background(), r.dependencies)
if err != nil {
	return errors.Wrap(err, "starting service manager")
}
and then stopped here
mimir/pkg/storage/ingest/reader.go
Lines 280 to 285 in 314647e
func (r *PartitionReader) stopDependencies() error {
	if r.dependencies != nil {
		if err := services.StopManagerAndAwaitStopped(context.Background(), r.dependencies); err != nil {
			return errors.Wrap(err, "stopping service manager")
		}
	}
Nice work and nice tests! Pre-approving, but please remember to start/stop the reader metrics service.
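For context, a minimal, self-contained sketch of the pattern under discussion: a timer-based metrics service registered in a dskit service manager so that it is started and stopped together with the reader's other dependencies. The readerMetrics type below is a simplified stand-in for illustration, not the PR's actual code.

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/grafana/dskit/services"
)

// readerMetrics is a hypothetical stand-in for the reader metrics type: it embeds
// a timer service that periodically runs an update function.
type readerMetrics struct {
	services.Service
}

func newReaderMetrics() *readerMetrics {
	m := &readerMetrics{}
	// Run the update function every 100ms; no dedicated starting/stopping functions.
	m.Service = services.NewTimerService(100*time.Millisecond, nil, func(context.Context) error {
		// update gauges here
		return nil
	}, nil)
	return m
}

func main() {
	metrics := newReaderMetrics()

	// Registering the metrics service in the dependency manager is what guarantees
	// it is started and stopped along with everything else the manager owns.
	deps, err := services.NewManager(metrics)
	if err != nil {
		panic(err)
	}
	if err := services.StartManagerAndAwaitHealthy(context.Background(), deps); err != nil {
		panic(err)
	}

	time.Sleep(time.Second)

	if err := services.StopManagerAndAwaitStopped(context.Background(), deps); err != nil {
		panic(err)
	}
	fmt.Println("metrics service stopped cleanly")
}
```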
	nextFetch = nextFetch.Next(recordsPerFetch)

case result, moreLeft := <-refillBufferedResult:
	if !moreLeft {
		if pendingResults.Len() > 0 {
We've lost this if check (removeNextResult assumes the list is non-empty). Is it a problem?
To my understanding it's not a problem because, with the new logic, we have the guarantee that if refillBufferedResult is valued, there's at least 1 item in the list (the fetch we're currently reading from, which is set to refillBufferedResult itself). Is my understanding correct?
That's right. Now instead of keeping nextResult as state, we compute it on every iteration. The invariants from before still hold.
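A minimal sketch of the invariant described above (not the actual fetcher code; pendingResults and fetchResult are simplified stand-ins): the refill channel is recomputed from the head of the list on every iteration, and because receiving from a nil channel blocks forever, the select case can only fire when the list is non-empty.

```go
package main

import (
	"container/list"
	"fmt"
)

type fetchResult struct {
	records []string
}

func main() {
	pendingResults := list.New()

	// Simulate two dispatched fetches, each delivering one result on its own channel.
	for i := 0; i < 2; i++ {
		ch := make(chan fetchResult, 1)
		ch <- fetchResult{records: []string{fmt.Sprintf("record-%d", i)}}
		close(ch)
		pendingResults.PushBack(ch)
	}

	for pendingResults.Len() > 0 {
		// Recompute the refill channel on every iteration instead of keeping it as state.
		// If the list were empty, refillBufferedResult would stay nil and its case could
		// never fire, so receiving from it implies the list has at least one element.
		var refillBufferedResult chan fetchResult
		if head := pendingResults.Front(); head != nil {
			refillBufferedResult = head.Value.(chan fetchResult)
		}

		select {
		case result, moreLeft := <-refillBufferedResult:
			if !moreLeft {
				// This fetch is exhausted: removing the head is safe because the
				// invariant above guarantees the list is non-empty here.
				pendingResults.Remove(pendingResults.Front())
				continue
			}
			fmt.Println("consumed", result.records)
		}
	}
}
```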
Problem
OOMs happen because the concurrency * records-per-fetch product is too large. We hold too many records in memory and don't consume them as fast as we fetch them. The goal is to have a setting to control the memory consumption of the process. In the future it may be scaled with the ingester's memory request, for example.

Proposal
Instead of configuring records-per-fetch, we should configure the maximum memory for all fetchers. We can work backwards to devise the number of records per fetchWant from the number of concurrent fetchers.

I propose to remove these flags and replace them with a single flag.
Each fetchWant still needs to have a well-defined startOffset and endOffset. The number of records of each fetchWant will be ($max_inflight_bytes / $startup_concurrency) / $bytes_per_record. We already track the average bytes_per_record of the last few fetches, but the value there can be volatile. Maybe we can also invest in better tracking of the average bytes.
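As an illustration of the proposed computation (the function name, the rounding, and the concrete numbers below are assumptions for this sketch, not the PR's implementation):

```go
package main

import "fmt"

// recordsPerFetchWant works backwards from the global inflight-bytes budget to the
// number of records each fetchWant should ask for:
// ($max_inflight_bytes / $concurrency) / $bytes_per_record.
func recordsPerFetchWant(maxInflightBytes, concurrency, avgBytesPerRecord int64) int64 {
	if concurrency <= 0 || avgBytesPerRecord <= 0 {
		return 1
	}
	records := (maxInflightBytes / concurrency) / avgBytesPerRecord
	if records < 1 {
		records = 1 // always make some progress, even with a pessimistic estimate
	}
	return records
}

func main() {
	// Hypothetical numbers: a 100 MiB budget, 20 concurrent fetchers, and an average
	// of ~500 bytes per record (tracked from recent fetches, so possibly volatile).
	fmt.Println(recordsPerFetchWant(100<<20, 20, 500)) // 10485
}
```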
What this PR does
This PR is the first step towards doing the above: adding a limit for the inflight bytes.

The limit works by controlling the fetchWants. We don't dispatch a fetchWant if its MaxBytes would exceed the limit. The idea is that the MaxBytes of the request is usually a good estimate of how much data we have to fetch. This has the caveat that if the estimate is bad, we'd continue and still risk OOMing.
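A simplified sketch of that behaviour (names and structure are assumptions for illustration, not the fetcher's actual code): dispatching reserves the fetchWant's MaxBytes against the limit, and a fetchWant that would exceed the limit stays queued until earlier fetches complete.

```go
package main

import "fmt"

// inflightLimiter tracks the MaxBytes of dispatched fetchWants and refuses to
// dispatch a new one if doing so would exceed the configured limit. Because
// MaxBytes is only an estimate, actual memory use can still overshoot it.
type inflightLimiter struct {
	maxInflightBytes int64
	inflightBytes    int64
}

// tryDispatch reserves maxBytes if the fetchWant fits within the limit; the caller
// releases the bytes once the fetch completes.
func (l *inflightLimiter) tryDispatch(maxBytes int64) bool {
	if l.inflightBytes+maxBytes > l.maxInflightBytes {
		return false // keep the fetchWant queued and retry later
	}
	l.inflightBytes += maxBytes
	return true
}

func (l *inflightLimiter) release(maxBytes int64) {
	l.inflightBytes -= maxBytes
}

func main() {
	l := &inflightLimiter{maxInflightBytes: 64 << 20} // hypothetical 64 MiB limit

	fmt.Println(l.tryDispatch(40 << 20)) // true: 40 MiB fits within 64 MiB
	fmt.Println(l.tryDispatch(40 << 20)) // false: would put 80 MiB in flight

	l.release(40 << 20)                  // first fetch completed
	fmt.Println(l.tryDispatch(40 << 20)) // true again
}
```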
I didn't add two separate limits for startup and ongoing because we're currently setting them both to the same value and the short/medium-term plan is to only have a single config option.
Next steps
The next step would be to remove the records-per-fetch config options altogether and only rely on max bytes.
Note to reviewers
This is based on #9891 because that PR fixes flaky tests. The changes aren't otherwise dependent.
Checklist
- CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
- about-versioning.md updated with experimental features.