Improve ingester reads #1550
Labels: component/loki, keepalive, type/enhancement
When running queries across high-throughput streams, we can see that deduplicating ingester data in the querier hurts performance significantly and is time-consuming.
For example, one query ran for 38s, with these logs as supporting information:
As you can see, the ingesters seem to send tons of duplicates, causing the slowdown.
Currently we query ingesters for the whole time range of the request.
I propose that we find the latest chunk time for each stream in the storage and use that as part of the query sent to the ingesters, minimising the results returned by ingesters to only what we don't already have in the storage.
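A minimal sketch of the first half of that idea, building the per-stream "latest chunk time" map and clamping the ingester query start with it. The `storeChunk` type and the helper names here are assumptions for illustration, not the real Loki types or APIs:

```go
package sketch

import "time"

// storeChunk is a stand-in (assumed, not Loki's real type) for a chunk
// reference returned by the store: which stream it belongs to and the end
// time of the data it covers.
type storeChunk struct {
	Stream  string    // stream labels, e.g. `{app="foo"}`
	Through time.Time // end time of the chunk
}

// latestChunkTimes maps each stream to the end time of its newest stored chunk.
func latestChunkTimes(chunks []storeChunk) map[string]time.Time {
	latest := make(map[string]time.Time)
	for _, c := range chunks {
		if t, ok := latest[c.Stream]; !ok || c.Through.After(t) {
			latest[c.Stream] = c.Through
		}
	}
	return latest
}

// ingesterStart clamps the request start for a stream: anything older than the
// newest stored chunk would only produce duplicates, so skip it.
func ingesterStart(reqStart time.Time, stream string, latest map[string]time.Time) time.Time {
	if t, ok := latest[stream]; ok && t.After(reqStart) {
		return t
	}
	return reqStart
}
```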
With this map of metric name to start time, we should be able to build a different stream iterator here: https://github.com/grafana/loki/blob/master/pkg/ingester/instance.go#L203, one that uses a better start time.
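To illustrate what such an iterator could look like, here is a hedged sketch of a wrapper that drops entries older than a per-stream start time instead of the request-wide start. The `entry` and `entryIterator` types are simplified stand-ins for the real ingester/iter types, not the actual interfaces in instance.go:

```go
package sketch

import "time"

type entry struct {
	Timestamp time.Time
	Line      string
}

// entryIterator is a simplified stand-in for Loki's entry iterator interface.
type entryIterator interface {
	Next() bool
	Entry() entry
}

// startBoundedIterator wraps an iterator and skips entries older than start,
// i.e. entries the querier would already get from the storage.
type startBoundedIterator struct {
	inner entryIterator
	start time.Time
}

func newStartBoundedIterator(inner entryIterator, start time.Time) *startBoundedIterator {
	return &startBoundedIterator{inner: inner, start: start}
}

func (it *startBoundedIterator) Next() bool {
	for it.inner.Next() {
		if !it.inner.Entry().Timestamp.Before(it.start) {
			return true
		}
	}
	return false
}

func (it *startBoundedIterator) Entry() entry { return it.inner.Entry() }
```

The start time passed to `newStartBoundedIterator` would come from the per-stream map sketched above.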
/cc @gouthamve @slim-bean @owen-d
It should be noted, though, that this might not improve performance by much, since we run with: