Series API impl + index support #2313
Comments
I guess you mean "it only queries ingesters". We query the series in the ingesters' memory. The fact that we're going through the distributors (running within the queriers) is tech debt from the past that we should get rid of (that code should be moved to queriers, and queriers should not run distributors internally).
I'm not sure about exporting these functions, instead of adding a single higher level function to
Eh yeah, that's a bit embarrassing. I meant ingesters 😱
This issue so far covers the chunks storage. I'm very interested in extending the discussion to the blocks storage too, where we could contribute with the implementation.
How does this proposal play with the interval splitting done by the
I don't think it should be adversely affected by splitting in the
Note, we were able to improve the implementation in Loki by using the chunk fingerprints to partition
This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.
What
This issue describes the current problems facing an efficient implementation of the Prometheus Series API and a possible path towards mitigating them.
/cc @gouthamve @cyriltovena
Background
Currently Prometheus supports a Series API which basically has the type Matchers -> [LabelsSet]. Cortex has a special case for this -- it only queries ingesters because the start/end times are not included in the Querier.Select call. This behavior makes sense; it's infeasible to query all series across all time ranges in Cortex.
Possible implementation path
The Queryable interface already has support for bounding the time range. We should be able to use this. *SelectParams are not passed to the Querier.Select call as these boundaries are already encoded via the Queryable.Querier invocation.
This approach also raises some new problems: internally, we resolve a SeriesSet by resolving all chunks for those matchers/time range. This makes sense when we're looking for the timeseries data inside chunks, but is wasteful if we're only concerned with the series themselves.
Starting in the v9 schema, SeriesIDs are encoded in the index. There are even unexported functions lookupSeriesByMetricNameMatchers and lookupChunksBySeries to support this lookup. Therefore, we can extend the chunk.Store interface by exporting these. Then, we'd only need to pull one chunk per series instead of every chunk in the time range in order to check the labels.
The combination of time bounding /series lookups and only needing to pull one chunk per series to check labels should make Series API parity attainable.
Context
Most of this comes from (naively) implementing the Series API in Loki: grafana/loki#1419
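To make the idea under "Possible implementation path" concrete, here is a minimal in-memory sketch of a time-bounded series lookup that fetches only one chunk per series. Everything in it is hypothetical and for illustration only: the Store, ChunkRef, Matcher and Labels types, the equality-only matching, and the method names are assumptions, not the actual chunk.Store interface or index schema. lookupSeries loosely plays the role of resolving series IDs from the index, and fetchLabels stands in for pulling a chunk to read its labels.

```go
package main

import "sort"

// Labels is a simplified label set (hypothetical stand-in for a real label type).
type Labels map[string]string

// Matcher is a hypothetical equality-only matcher on one label pair.
type Matcher struct{ Name, Value string }

// ChunkRef is a hypothetical index entry describing the time range a chunk covers.
type ChunkRef struct {
	ID            string
	From, Through int64 // inclusive bounds, e.g. milliseconds
}

// Store is an in-memory stand-in for an index plus a chunk backend.
type Store struct {
	seriesByLabel  map[Matcher][]string  // label pair -> series IDs (each ID listed once)
	chunksBySeries map[string][]ChunkRef // series ID -> chunk refs
	chunkLabels    map[string]Labels     // chunk ID -> labels stored in that chunk
	Fetches        int                   // number of simulated chunk fetches
}

// lookupSeries returns the IDs of series that satisfy every matcher,
// using only the index. It assumes the matchers are distinct.
func (s *Store) lookupSeries(matchers []Matcher) []string {
	counts := map[string]int{}
	for _, m := range matchers {
		for _, id := range s.seriesByLabel[m] {
			counts[id]++
		}
	}
	var ids []string
	for id, n := range counts {
		if n == len(matchers) {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids) // deterministic order
	return ids
}

// fetchLabels simulates pulling a single chunk to read its labels.
func (s *Store) fetchLabels(ref ChunkRef) Labels {
	s.Fetches++
	return s.chunkLabels[ref.ID]
}

// Series returns the label sets of matching series that have at least one
// chunk overlapping [from, through]. It fetches at most one chunk per series.
func (s *Store) Series(from, through int64, matchers ...Matcher) []Labels {
	var out []Labels
	for _, id := range s.lookupSeries(matchers) {
		for _, ref := range s.chunksBySeries[id] {
			if ref.From <= through && ref.Through >= from {
				out = append(out, s.fetchLabels(ref))
				break // one chunk is enough to learn the labels
			}
		}
	}
	return out
}
```

In this sketch, a series with three chunks in the queried range costs one fetch rather than three, and a series with no chunk in the range costs none. Whether a single higher level method like Series or two exported lookup functions is the better API is left open here, as it is in the discussion above.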