Description
What
This issue describes the current problems facing an efficient implementation of the Prometheus Series API and a possible path towards mitigating them.
Background
Currently, Prometheus supports a Series API, which basically has the type Matchers -> [LabelsSet].
Cortex has a special case for this: it only queries ingesters, because the start/end times are not included in the Querier.Select call. This behavior makes sense; it's infeasible to query all series across all time ranges in Cortex.
Possible implementation path
- The Queryable interface already has support for bounding the time range. We should be able to use this.
- We can then ignore the fact that *SelectParams are not passed to the Querier.Select call, as these boundaries are already encoded via the Queryable.Querier invocation.
This approach also raises some new problems: internally, we resolve a SeriesSet by resolving all chunks for those matchers/time range. This makes sense when we're looking for the timeseries data inside chunks, but is wasteful if we're only concerned with the series themselves.
Starting in the v9 schema, SeriesIDs are encoded in the index. There are even unexported functions lookupSeriesByMetricNameMatchers and lookupChunksBySeries to support this lookup. Therefore, we can extend the chunk.Store interface by exporting these. Then, we'd only need to pull one chunk per series instead of every chunk in the time range in order to check the labels.
The combination of time-bounded /series lookups and only needing to pull one chunk per series to check labels should make Series API parity attainable.
Context
Most of this comes from (naively) implementing the Series API in Loki: grafana/loki#1419