Series API impl + index support #2313

Closed
owen-d opened this issue Mar 20, 2020 · 8 comments

@owen-d
Contributor

owen-d commented Mar 20, 2020

What

This issue describes the current problems facing an efficient implementation of the Prometheus Series API and a possible path towards mitigating them.

/cc @gouthamve @cyriltovena

Background

Currently Prometheus supports a Series API which basically has the type Matchers -> [LabelsSet].

Cortex has a special case for this -- it only queries ingesters because the start/end times are not included in the Querier.Select call. This behavior makes sense; it's infeasible to query all series across all time ranges in Cortex.

Possible implementation path

  1. The Queryable interface already has support for bounding the time range. We should be able to use this.
  2. We can then ignore the fact that *SelectParams are not passed to the Querier.Select call as these boundaries are already encoded via the Queryable.Querier invocation.
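A minimal sketch of the two steps above, using simplified stand-ins rather than the real Prometheus or Cortex types. The names `Matcher`, `Labels`, `store`, and `boundedQuerier` are hypothetical, and matching is equality-only. The point is that the time bounds captured when the querier is created are enough to restrict `Select`, even when no per-call parameters arrive:

```go
package main

import "fmt"

// Matcher is a hypothetical, equality-only stand-in for a label matcher.
type Matcher struct{ Name, Value string }

// Labels is a simplified label set.
type Labels map[string]string

// Querier and Queryable mirror the shape discussed in the issue,
// not the real Prometheus interfaces.
type Querier interface {
	Select(matchers ...Matcher) []Labels
}

type Queryable interface {
	Querier(mint, maxt int64) Querier
}

// series is an in-memory series with the time range it covers.
type series struct {
	labels        Labels
	from, through int64
}

type store struct{ all []series }

// Querier captures the time bounds once, at construction.
func (s store) Querier(mint, maxt int64) Querier {
	return boundedQuerier{store: s, mint: mint, maxt: maxt}
}

type boundedQuerier struct {
	store      store
	mint, maxt int64
}

// Select takes no time parameters: it relies on the bounds stored in the querier.
func (q boundedQuerier) Select(matchers ...Matcher) []Labels {
	var out []Labels
	for _, s := range q.store.all {
		if s.through < q.mint || s.from > q.maxt {
			continue // series does not overlap the queried range
		}
		if matchAll(s.labels, matchers) {
			out = append(out, s.labels)
		}
	}
	return out
}

func matchAll(l Labels, ms []Matcher) bool {
	for _, m := range ms {
		if l[m.Name] != m.Value {
			return false
		}
	}
	return true
}

var _ Queryable = store{}

func main() {
	s := store{all: []series{
		{Labels{"__name__": "up", "job": "a"}, 0, 100},
		{Labels{"__name__": "up", "job": "b"}, 200, 300},
	}}
	got := s.Querier(150, 400).Select(Matcher{"__name__", "up"})
	fmt.Println(len(got), got[0]["job"])
}
```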

This approach also raises a new problem: internally, we resolve a SeriesSet by resolving all chunks for the given matchers and time range. This makes sense when we're looking for the timeseries data inside chunks, but is wasteful if we're only concerned with the series themselves.

Starting in the v9 schema, SeriesIDs are encoded in the index. There are even unexported functions lookupSeriesByMetricNameMatchers and lookupChunksBySeries to support this lookup. Therefore, we can extend the chunk.Store interface by exporting these. Then, we'd only need to pull one chunk per series instead of every chunk in the time range in order to check the labels.
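To illustrate the "one chunk per series" idea, here is a rough sketch that assumes the index lookup yields chunk references tagged with a series ID. The `chunkRef` type and `onePerSeries` function are hypothetical and are not part of chunk.Store; the sketch only shows the reduction step, so that a single chunk per series would need to be fetched to read the labels:

```go
package main

import (
	"fmt"
	"sort"
)

// chunkRef is a hypothetical index entry: which series a chunk belongs to
// and when the chunk starts.
type chunkRef struct {
	SeriesID string
	ChunkID  string
	From     int64
}

// onePerSeries keeps the earliest chunk ref for each series ID, so only
// one chunk per series has to be fetched to recover that series' labels.
func onePerSeries(refs []chunkRef) []chunkRef {
	best := make(map[string]chunkRef)
	for _, r := range refs {
		cur, ok := best[r.SeriesID]
		if !ok || r.From < cur.From {
			best[r.SeriesID] = r
		}
	}
	out := make([]chunkRef, 0, len(best))
	for _, r := range best {
		out = append(out, r)
	}
	// Sort for deterministic output; map iteration order is random.
	sort.Slice(out, func(i, j int) bool { return out[i].SeriesID < out[j].SeriesID })
	return out
}

func main() {
	refs := []chunkRef{
		{"s1", "c1", 10},
		{"s1", "c2", 20},
		{"s2", "c3", 15},
		{"s1", "c4", 5},
	}
	for _, r := range onePerSeries(refs) {
		fmt.Println(r.SeriesID, r.ChunkID)
	}
}
```

With many chunks per series in the queried range, the number of chunk fetches drops from the number of chunks to the number of series.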

The combination of time-bounding /series lookups and only needing to pull one chunk per series to check labels should make Series API parity attainable.

Context

Most of this comes from (naively) implementing the Series API in Loki: grafana/loki#1419

@pracucci
Contributor

Cortex has a special case for this -- it only queries distributors

I guess you mean "it only queries ingesters". We query the series in the ingesters' memory. The fact that we go through the distributors (running within the queriers) is old tech debt that we should get rid of: that code should be moved to the queriers, and queriers should not run distributors internally.

@pracucci
Contributor

There are even unexported functions lookupSeriesByMetricNameMatchers and lookupChunksBySeries to support this lookup. Therefore, we can extend the chunk.Store interface by exporting these.

I'm not sure about exporting these functions. Instead, we could add a single higher-level function to chunk.Store, like GetSeries(ctx context.Context, userID string, from, through model.Time, matchers ...*labels.Matcher). The underlying implementation can then use lookupSeriesByMetricNameMatchers and lookupChunksBySeries if required. In this scenario, Get() could be renamed to GetChunks() for clarity.

@owen-d
Contributor Author

owen-d commented Mar 23, 2020

Cortex has a special case for this -- it only queries distributors

Eh yeah, that's a bit embarrassing. I meant ingesters 😱

@pracucci
Contributor

So far this issue covers the chunks storage. I'm very interested in extending the discussion to the blocks storage too, where we could contribute to the implementation.

@pracucci
Contributor

How does this proposal play with the interval splitting done by the query-frontend? I guess no interval splitting will be done and the query will be executed by a single querier. Is that correct?

@owen-d
Contributor Author

owen-d commented Mar 30, 2020

I don't think it should be adversely affected by splitting in the query-frontend. We could either split the request in the query-frontend and recombine the results, or send it to a single querier. It may be easier to start with one querier and then refactor to fan out if needed.
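For the split-and-recombine option, a rough sketch of the merge step, assuming each sub-interval query returns a list of label sets. The names `interval`, `split`, and `seriesOverSplits` are hypothetical and are not query-frontend code. Since a series that spans several intervals comes back once per interval, the results have to be deduplicated by label set:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type Labels map[string]string

type interval struct{ from, through int64 }

// key builds a canonical string for a label set so duplicates can be detected.
func key(l Labels) string {
	names := make([]string, 0, len(l))
	for n := range l {
		names = append(names, n)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, n := range names {
		fmt.Fprintf(&b, "%q=%q,", n, l[n])
	}
	return b.String()
}

// split cuts [from, through] into consecutive intervals of at most step units.
func split(from, through, step int64) []interval {
	if step <= 0 {
		return []interval{{from, through}}
	}
	var out []interval
	for start := from; start <= through; start += step {
		end := start + step - 1
		if end > through {
			end = through
		}
		out = append(out, interval{start, end})
	}
	return out
}

// seriesOverSplits runs query once per interval and merges the results,
// keeping the first occurrence of each distinct label set.
func seriesOverSplits(from, through, step int64, query func(interval) []Labels) []Labels {
	seen := make(map[string]bool)
	var out []Labels
	for _, iv := range split(from, through, step) {
		for _, l := range query(iv) {
			if k := key(l); !seen[k] {
				seen[k] = true
				out = append(out, l)
			}
		}
	}
	return out
}

func main() {
	// The same series is reported by every sub-interval query.
	query := func(interval) []Labels {
		return []Labels{{"__name__": "up", "job": "a"}}
	}
	fmt.Println(len(split(0, 9, 5)), len(seriesOverSplits(0, 9, 5, query)))
}
```

This version queries the intervals sequentially; fanning out to several queriers would only change how `query` is invoked, not the merge.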

@owen-d
Contributor Author

owen-d commented Apr 8, 2020

Note: we were able to improve the implementation in Loki by using the chunk fingerprints to partition (series, chunks) groupings in grafana/loki#1914. This is likely a reusable approach for Cortex.

@stale

stale bot commented Jun 7, 2020

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jun 7, 2020
@stale stale bot closed this as completed Jun 22, 2020