Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Cortex querier to use Queryable interface directly #2060

Merged
merged 15 commits into from
Feb 17, 2020
Merged

Refactor Cortex querier to use Queryable interface directly #2060

merged 15 commits into from
Feb 17, 2020

Conversation

pstibrany
Copy link
Contributor

This PR changes Cortex querier to use Queryable interface as its primary interface it builds on. Previously, Cortex querier was tied to Cortex chunks store, but this has problems when using it with TSDB blocks: Blocks Querier had to reencode TSDB chunks to Cortex chunks, and also ingester streaming support (PR #1845) would need to do the same.

Chunk-based queryables are still optimized in the way they were before (in mergeSeriesSets). The requirement is that they return series that implement new SeriesWithChunks interface, which gives new mergeSeriesSets method an option to use optimizations.

This PR also removes some duplicity in the code. First by merging ingesterQueryable (used when streaming ingester flag is set) into distributorQuerier, which now simply has a flag that decides whether streaming should be used or not. Second, logic from unifiedChunkQuerier was moved into querier (merging from multiple chunks-based stores), most of the code was already there anyway.

This PR has a follow-up PR that modifies BlocksQuerier to avoid TSDB -> Cortex chunks conversion.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pstibrany for this refactoring! Looking at this refactoring it's quite clear how much we need it, to make the read path easier to maintain and evolve (ie. see the struggles we're experiencing with the blocks storage). I left few comments here and there, but the design looks good.

pkg/querier/chunk_store_queryable.go Outdated Show resolved Hide resolved
pkg/querier/chunk_store_queryable_test.go Outdated Show resolved Hide resolved
pkg/querier/distributor_queryable.go Show resolved Hide resolved
pkg/querier/querier.go Outdated Show resolved Hide resolved
pkg/querier/querier.go Outdated Show resolved Hide resolved
pkg/querier/querier.go Show resolved Hide resolved
}
return metricsToSeriesSet(ms), nil, nil

otherSets = append(otherSets, chunksSet)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The input sets of NewMergeSeriesSet must be sorted. How can we guarantee that appending otherSets and chunksSet together is still sorted?

Copy link
Contributor Author

@pstibrany pstibrany Feb 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's series in individual sets that must be sorted in order for merging to work properly.

Series in chunk-based sets are sorted (guaranteed by partitionChunks).

Non-chunk-based sets that we use are:

  1. values from ingester when not using streaming – these are returned via newConcreteSeriesSet, which sorts the series by label,
  2. after Blocks querier: don't reencode TSDB chunks to Cortex chunks during querying #2061 – series set returned by blocks storage, fortunately it also guarantees order of the series.

pkg/querier/querier.go Show resolved Hide resolved
pkg/querier/distributor_queryable.go Show resolved Hide resolved
pkg/querier/distributor_queryable.go Show resolved Hide resolved
@pstibrany
Copy link
Contributor Author

Thank you Marco for your review. I believe I've addressed all your comments now.

Copy link
Contributor

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had some comments, mostly about naming and style.
Would like to see the benchmark results before and after, to check it hasn't introduced a performance regression. Particularly in light of #1195

pkg/querier/chunk_store_queryable.go Outdated Show resolved Hide resolved
pkg/querier/distributor_queryable.go Outdated Show resolved Hide resolved
if sp != nil {
mint = sp.Start
maxt = sp.End
// Kludge: Prometheus passes nil SelectParams if it is doing a 'series' operation,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we clarify who is kludging what? I think it's important that Cortex is relying on a behaviour of Prometheus, to make a decision that helps minimise effort at runtime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied this comment from querier type, which had direct reference to distributor previously.

Any suggestion on how to improve/change this comment? It seems clear to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can just remove the Kludge:? :) All in all, we're just describing how Prometheus works and why we handle the special case of sp == nil.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this documented anywhere else, so to me Kludge: indicates that this is possibly a hack that can break :)

pkg/querier/querier.go Outdated Show resolved Hide resolved
pkg/util/merger.go Show resolved Hide resolved
pkg/querier/chunk_store_queryable_test.go Show resolved Hide resolved
pkg/querier/lazy_querier.go Show resolved Hide resolved
pkg/querier/querier.go Outdated Show resolved Hide resolved
pkg/querier/querier.go Outdated Show resolved Hide resolved
@pstibrany
Copy link
Contributor Author

pstibrany commented Feb 3, 2020

Would like to see the benchmark results before and after, to check it hasn't introduced a performance regression. Particularly in light of #1195

Thanks Bryan for your review. Here are benchmarks of this PR against master: https://gist.github.com/pstibrany/6418ad45e9c982a9e10950c4ad8f12a8

Problem with this benchmark is that many tests don't run long enough to reasonably compare them. I will try to get a better one. (And also that I run them on a rented shared virtual host to avoid blocking my main machine for a long time :-()

pkg/querier/querier.go Outdated Show resolved Hide resolved
pkg/querier/querier.go Show resolved Hide resolved
@pstibrany
Copy link
Contributor Author

I've rerun benchmarks with -test.run '^$' -test.bench '^Bench.*' -test.v -test.benchmem -test.benchtime=10s parameters.

https://gist.github.com/pstibrany/96b1509429714605bf59382640bc6032

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tripled checked this PR and LGTM. I left a couple of nits, but nothing is blocking.

if sp != nil {
mint = sp.Start
maxt = sp.End
// Kludge: Prometheus passes nil SelectParams if it is doing a 'series' operation,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can just remove the Kludge:? :) All in all, we're just describing how Prometheus works and why we handle the special case of sp == nil.

pkg/querier/querier.go Outdated Show resolved Hide resolved
pkg/querier/querier.go Show resolved Hide resolved
Copy link
Contributor

@thorfour thorfour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks for doing this work!

This makes it easier to use non-chunk-based queryables. On the other
hand, chunk-based queryables are still optimized in the way they were
before (in mergeSeriesSets).

Merged ingester_streaming_queryable.go into distributor_queryable.go.
Merged unified_querier.go into querier.go.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
…erier.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
…s methods.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
@pracucci pracucci merged commit b81d808 into cortexproject:master Feb 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants