-
Notifications
You must be signed in to change notification settings - Fork 802
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor Cortex querier to use Queryable interface directly #2060
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pstibrany for this refactoring! Looking at this refactoring it's quite clear how much we need it, to make the read path easier to maintain and evolve (ie. see the struggles we're experiencing with the blocks storage). I left few comments here and there, but the design looks good.
} | ||
return metricsToSeriesSet(ms), nil, nil | ||
|
||
otherSets = append(otherSets, chunksSet) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The input sets of NewMergeSeriesSet
must be sorted. How can we guarantee that appending otherSets
and chunksSet
together is still sorted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's series in individual sets that must be sorted in order for merging to work properly.
Series in chunk-based sets are sorted (guaranteed by partitionChunks
).
Non-chunk-based sets that we use are:
- values from ingester when not using streaming – these are returned via
newConcreteSeriesSet
, which sorts the series by label, - after Blocks querier: don't reencode TSDB chunks to Cortex chunks during querying #2061 – series set returned by blocks storage, fortunately it also guarantees order of the series.
Thank you Marco for your review. I believe I've addressed all your comments now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had some comments, mostly about naming and style.
Would like to see the benchmark results before and after, to check it hasn't introduced a performance regression. Particularly in light of #1195
if sp != nil { | ||
mint = sp.Start | ||
maxt = sp.End | ||
// Kludge: Prometheus passes nil SelectParams if it is doing a 'series' operation, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we clarify who is kludging what? I think it's important that Cortex is relying on a behaviour of Prometheus, to make a decision that helps minimise effort at runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied this comment from querier
type, which had direct reference to distributor previously.
Any suggestion on how to improve/change this comment? It seems clear to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can just remove the Kludge:
? :) All in all, we're just describing how Prometheus works and why we handle the special case of sp == nil
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this documented anywhere else, so to me Kludge:
indicates that this is possibly a hack that can break :)
Thanks Bryan for your review. Here are benchmarks of this PR against master: https://gist.github.com/pstibrany/6418ad45e9c982a9e10950c4ad8f12a8 Problem with this benchmark is that many tests don't run long enough to reasonably compare them. I will try to get a better one. (And also that I run them on a rented shared virtual host to avoid blocking my main machine for a long time :-() |
I've rerun benchmarks with https://gist.github.com/pstibrany/96b1509429714605bf59382640bc6032 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tripled checked this PR and LGTM. I left a couple of nits, but nothing is blocking.
if sp != nil { | ||
mint = sp.Start | ||
maxt = sp.End | ||
// Kludge: Prometheus passes nil SelectParams if it is doing a 'series' operation, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can just remove the Kludge:
? :) All in all, we're just describing how Prometheus works and why we handle the special case of sp == nil
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thanks for doing this work!
This makes it easier to use non-chunk-based queryables. On the other hand, chunk-based queryables are still optimized in the way they were before (in mergeSeriesSets). Merged ingester_streaming_queryable.go into distributor_queryable.go. Merged unified_querier.go into querier.go. Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
…erier. Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
…s methods. Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
This PR changes Cortex querier to use Queryable interface as its primary interface it builds on. Previously, Cortex querier was tied to Cortex chunks store, but this has problems when using it with TSDB blocks: Blocks Querier had to reencode TSDB chunks to Cortex chunks, and also ingester streaming support (PR #1845) would need to do the same.
Chunk-based queryables are still optimized in the way they were before (in mergeSeriesSets). The requirement is that they return series that implement new
SeriesWithChunks
interface, which gives newmergeSeriesSets
method an option to use optimizations.This PR also removes some duplicity in the code. First by merging
ingesterQueryable
(used when streaming ingester flag is set) intodistributorQuerier
, which now simply has a flag that decides whether streaming should be used or not. Second, logic fromunifiedChunkQuerier
was moved intoquerier
(merging from multiple chunks-based stores), most of the code was already there anyway.This PR has a follow-up PR that modifies BlocksQuerier to avoid TSDB -> Cortex chunks conversion.
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]