-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass metrics object for Scan, Timeseries and GroupBy queries during cursor creation #12484
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -27,6 +27,7 @@ | |
import org.apache.druid.query.QueryRunnerFactory; | ||
import org.apache.druid.query.context.ResponseContext; | ||
import org.apache.druid.query.groupby.GroupByQuery; | ||
import org.apache.druid.query.groupby.GroupByQueryMetrics; | ||
import org.apache.druid.query.groupby.GroupByQueryQueryToolChest; | ||
import org.apache.druid.query.groupby.ResultRow; | ||
import org.apache.druid.query.groupby.resource.GroupByQueryResource; | ||
|
@@ -164,7 +165,7 @@ Sequence<ResultRow> processSubtotalsSpec( | |
/** | ||
* Merge a variety of single-segment query runners into a combined runner. Used by | ||
* {@link org.apache.druid.query.groupby.GroupByQueryRunnerFactory#mergeRunners(QueryProcessingPool, Iterable)}. In | ||
* that sense, it is intended to go along with {@link #process(GroupByQuery, StorageAdapter)} (the runners created | ||
* that sense, it is intended to go along with {@link #process(GroupByQuery, StorageAdapter, GroupByQueryMetrics)} (the runners created | ||
* by that method will be fed into this method). | ||
* <p> | ||
* This method is only called on data servers, like Historicals (not the Broker). | ||
|
@@ -187,7 +188,10 @@ Sequence<ResultRow> processSubtotalsSpec( | |
* | ||
* @return result sequence for the storage adapter | ||
*/ | ||
Sequence<ResultRow> process(GroupByQuery query, StorageAdapter storageAdapter); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there a reason not to wire in the GroupByQueryMetrics in the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I didn't do it in those methods since they take the subquery/subtotal result sequence in the method. I think the inlined subquery should execute using a separate query object and hence should emit their own metrics object. For pushed down subqueries, I'm not sure about their metrics framework as of now but my guess is that it should use the same metrics object as the outer one. will check and update There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Is there a test we can implement easily to validate this? If so, that would be great! |
||
Sequence<ResultRow> process( | ||
GroupByQuery query, | ||
StorageAdapter storageAdapter, | ||
@Nullable GroupByQueryMetrics groupByQueryMetrics); | ||
|
||
/** | ||
* Returns whether this strategy supports pushing down outer queries. This is used by | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -41,6 +41,7 @@ | |
import org.apache.druid.query.groupby.GroupByQueryConfig; | ||
import org.apache.druid.query.groupby.GroupByQueryEngine; | ||
import org.apache.druid.query.groupby.GroupByQueryHelper; | ||
import org.apache.druid.query.groupby.GroupByQueryMetrics; | ||
import org.apache.druid.query.groupby.GroupByQueryQueryToolChest; | ||
import org.apache.druid.query.groupby.ResultRow; | ||
import org.apache.druid.query.groupby.orderby.NoopLimitSpec; | ||
|
@@ -51,6 +52,7 @@ | |
import org.apache.druid.segment.incremental.IncrementalIndexStorageAdapter; | ||
import org.joda.time.Interval; | ||
|
||
import javax.annotation.Nullable; | ||
import java.util.ArrayList; | ||
import java.util.HashSet; | ||
import java.util.Set; | ||
|
@@ -233,7 +235,8 @@ public Sequence<ResultRow> apply(Interval interval) | |
outerQuery.withQuerySegmentSpec( | ||
new MultipleIntervalSegmentSpec(ImmutableList.of(interval)) | ||
), | ||
new IncrementalIndexStorageAdapter(innerQueryResultIndex) | ||
new IncrementalIndexStorageAdapter(innerQueryResultIndex), | ||
null | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why not pass in queryMetrics here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I passed null here since passing the query metrics needed changes in the interface - and since this execution is synthetic over the inner query results and would always be non-vectorized (due to the storage adapter used) I skipped it. If you think, I can try to make that change There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think passing this object around would be useful in the future for adding other dimensions. For example, let's say we wanted to know how many group by queries are executed using the v1 strategy vs the v2 strategy, then making the interface changes to pass in the queryMetrics object could be useful. Since the scope of this PR is just to add better support for seeing whether or not queries are vectorized, I think it isn't needed to do this as part of this change, but it could be done in a future change |
||
); | ||
} | ||
} | ||
|
@@ -269,10 +272,14 @@ public QueryRunner<ResultRow> mergeRunners( | |
} | ||
|
||
@Override | ||
public Sequence<ResultRow> process(final GroupByQuery query, final StorageAdapter storageAdapter) | ||
public Sequence<ResultRow> process( | ||
final GroupByQuery query, | ||
final StorageAdapter storageAdapter, | ||
@Nullable final GroupByQueryMetrics groupByQueryMetrics | ||
) | ||
{ | ||
return Sequences.map( | ||
engine.process(query, storageAdapter), | ||
engine.process(query, storageAdapter, groupByQueryMetrics), | ||
row -> GroupByQueryHelper.toResultRow(query, row) | ||
); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can you verify the metrics returned by calling
engine.process(..)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the verification is done in *QueryRunnerTests for all the query types. This is mostly to exercise the metrics code more