-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add BaselineMetrics, Timestamp metrics, add for CoalescePartitionsExec
, rename output_time -> elapsed_compute
#909
Conversation
f4eaac9
to
248d113
Compare
CoalescePartitionsExec
8bce308
to
f6ad883
Compare
@@ -43,12 +45,17 @@ use pin_project_lite::pin_project; | |||
pub struct CoalescePartitionsExec { | |||
/// Input execution plan | |||
input: Arc<dyn ExecutionPlan>, | |||
/// Execution metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the pattern that instrumented operators now follow:
- They have a new
metrics: ExecutionPlanMetricsSet
field - During
execute()
they create newBaselineMetrics
- They add
elapsed_compute
timer during their CPU intensive work
b6436ac
to
3c7b845
Compare
3c7b845
to
e5c5c07
Compare
e5c5c07
to
2874261
Compare
CoalescePartitionsExec
CoalescePartitionsExec
, rename output_time -> elapsed_compute
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can split this PR into smaller parts for easier review (introduce Timestamp
rename to ElapsedCompute
and add BaselineMetrics
but I have them all together in a single PR to show how they work together to record the basic information.
|
||
use super::{Count, ExecutionPlanMetricsSet, MetricBuilder, Time, Timestamp}; | ||
|
||
/// Helper for creating and tracking common "baseline" metrics for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the core new structure / wrapper as suggested by @houqp -- it doesn't quite wrap everything yet but I think it makes annotating basic information pretty simple
I have a few PRs backed up on this one (e.g. #938) so if someone has time to review this PR in the near term I would be most appreciative 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I skimmed over this but LGTM
I'll give it until tomorrow in case anyone else wants to chime in or say they are interested in reviewing it . Thanks @andygrove |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like a great start 👍
Thanks all for the input |
Which issue does this PR close?
First part of #866
(see #938 and https://github.com/influxdata/influxdb_iox/pull/2387 for examples of how using this API works)
Rationale for this change
I would like to be able get an overall understanding of where time is being spent during query execution via
EXPLAIN ANALYZE
(see #858) so that I know where to focuse additional performance optimization activitiesAdditionally, I would like to be able to graph a stacked flamechart such as the following see more details on https://github.com/influxdata/influxdb_iox/issues/2273) that shows when the different operators ran in relation to each other.
What changes are included in this PR?
Begin adding the following data for each operator, as it makes sense
execute
was runChanges
cpu_time
toelapsed_compute
and documents clearly what it is and is not (as @mkmik pointed out that traditionally CPU time refer to a specific type of clock that is not what is implemented here)BaselineMetrics
structure that has common metrics to assist in annotatingTimestamp
metric type andStartTimestamp
andEndTimestamp
values and associated aggregating codeAre there any user-facing changes?
Better
ANALYZE
output.Using this setup
Run CLI
And then run this SQL:
Before this PR
After this PR
(note the presence of
output_rows
,elapsed_compute
,start_timestamp
, andend_timestamp
)If you use
VERBOSE
you can also see the start / end timestamps among many other thingsFollow on work:
BaselineMetrics
is a start in this direction but there is still more to go