feat: Add `output_bytes` to baseline metrics #18268

2010YOUY01 · 2025-10-24T13:38:50Z

Which issue does this PR close?

Closes #16244 (Add output_bytes metrics to Explain Analyze)

Rationale for this change

Support output_bytes in BaselineMetrics (a set of metrics common to almost all operators).

DataFusion CLI v50.3.0
> explain analyze select * from generate_series(1, 1000000) as t1(v1) order by v1 desc;
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                                            |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | SortExec: expr=[v1@0 DESC], preserve_partitioning=[false], metrics=[output_rows=1000000, elapsed_compute=96.421534ms, output_bytes=7.6 MB, spill_count=0, spilled_bytes=0.0 B, spilled_rows=0, batches_split=0] |
|                   |   ProjectionExec: expr=[value@0 as v1], metrics=[output_rows=1000000, elapsed_compute=34.125µs, output_bytes=7.7 MB]                                                                                            |
|                   |     LazyMemoryExec: partitions=1, batch_generators=[generate_series: start=1, end=1000000, batch_size=8192], metrics=[output_rows=1000000, elapsed_compute=2.262626ms, output_bytes=7.7 MB]                     |
|                   |                                                                                                                                                                                                                 |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.080 seconds.
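
As a rough sanity check on the 7.6 MB reported for SortExec, here is a small sketch. It assumes that `generate_series` yields 64-bit integers (8 bytes per value) and that the displayed MB is 1024-based; neither is stated in this PR, and the helper names are hypothetical.

```rust
/// Bytes needed for `rows` values of `bytes_per_value` bytes each,
/// ignoring validity bitmaps and allocator padding (assumption).
fn raw_value_bytes(rows: u64, bytes_per_value: u64) -> u64 {
    rows * bytes_per_value
}

/// Formats a byte count as 1024-based megabytes with one decimal place.
fn as_mb(bytes: u64) -> String {
    format!("{:.1} MB", bytes as f64 / (1024.0 * 1024.0))
}
```

Under those assumptions, `as_mb(raw_value_bytes(1_000_000, 8))` gives `7.6 MB`. The sketch does not model whatever accounts for the slightly larger 7.7 MB shown for the other two operators.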

Note that it might overestimate memory usage due to a well-known issue. See the following snippet from this PR for details:

    /// Memory usage of all output batches.
    ///
    /// Note: This value may be overestimated. If multiple output `RecordBatch`
    /// instances share underlying memory buffers, their sizes will be counted
    /// multiple times.
    /// Issue: <https://github.com/apache/datafusion/issues/16841>
    output_bytes: Count,

I think this metric provides valuable insight, so it's better for it to overestimate than not exist at all.
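
To illustrate why shared buffers inflate the number, here is a std-only model. It is not the Arrow or DataFusion API: `SharedBufBatch` and both functions are hypothetical stand-ins, with `Arc<Vec<u8>>` playing the role of a reference-counted buffer.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Stand-in for a record batch: a list of reference-counted byte buffers.
/// (Hypothetical type; real Arrow batches are more involved.)
struct SharedBufBatch {
    buffers: Vec<Arc<Vec<u8>>>,
}

/// Naive accounting: add up every buffer of every batch, so a buffer
/// shared by several batches is counted once per batch.
fn naive_bytes(batches: &[SharedBufBatch]) -> usize {
    batches
        .iter()
        .flat_map(|b| b.buffers.iter())
        .map(|buf| buf.len())
        .sum()
}

/// Deduplicated accounting: count each distinct allocation once,
/// identified by the address the `Arc` points to.
fn distinct_bytes(batches: &[SharedBufBatch]) -> usize {
    let mut seen = HashSet::new();
    batches
        .iter()
        .flat_map(|b| b.buffers.iter())
        .filter(|buf| seen.insert(Arc::as_ptr(*buf)))
        .map(|buf| buf.len())
        .sum()
}

/// Two batches that both reference the same 1024-byte buffer.
fn two_batches_sharing_one_buffer() -> Vec<SharedBufBatch> {
    let shared = Arc::new(vec![0u8; 1024]);
    vec![
        SharedBufBatch { buffers: vec![Arc::clone(&shared)] },
        SharedBufBatch { buffers: vec![shared] },
    ]
}
```

For the two batches built by `two_batches_sharing_one_buffer()`, the naive sum is 2048 bytes while only 1024 bytes are actually allocated. A per-batch size counter behaves like the naive version.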

What changes are included in this PR?

Add output_bytes to BaselineMetrics, set to the summary analyze level (see the config datafusion.explain.analyze_level for details).
This metric is tracked automatically through the record_poll() API, a common interface that most operators use when a new output batch is generated.
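
The tracking idea can be sketched with a simplified, std-only model. This is not the DataFusion implementation: `SizedBatch`, `ToyBaselineMetrics`, and the `String` error type are hypothetical, and only the general shape of a `record_poll`-style pass-through is shown.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::task::Poll;

/// Toy stand-in for a record batch (hypothetical; not the Arrow type).
struct SizedBatch {
    rows: usize,
    bytes: usize,
}

/// Simplified model of a baseline metrics set with two counters.
#[derive(Default)]
struct ToyBaselineMetrics {
    output_rows: AtomicUsize,
    output_bytes: AtomicUsize,
}

impl ToyBaselineMetrics {
    /// Inspects a poll result on its way out of an operator and updates
    /// the counters only when a new batch is ready. The result is passed
    /// through unchanged, so callers can wrap their return value with it.
    fn record_poll(
        &self,
        poll: Poll<Option<Result<SizedBatch, String>>>,
    ) -> Poll<Option<Result<SizedBatch, String>>> {
        if let Poll::Ready(Some(Ok(batch))) = &poll {
            self.output_rows.fetch_add(batch.rows, Ordering::Relaxed);
            self.output_bytes.fetch_add(batch.bytes, Ordering::Relaxed);
        }
        poll
    }
}

/// Feeds a pending poll, two ready batches, and end-of-stream through
/// `record_poll`, then returns the accumulated (rows, bytes).
fn demo_totals() -> (usize, usize) {
    let metrics = ToyBaselineMetrics::default();
    let polls = vec![
        Poll::Pending,
        Poll::Ready(Some(Ok(SizedBatch { rows: 8192, bytes: 65536 }))),
        Poll::Ready(Some(Ok(SizedBatch { rows: 576, bytes: 4608 }))),
        Poll::Ready(None),
    ];
    for poll in polls {
        let _ = metrics.record_poll(poll);
    }
    (
        metrics.output_rows.load(Ordering::Relaxed),
        metrics.output_bytes.load(Ordering::Relaxed),
    )
}
```

In this model, `demo_totals()` returns `(8768, 70144)`: the pending poll and the end-of-stream marker leave the counters untouched. Because the counting happens in the one place every batch passes through, operators that already call such a method pick up the new metric without further changes.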

Are these changes tested?

Yes, by unit tests (UT).

Are there any user-facing changes?

comphead · 2025-10-26T05:20:02Z

datafusion/physical-plan/src/metrics/baseline.rs

+    /// multiple times.
+    /// Issue: <https://github.com/apache/datafusion/issues/16841>
+    output_bytes: Count,
+    // Remember to update `docs/source/user-guide/metrics.md` when updating comments


👍 hope people read it before making changes

comphead

thanks @2010YOUY01


From refactor: include metric output_batches into BaselineMetrics #18491, which references this PR:

## Which issue does this PR close?

- Closes #17027

## Rationale for this change

`output_batches` should be a common metric in all operators, and thus should ideally be added to `BaselineMetrics`.

```
> explain analyze select * from generate_series(1, 1000000) as t1(v1) order by v1 desc;
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                                                                 |
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | SortExec: expr=[v1@0 DESC], preserve_partitioning=[false], metrics=[output_rows=1000000, elapsed_compute=535.320324ms, output_bytes=7.6 MB, output_batches=123, spill_count=0, spilled_bytes=0.0 B, spilled_rows=0, batches_split=0] |
|                   |   ProjectionExec: expr=[value@0 as v1], metrics=[output_rows=1000000, elapsed_compute=208.379µs, output_bytes=7.7 MB, output_batches=123]                                                                                            |
|                   |     LazyMemoryExec: partitions=1, batch_generators=[generate_series: start=1, end=1000000, batch_size=8192], metrics=[output_rows=1000000, elapsed_compute=15.924291ms, output_bytes=7.7 MB, output_batches=123]                     |
|                   |                                                                                                                                                                                                                                      |
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.492 second
```

## What changes are included in this PR?

- Added `output_batches` into `BaselineMetrics` with `DEV` MetricType
- Tracked through `record_poll()` API
- Changes are similar to #18268
- Refactored `assert_metrics` macro to take multiple metrics strings for substring check
- Added `output_bytes` and `output_batches` tracking in `TopK` operator
- Added `baseline` metrics for `RepartitionExec`

## Are these changes tested?

Added UT

## Are there any user-facing changes?

Changes in the `EXPLAIN ANALYZE` output: `output_batches` will be added to `metrics=[...]`
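
A macro that checks several metric substrings at once might look like the following sketch. `assert_contains_all!` is a hypothetical stand-in, not the project's actual `assert_metrics` macro, whose signature is not shown here.

```rust
/// Hypothetical helper: asserts that `$haystack` contains every listed
/// substring, reporting the first one that is missing.
macro_rules! assert_contains_all {
    ($haystack:expr, $($needle:expr),+ $(,)?) => {{
        let haystack: &str = &$haystack;
        $(
            assert!(
                haystack.contains($needle),
                "expected to find {:?} in {:?}",
                $needle,
                haystack
            );
        )+
    }};
}

/// Metrics fragment copied from the ProjectionExec line of the
/// EXPLAIN ANALYZE output quoted in this description.
fn sample_metrics() -> String {
    String::from(
        "metrics=[output_rows=1000000, elapsed_compute=208.379µs, \
         output_bytes=7.7 MB, output_batches=123]",
    )
}
```

With this, a test can write `assert_contains_all!(sample_metrics(), "output_bytes=7.7 MB", "output_batches=123")` instead of one assertion per metric.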


add output_bytes to baseline metrics

5b21984

github-actions bot added the documentation, core, and physical-plan labels Oct 24, 2025

comphead reviewed Oct 26, 2025


comphead approved these changes Oct 26, 2025


2010YOUY01 added this pull request to the merge queue Oct 27, 2025

Merged via the queue into apache:main with commit 4ecccde Oct 27, 2025
33 checks passed

nmbr7 mentioned this pull request Nov 5, 2025

refactor: include metric output_batches into BaselineMetrics #18491

Merged
