Add benchmarks to window function queries #564

Merged: 1 commit into apache:master on Jun 22, 2021

@jimexist jimexist commented Jun 15, 2021

Which issue does this PR close?

Based on #558 so review that first.
Closes #565

Rationale for this change

What changes are included in this PR?

Benchmarking window_empty_over with aggregate functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.7s, enable flat sampling, or reduce sample count to 50.
window_empty_over with aggregate functions
                        time:   [1.8621 ms 1.8736 ms 1.8860 ms]
Found 12 outliers among 100 measurements (12.00%)
  6 (6.00%) high mild
  6 (6.00%) high severe

Benchmarking window_empty_over with built-in functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.5s, enable flat sampling, or reduce sample count to 50.
window_empty_over with built-in functions
                        time:   [1.4094 ms 1.4156 ms 1.4223 ms]
Found 15 outliers among 100 measurements (15.00%)
  7 (7.00%) high mild
  8 (8.00%) high severe

window_order_by with aggregate functions
                        time:   [9.0420 ms 9.0948 ms 9.1481 ms]
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild

window_order_by with built-in functions
                        time:   [8.1369 ms 8.1992 ms 8.2647 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

window_partition_by with aggregate functions
                        time:   [6.4131 ms 6.4554 ms 6.4980 ms]

window_partition_by with built-in functions
                        time:   [5.9691 ms 5.9986 ms 6.0299 ms]
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe

Benchmarking window_partition_by_order_by with aggregate functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.1s, or reduce sample count to 80.
window_partition_by_order_by with aggregate functions
                        time:   [61.314 ms 61.728 ms 62.161 ms]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

window_partition_by_order_by with built-in functions
                        time:   [17.538 ms 17.666 ms 17.861 ms]
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
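
The "Unable to complete 100 samples in 5.0s" warnings above can be roughly reproduced by hand. The sketch below assumes Criterion's documented linear sampling mode, where sample `i` runs `i * d` iterations, and takes the smallest factor `d = 1`. Flat sampling runs the same iteration count for every sample. The function names are hypothetical helpers, not part of Criterion's API, and Criterion bases its own suggestion on warm-up measurements, so the figures only approximately match the log.

```rust
/// Total iterations across all samples under linear sampling:
/// sample i runs i * d iterations, so the sum is d * n * (n + 1) / 2.
fn linear_total_iters(samples: u64, d: u64) -> u64 {
    d * samples * (samples + 1) / 2
}

/// Total iterations under flat sampling: every sample runs the same count.
fn flat_total_iters(samples: u64, iters_per_sample: u64) -> u64 {
    samples * iters_per_sample
}

/// Estimated wall-clock seconds for a run, given a per-iteration time in ms.
fn estimated_secs(total_iters: u64, per_iter_ms: f64) -> f64 {
    total_iters as f64 * per_iter_ms / 1000.0
}
```

With the reported 1.8736 ms per iteration, `estimated_secs(linear_total_iters(100, 1), 1.8736)` gives about 9.5 s, close to the 9.7 s suggested for `window_empty_over with aggregate functions`. With 61.728 ms per iteration, `estimated_secs(flat_total_iters(100, 1), 61.728)` gives about 6.2 s, close to the 6.1 s suggested for `window_partition_by_order_by with aggregate functions`. That warning does not offer flat sampling as an option, which suggests Criterion had already switched to it for that benchmark.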

Are there any user-facing changes?

@codecov-commenter

Codecov Report

Merging #564 (573d7b5) into master (e3e7e29) will decrease coverage by 0.03%.
The diff coverage is 81.01%.

@@            Coverage Diff             @@
##           master     #564      +/-   ##
==========================================
- Coverage   76.12%   76.08%   -0.04%     
==========================================
  Files         156      156              
  Lines       27074    27121      +47     
==========================================
+ Hits        20609    20635      +26     
- Misses       6465     6486      +21     
| Impacted Files | Coverage Δ |
|---|---|
| datafusion/src/physical_plan/window_functions.rs | 86.42% <ø> (+0.71%) ⬆️ |
| datafusion/src/sql/planner.rs | 84.75% <ø> (ø) |
| datafusion/src/physical_plan/planner.rs | 79.84% <33.33%> (+2.30%) ⬆️ |
| datafusion/src/physical_plan/mod.rs | 80.00% <70.96%> (+0.90%) ⬆️ |
| datafusion/src/physical_plan/windows.rs | 82.59% <75.00%> (-3.88%) ⬇️ |
| ...afusion/src/physical_plan/expressions/nth_value.rs | 79.41% <75.67%> (-11.07%) ⬇️ |
| datafusion/src/execution/context.rs | 92.13% <100.00%> (+0.13%) ⬆️ |
| ...fusion/src/physical_plan/expressions/row_number.rs | 94.28% <100.00%> (+13.03%) ⬆️ |
| datafusion/src/physical_plan/hash_aggregate.rs | 86.54% <100.00%> (ø) |
| datafusion/src/scalar.rs | 56.19% <100.00%> (ø) |

... and 8 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e3e7e29...573d7b5.

@jimexist
Member Author

updated benchmark:

Benchmarking window empty over, aggregate functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.5s, enable flat sampling, or reduce sample count to 50.
window empty over, aggregate functions
                        time:   [1.8108 ms 1.8229 ms 1.8366 ms]
Found 9 outliers among 100 measurements (9.00%)
  6 (6.00%) high mild
  3 (3.00%) high severe

Benchmarking window empty over, built-in functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.1s, enable flat sampling, or reduce sample count to 50.
window empty over, built-in functions
                        time:   [1.3960 ms 1.4046 ms 1.4148 ms]
Found 11 outliers among 100 measurements (11.00%)
  6 (6.00%) high mild
  5 (5.00%) high severe

window order by, aggregate functions
                        time:   [8.4081 ms 8.5143 ms 8.6268 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

window order by, built-in functions
                        time:   [7.4327 ms 7.4752 ms 7.5203 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

window partition by, u64_wide, aggregate functions
                        time:   [17.877 ms 17.982 ms 18.094 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

window partition by, u64_narrow, aggregate functions
                        time:   [6.4812 ms 6.5318 ms 6.5843 ms]
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

window partition by, u64_wide, built-in functions
                        time:   [16.144 ms 16.248 ms 16.360 ms]
Found 9 outliers among 100 measurements (9.00%)
  6 (6.00%) high mild
  3 (3.00%) high severe

window partition by, u64_narrow, built-in functions
                        time:   [5.8226 ms 5.8595 ms 5.8988 ms]
Found 11 outliers among 100 measurements (11.00%)
  11 (11.00%) high mild

window partition and order by, u64_wide, aggregate functions
                        time:   [25.350 ms 25.578 ms 25.824 ms]
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

Benchmarking window partition and order by, u64_narrow, aggregate functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.4s, or reduce sample count to 90.
window partition and order by, u64_narrow, aggregate functions
                        time:   [53.310 ms 53.726 ms 54.169 ms]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

window partition and order by, u64_wide, built-in functions
                        time:   [23.481 ms 23.690 ms 23.917 ms]
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

window partition and order by, u64_narrow, built-in functions
                        time:   [15.603 ms 15.715 ms 15.839 ms]
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe
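
The `u64_wide` and `u64_narrow` column names suggest a high-cardinality and a low-cardinality partition key, though the conversation does not state the actual value ranges. The sketch below is a standalone illustration of that assumption, not the benchmark's data generator: `gen_keys`, `partition_count`, the ranges, and the small LCG are all hypothetical, chosen so the example needs no external crates.

```rust
use std::collections::HashSet;

/// Tiny deterministic linear congruential generator (hypothetical helper).
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        // Constants from Knuth's MMIX LCG.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Keep the high bits, which are better distributed than the low bits.
        self.0 >> 33
    }
}

/// Generate `rows` keys in `0..range`, standing in for a benchmark column.
fn gen_keys(rows: usize, range: u64, seed: u64) -> Vec<u64> {
    let mut rng = Lcg(seed);
    (0..rows).map(|_| rng.next_u64() % range).collect()
}

/// Number of distinct keys, i.e. how many partitions PARTITION BY would create.
fn partition_count(keys: &[u64]) -> usize {
    keys.iter().collect::<HashSet<_>>().len()
}
```

Calling `partition_count(&gen_keys(10_000, 10, 42))` yields at most 10 partitions, while a range of 1,000,000 yields thousands of small partitions. The same row count therefore produces a very different number and size of partitions, which is one plausible reason the wide and narrow timings above diverge.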

@jimexist
Member Author

@Dandandan @alamb this benchmark is now updated and rebased

@jimexist
Member Author

Updated to 1 million samples, 8k batch size:

window empty over, aggregate functions
                        time:   [32.030 ms 32.251 ms 32.498 ms]
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  4 (4.00%) high severe

window empty over, built-in functions
                        time:   [25.202 ms 25.453 ms 25.716 ms]
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  4 (4.00%) high severe

Benchmarking window order by, aggregate functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 17.1s, or reduce sample count to 20.
window order by, aggregate functions
                        time:   [168.14 ms 169.41 ms 170.74 ms]
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

Benchmarking window order by, built-in functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 15.5s, or reduce sample count to 30.
window order by, built-in functions
                        time:   [153.69 ms 155.38 ms 157.13 ms]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

Benchmarking window partition by, u64_wide, aggregate functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 25.5s, or reduce sample count to 10.
window partition by, u64_wide, aggregate functions
                        time:   [249.29 ms 251.94 ms 254.74 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Benchmarking window partition by, u64_narrow, aggregate functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 12.8s, or reduce sample count to 30.
window partition by, u64_narrow, aggregate functions
                        time:   [136.66 ms 138.32 ms 140.00 ms]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild

Benchmarking window partition by, u64_wide, built-in functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 23.0s, or reduce sample count to 20.
window partition by, u64_wide, built-in functions
                        time:   [212.94 ms 214.92 ms 217.02 ms]
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

Benchmarking window partition by, u64_narrow, built-in functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 12.0s, or reduce sample count to 40.
window partition by, u64_narrow, built-in functions
                        time:   [114.67 ms 115.70 ms 116.78 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

Benchmarking window partition and order by, u64_wide, aggregate functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 42.3s, or reduce sample count to 10.
window partition and order by, u64_wide, aggregate functions
                        time:   [424.65 ms 427.99 ms 431.40 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Benchmarking window partition and order by, u64_narrow, aggregate functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 62.7s, or reduce sample count to 10.
window partition and order by, u64_narrow, aggregate functions
                        time:   [603.14 ms 607.66 ms 612.26 ms]

Benchmarking window partition and order by, u64_wide, built-in functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 38.5s, or reduce sample count to 10.
window partition and order by, u64_wide, built-in functions
                        time:   [381.45 ms 384.54 ms 387.73 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

Benchmarking window partition and order by, u64_narrow, built-in functions: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 39.2s, or reduce sample count to 10.
window partition and order by, u64_narrow, built-in functions
                        time:   [382.72 ms 385.97 ms 389.31 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
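
Comparing these runs by eye is error-prone, so a small parser for the `time: [low estimate high]` lines can help. The sketch below is a hypothetical helper, not part of Criterion. It assumes the line layout shown in the logs above and the unit suffixes `ns`, `us`/`µs`, `ms`, and `s`, and it normalises everything to milliseconds.

```rust
/// Parse a Criterion summary line such as
/// `time:   [1.8108 ms 1.8229 ms 1.8366 ms]`
/// into (low, estimate, high), all converted to milliseconds.
/// Returns None for lines that do not match this layout.
fn parse_time_line(line: &str) -> Option<(f64, f64, f64)> {
    let start = line.find('[')? + 1;
    let end = line.find(']')?;
    // `get` returns None if the brackets are out of order.
    let inner = line.get(start..end)?;
    let tokens: Vec<&str> = inner.split_whitespace().collect();
    // Expect exactly three (value, unit) pairs.
    if tokens.len() != 6 {
        return None;
    }
    let mut out = [0.0f64; 3];
    for (i, pair) in tokens.chunks(2).enumerate() {
        let value: f64 = pair[0].parse().ok()?;
        let scale = match pair[1] {
            "ns" => 1e-6,
            "us" | "µs" => 1e-3,
            "ms" => 1.0,
            "s" => 1e3,
            _ => return None,
        };
        out[i] = value * scale;
    }
    Some((out[0], out[1], out[2]))
}
```

Dividing two parsed estimates gives a relative cost. For example, the `u64_narrow` aggregate case for partition and order by (607.66 ms) is roughly 1.6 times the built-in case (385.97 ms) in this run.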

Contributor

@Dandandan Dandandan left a comment

Nice work @jimexist

@Dandandan Dandandan merged commit 0bf1c09 into apache:master Jun 22, 2021
@jimexist jimexist deleted the add-window-benches branch June 22, 2021 10:10
@houqp houqp added the datafusion Changes in the datafusion crate label Jul 30, 2021

Successfully merging this pull request may close these issues.

Add meaningful and realistic benchmark suites for window functions
4 participants