Round robin polling between tied winners in sort preserving merge #13133

jayzhan211 · 2024-10-27T00:39:27Z

Which issue does this PR close?

Rationale for this change

To address the issue of unbalanced polling between partitions due to tie-breakers being based on partition index, especially in cases of low cardinality, we are making changes to the winner selection mechanism. Previously, partitions with smaller indices were consistently chosen as the winners, leading to an uneven distribution of polling. This caused upstream operator buffers for the other partitions to grow excessively, as they continued receiving data without consuming it.

For example, an upstream operator like a repartition execution would keep sending data to certain partitions, but those partitions wouldn't consume the data if they weren't selected as winners. This resulted in inefficient buffer usage.

To resolve this, we are modifying the tie-breaking logic. Instead of always choosing the partition with the smallest index, we now select the partition that has the fewest poll counts for the same value. This ensures that multiple partitions with the same value are chosen equally, distributing the polling load in a round-robin fashion. This approach balances the workload more effectively across partitions and avoids excessive buffer growth.

What changes are included in this PR?

Are these changes tested?

Test in physical-plan/src/sorts/sort_preserving_merge.rs

test_round_robin_tie_breaker_success and test_round_robin_tie_breaker_fail ensure the change reduces the memory usage

Benmark

Existing benchmark

Comparing main and rrt-spm-upstream
--------------------
Benchmark clickbench_1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       main ┃ rrt-spm-upstream ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     0.52ms │           0.43ms │ +1.21x faster │
│ QQuery 1     │    46.47ms │          49.79ms │  1.07x slower │
│ QQuery 2     │    78.22ms │          86.23ms │  1.10x slower │
│ QQuery 3     │    66.38ms │          64.82ms │     no change │
│ QQuery 4     │   432.66ms │         388.82ms │ +1.11x faster │
│ QQuery 5     │   707.97ms │         656.67ms │ +1.08x faster │
│ QQuery 6     │    42.25ms │          38.61ms │ +1.09x faster │
│ QQuery 7     │    48.76ms │          42.13ms │ +1.16x faster │
│ QQuery 8     │   641.11ms │         626.13ms │     no change │
│ QQuery 9     │   715.40ms │         667.14ms │ +1.07x faster │
│ QQuery 10    │   200.67ms │         197.61ms │     no change │
│ QQuery 11    │   224.13ms │         216.96ms │     no change │
│ QQuery 12    │   765.01ms │         710.16ms │ +1.08x faster │
│ QQuery 13    │   918.07ms │         896.05ms │     no change │
│ QQuery 14    │   878.00ms │         829.02ms │ +1.06x faster │
│ QQuery 15    │   539.06ms │         480.04ms │ +1.12x faster │
│ QQuery 16    │  1350.69ms │        1279.84ms │ +1.06x faster │
│ QQuery 17    │  1213.09ms │        1179.69ms │     no change │
│ QQuery 18    │  3323.95ms │        3398.52ms │     no change │
│ QQuery 19    │    60.29ms │          55.27ms │ +1.09x faster │
│ QQuery 20    │   925.11ms │         903.01ms │     no change │
│ QQuery 21    │  1234.20ms │        1211.11ms │     no change │
│ QQuery 22    │  3209.06ms │        3193.20ms │     no change │
│ QQuery 23    │  8038.01ms │        7709.54ms │     no change │
│ QQuery 24    │   475.28ms │         482.18ms │     no change │
│ QQuery 25    │   484.49ms │         476.59ms │     no change │
│ QQuery 26    │   551.48ms │         532.30ms │     no change │
│ QQuery 27    │  1351.36ms │        1319.88ms │     no change │
│ QQuery 28    │ 10177.53ms │        9907.73ms │     no change │
│ QQuery 29    │   384.91ms │         392.41ms │     no change │
│ QQuery 30    │   737.95ms │         705.79ms │     no change │
│ QQuery 31    │   666.20ms │         668.27ms │     no change │
│ QQuery 32    │  3339.67ms │        3162.54ms │ +1.06x faster │
│ QQuery 33    │  4883.20ms │        4816.38ms │     no change │
│ QQuery 34    │  4190.99ms │        3504.05ms │ +1.20x faster │
│ QQuery 35    │  1095.79ms │         983.92ms │ +1.11x faster │
│ QQuery 36    │   149.27ms │         144.32ms │     no change │
│ QQuery 37    │   107.31ms │         100.31ms │ +1.07x faster │
│ QQuery 38    │   110.00ms │         107.75ms │     no change │
│ QQuery 39    │   317.82ms │         312.96ms │     no change │
│ QQuery 40    │    35.34ms │          37.91ms │  1.07x slower │
│ QQuery 41    │    33.35ms │          32.44ms │     no change │
│ QQuery 42    │    39.62ms │          38.48ms │     no change │
└──────────────┴────────────┴──────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary               ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main)               │ 54790.60ms │
│ Total Time (rrt-spm-upstream)   │ 52607.01ms │
│ Average Time (main)             │  1274.20ms │
│ Average Time (rrt-spm-upstream) │  1223.42ms │
│ Queries Faster                  │         15 │
│ Queries Slower                  │          3 │
│ Queries with No Change          │         25 │
└─────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      main ┃ rrt-spm-upstream ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │    1.40ms │           1.27ms │ +1.10x faster │
│ QQuery 1     │   33.72ms │          30.21ms │ +1.12x faster │
│ QQuery 2     │   73.83ms │          75.51ms │     no change │
│ QQuery 3     │   55.09ms │          59.58ms │  1.08x slower │
│ QQuery 4     │  393.95ms │         379.21ms │     no change │
│ QQuery 5     │  445.54ms │         454.98ms │     no change │
│ QQuery 6     │   27.19ms │          26.65ms │     no change │
│ QQuery 7     │   31.08ms │          30.76ms │     no change │
│ QQuery 8     │  620.59ms │         623.24ms │     no change │
│ QQuery 9     │  660.40ms │         647.26ms │     no change │
│ QQuery 10    │  174.50ms │         175.96ms │     no change │
│ QQuery 11    │  204.17ms │         208.39ms │     no change │
│ QQuery 12    │  503.89ms │         498.32ms │     no change │
│ QQuery 13    │  761.24ms │         753.65ms │     no change │
│ QQuery 14    │  606.85ms │         610.73ms │     no change │
│ QQuery 15    │  464.33ms │         459.33ms │     no change │
│ QQuery 16    │ 1198.99ms │        1229.86ms │     no change │
│ QQuery 17    │ 1095.33ms │        1137.24ms │     no change │
│ QQuery 18    │ 3024.95ms │        3283.14ms │  1.09x slower │
│ QQuery 19    │   42.05ms │          45.01ms │  1.07x slower │
│ QQuery 20    │ 1071.77ms │        1081.38ms │     no change │
│ QQuery 21    │ 1257.39ms │        1257.81ms │     no change │
│ QQuery 22    │ 3496.73ms │        3232.40ms │ +1.08x faster │
│ QQuery 23    │ 6624.41ms │        6532.45ms │     no change │
│ QQuery 24    │  343.09ms │         359.05ms │     no change │
│ QQuery 25    │  340.91ms │         310.57ms │ +1.10x faster │
│ QQuery 26    │  403.59ms │         383.72ms │     no change │
│ QQuery 27    │ 1518.98ms │        1400.43ms │ +1.08x faster │
│ QQuery 28    │ 9779.07ms │        9320.41ms │     no change │
│ QQuery 29    │  381.86ms │         427.38ms │  1.12x slower │
│ QQuery 30    │  641.28ms │         634.10ms │     no change │
│ QQuery 31    │  573.84ms │         567.99ms │     no change │
│ QQuery 32    │ 3489.73ms │        3112.20ms │ +1.12x faster │
│ QQuery 33    │ 5149.93ms │        3671.92ms │ +1.40x faster │
│ QQuery 34    │ 4633.02ms │        4101.02ms │ +1.13x faster │
│ QQuery 35    │ 1077.79ms │        1066.61ms │     no change │
│ QQuery 36    │  142.61ms │         129.47ms │ +1.10x faster │
│ QQuery 37    │   59.04ms │          58.39ms │     no change │
│ QQuery 38    │   89.33ms │          89.34ms │     no change │
│ QQuery 39    │  300.53ms │         295.00ms │     no change │
│ QQuery 40    │   27.87ms │          27.96ms │     no change │
│ QQuery 41    │   24.07ms │          24.34ms │     no change │
│ QQuery 42    │   32.99ms │          31.84ms │     no change │
└──────────────┴───────────┴──────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary               ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main)               │ 51878.95ms │
│ Total Time (rrt-spm-upstream)   │ 48846.07ms │
│ Average Time (main)             │  1206.49ms │
│ Average Time (rrt-spm-upstream) │  1135.96ms │
│ Queries Faster                  │          9 │
│ Queries Slower                  │          4 │
│ Queries with No Change          │         30 │
└─────────────────────────────────┴────────────┘
Note: Skipping /Users/bytedance/arrow-datafusion/benchmarks/results/main/imdb.json as /Users/bytedance/arrow-datafusion/benchmarks/results/rrt-spm-upstream/imdb.json does not exist
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     main ┃ rrt-spm-upstream ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  67.97ms │          69.60ms │     no change │
│ QQuery 2     │  13.89ms │          14.24ms │     no change │
│ QQuery 3     │  21.74ms │          21.15ms │     no change │
│ QQuery 4     │  16.72ms │          15.28ms │ +1.09x faster │
│ QQuery 5     │  36.64ms │          32.53ms │ +1.13x faster │
│ QQuery 6     │   5.30ms │           4.46ms │ +1.19x faster │
│ QQuery 7     │  69.24ms │          56.51ms │ +1.23x faster │
│ QQuery 8     │  16.79ms │          15.79ms │ +1.06x faster │
│ QQuery 9     │  35.87ms │          33.51ms │ +1.07x faster │
│ QQuery 10    │  38.64ms │          37.40ms │     no change │
│ QQuery 11    │   5.92ms │           5.45ms │ +1.09x faster │
│ QQuery 12    │  21.70ms │          22.56ms │     no change │
│ QQuery 13    │  15.34ms │          14.99ms │     no change │
│ QQuery 14    │   7.61ms │           7.07ms │ +1.08x faster │
│ QQuery 15    │  11.56ms │          11.09ms │     no change │
│ QQuery 16    │  14.41ms │          13.88ms │     no change │
│ QQuery 17    │  48.13ms │          44.84ms │ +1.07x faster │
│ QQuery 18    │ 116.31ms │         102.80ms │ +1.13x faster │
│ QQuery 19    │  19.98ms │          18.92ms │ +1.06x faster │
│ QQuery 20    │  21.47ms │          20.01ms │ +1.07x faster │
│ QQuery 21    │  80.11ms │          71.54ms │ +1.12x faster │
│ QQuery 22    │  14.28ms │          13.46ms │ +1.06x faster │
└──────────────┴──────────┴──────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary               ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (main)               │ 699.60ms │
│ Total Time (rrt-spm-upstream)   │ 647.07ms │
│ Average Time (main)             │  31.80ms │
│ Average Time (rrt-spm-upstream) │  29.41ms │
│ Queries Faster                  │       14 │
│ Queries Slower                  │        0 │
│ Queries with No Change          │        8 │
└─────────────────────────────────┴──────────┘
Note: Skipping /Users/bytedance/arrow-datafusion/benchmarks/results/main/tpch_mem_sf10.json as /Users/bytedance/arrow-datafusion/benchmarks/results/rrt-spm-upstream/tpch_mem_sf10.json does not exist
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     main ┃ rrt-spm-upstream ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  95.44ms │         109.33ms │  1.15x slower │
│ QQuery 2     │  27.33ms │          26.26ms │     no change │
│ QQuery 3     │  42.81ms │          43.71ms │     no change │
│ QQuery 4     │  27.25ms │          26.30ms │     no change │
│ QQuery 5     │  57.48ms │          55.18ms │     no change │
│ QQuery 6     │  18.26ms │          18.05ms │     no change │
│ QQuery 7     │  72.86ms │          69.96ms │     no change │
│ QQuery 8     │  56.06ms │          54.84ms │     no change │
│ QQuery 9     │  70.99ms │          66.08ms │ +1.07x faster │
│ QQuery 10    │  69.62ms │          63.32ms │ +1.10x faster │
│ QQuery 11    │  20.89ms │          19.07ms │ +1.10x faster │
│ QQuery 12    │  52.16ms │          45.78ms │ +1.14x faster │
│ QQuery 13    │  41.33ms │          34.95ms │ +1.18x faster │
│ QQuery 14    │  29.83ms │          29.15ms │     no change │
│ QQuery 15    │  48.01ms │          44.87ms │ +1.07x faster │
│ QQuery 16    │  21.04ms │          19.35ms │ +1.09x faster │
│ QQuery 17    │  82.28ms │          72.71ms │ +1.13x faster │
│ QQuery 18    │ 118.77ms │         104.85ms │ +1.13x faster │
│ QQuery 19    │  55.26ms │          50.51ms │ +1.09x faster │
│ QQuery 20    │  42.10ms │          41.71ms │     no change │
│ QQuery 21    │  88.76ms │          84.89ms │     no change │
│ QQuery 22    │  17.43ms │          17.78ms │     no change │
└──────────────┴──────────┴──────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary               ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (main)               │ 1155.97ms │
│ Total Time (rrt-spm-upstream)   │ 1098.66ms │
│ Average Time (main)             │   52.54ms │
│ Average Time (rrt-spm-upstream) │   49.94ms │
│ Queries Faster                  │        10 │
│ Queries Slower                  │         1 │
│ Queries with No Change          │        11 │
└─────────────────────────────────┴───────────┘
--------------------
Benchmark tpch_sf10.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      main ┃ rrt-spm-upstream ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  720.38ms │         713.40ms │     no change │
│ QQuery 2     │  143.36ms │         123.34ms │ +1.16x faster │
│ QQuery 3     │  391.19ms │         360.93ms │ +1.08x faster │
│ QQuery 4     │  201.77ms │         202.13ms │     no change │
│ QQuery 5     │  559.81ms │         538.89ms │     no change │
│ QQuery 6     │  135.95ms │         124.48ms │ +1.09x faster │
│ QQuery 7     │  808.27ms │         770.86ms │     no change │
│ QQuery 8     │  631.28ms │         580.88ms │ +1.09x faster │
│ QQuery 9     │ 1062.98ms │         941.08ms │ +1.13x faster │
│ QQuery 10    │  625.48ms │         604.40ms │     no change │
│ QQuery 11    │   85.16ms │          88.97ms │     no change │
│ QQuery 12    │  361.73ms │         325.75ms │ +1.11x faster │
│ QQuery 13    │  428.71ms │         398.42ms │ +1.08x faster │
│ QQuery 14    │  244.17ms │         220.02ms │ +1.11x faster │
│ QQuery 15    │  341.46ms │         328.12ms │     no change │
│ QQuery 16    │  108.79ms │         103.85ms │     no change │
│ QQuery 17    │  989.85ms │         884.71ms │ +1.12x faster │
│ QQuery 18    │ 1502.95ms │        1290.73ms │ +1.16x faster │
│ QQuery 19    │  466.98ms │         412.42ms │ +1.13x faster │
│ QQuery 20    │  395.18ms │         360.38ms │ +1.10x faster │
│ QQuery 21    │ 1046.22ms │         999.99ms │     no change │
│ QQuery 22    │  122.09ms │         118.41ms │     no change │
└──────────────┴───────────┴──────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary               ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main)               │ 11373.77ms │
│ Total Time (rrt-spm-upstream)   │ 10492.17ms │
│ Average Time (main)             │   516.99ms │
│ Average Time (rrt-spm-upstream) │   476.92ms │
│ Queries Faster                  │         12 │
│ Queries Slower                  │          0 │
│ Queries with No Change          │         10 │
└─────────────────────────────────┴────────────┘

SPM benchmark

low_card_without_tiebreaker_batch_count_1_partition_count_2
                        time:   [13.366 µs 13.434 µs 13.509 µs]
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

low_card_without_tiebreaker_batch_count_1_partition_count_8
                        time:   [61.584 µs 61.669 µs 61.766 µs]
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

low_card_without_tiebreaker_batch_count_1_partition_count_32
                        time:   [304.05 µs 308.92 µs 314.73 µs]
Found 16 outliers among 100 measurements (16.00%)
  6 (6.00%) high mild
  10 (10.00%) high severe

low_card_without_tiebreaker_batch_count_25_partition_count_2
                        time:   [312.28 µs 319.79 µs 329.16 µs]
Found 16 outliers among 100 measurements (16.00%)
  7 (7.00%) high mild
  9 (9.00%) high severe

low_card_without_tiebreaker_batch_count_25_partition_count_8
                        time:   [1.5046 ms 1.5100 ms 1.5162 ms]
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

low_card_without_tiebreaker_batch_count_25_partition_count_32
                        time:   [7.4279 ms 7.4460 ms 7.4651 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

low_card_without_tiebreaker_batch_count_625_partition_count_2
                        time:   [7.8036 ms 7.8623 ms 7.9302 ms]
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

low_card_without_tiebreaker_batch_count_625_partition_count_8
                        time:   [39.093 ms 39.248 ms 39.429 ms]
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

low_card_without_tiebreaker_batch_count_625_partition_count_32
                        time:   [191.00 ms 191.44 ms 191.92 ms]
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

low_card_with_tiebreaker_batch_count_1_partition_count_2
                        time:   [16.404 µs 16.490 µs 16.593 µs]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

low_card_with_tiebreaker_batch_count_1_partition_count_8
                        time:   [68.279 µs 68.755 µs 69.363 µs]
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) high mild
  6 (6.00%) high severe

low_card_with_tiebreaker_batch_count_1_partition_count_32
                        time:   [330.79 µs 331.68 µs 332.60 µs]
Found 12 outliers among 100 measurements (12.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  8 (8.00%) high severe

low_card_with_tiebreaker_batch_count_25_partition_count_2
                        time:   [387.35 µs 391.46 µs 397.10 µs]
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

low_card_with_tiebreaker_batch_count_25_partition_count_8
                        time:   [1.6955 ms 1.7088 ms 1.7241 ms]
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

low_card_with_tiebreaker_batch_count_25_partition_count_32
                        time:   [7.9670 ms 7.9976 ms 8.0309 ms]
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

low_card_with_tiebreaker_batch_count_625_partition_count_2
                        time:   [9.6862 ms 9.7693 ms 9.8785 ms]
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

low_card_with_tiebreaker_batch_count_625_partition_count_8
                        time:   [43.646 ms 44.099 ms 44.713 ms]
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

low_card_with_tiebreaker_batch_count_625_partition_count_32
                        time:   [203.61 ms 204.11 ms 204.66 ms]
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

high_card_without_tiebreaker_batch_count_1_partition_count_2
                        time:   [14.491 µs 14.522 µs 14.559 µs]
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

high_card_without_tiebreaker_batch_count_1_partition_count_8
                        time:   [70.368 µs 71.419 µs 72.912 µs]
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

high_card_without_tiebreaker_batch_count_1_partition_count_32
                        time:   [385.86 µs 390.40 µs 395.69 µs]
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

high_card_without_tiebreaker_batch_count_25_partition_count_2
                        time:   [319.50 µs 324.18 µs 330.14 µs]
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

high_card_without_tiebreaker_batch_count_25_partition_count_8
                        time:   [1.5848 ms 1.6063 ms 1.6340 ms]
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe

high_card_without_tiebreaker_batch_count_25_partition_count_32
                        time:   [7.6091 ms 7.6281 ms 7.6474 ms]

high_card_without_tiebreaker_batch_count_625_partition_count_2
                        time:   [7.8836 ms 7.9195 ms 7.9607 ms]
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe

high_card_without_tiebreaker_batch_count_625_partition_count_8
                        time:   [39.740 ms 39.938 ms 40.212 ms]
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

high_card_without_tiebreaker_batch_count_625_partition_count_32
                        time:   [193.04 ms 193.58 ms 194.20 ms]
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

high_card_with_tiebreaker_batch_count_1_partition_count_2
                        time:   [16.543 µs 16.740 µs 17.011 µs]
Found 12 outliers among 100 measurements (12.00%)
  9 (9.00%) high mild
  3 (3.00%) high severe

high_card_with_tiebreaker_batch_count_1_partition_count_8
                        time:   [76.672 µs 77.095 µs 77.590 µs]
Found 10 outliers among 100 measurements (10.00%)
  7 (7.00%) high mild
  3 (3.00%) high severe

high_card_with_tiebreaker_batch_count_1_partition_count_32
                        time:   [390.44 µs 391.35 µs 392.40 µs]
Found 7 outliers among 100 measurements (7.00%)
  7 (7.00%) high mild

high_card_with_tiebreaker_batch_count_25_partition_count_2
                        time:   [358.82 µs 369.85 µs 382.69 µs]
Found 13 outliers among 100 measurements (13.00%)
  1 (1.00%) high mild
  12 (12.00%) high severe

high_card_with_tiebreaker_batch_count_25_partition_count_8
                        time:   [1.6513 ms 1.6563 ms 1.6626 ms]
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

high_card_with_tiebreaker_batch_count_25_partition_count_32
                        time:   [7.9372 ms 7.9543 ms 7.9725 ms]
Found 7 outliers among 100 measurements (7.00%)
  7 (7.00%) high mild

high_card_with_tiebreaker_batch_count_625_partition_count_2
                        time:   [8.5996 ms 8.6271 ms 8.6556 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

high_card_with_tiebreaker_batch_count_625_partition_count_8
                        time:   [43.480 ms 44.112 ms 44.990 ms]
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) high mild
  6 (6.00%) high severe

high_card_with_tiebreaker_batch_count_625_partition_count_32
                        time:   [202.68 ms 203.14 ms 203.64 ms]
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

Summary

Performance

I run the benchmark couple of times since it varies each time, I find there is clear improvement for some query like clickbench q32 & q34. I think it only shows difference in larger dataset and extreme cases.

Memory usage

The memory issue is what we focus on and we also observe less memory usage. 👍

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

…spm-upstream

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

jayzhan211 · 2024-10-30T00:10:51Z

If this PR turns disables stable sort, maybe we can turn this into an optional behavior defined by configuration (so users can choose between the stable sort and better performance/memory use) 🤔

I revert the configuration change, so now if you prefer the old behaviour you can set to false with with_round_robin_repartition. I think tie breaker is much more "stable" so it is enabled by default

2010YOUY01 · 2024-10-30T06:24:58Z

datafusion/physical-plan/src/sorts/sort_preserving_merge.rs

    use tokio::time::timeout;

+    fn generate_task_ctx_for_round_robin_tie_breaker() -> Result<Arc<TaskContext>> {
+        let mut pool_per_consumer = HashMap::new();


(Previous conversation for the context https://github.com/apache/datafusion/pull/13133/files#r1818369721)

I tried to remove the per-operator memory limit, and set the total limit to 20M, then rerun two unit tests in this file.

let mut pool_per_consumer = HashMap::new(); // Bytes from 660_000 to 30_000_000 (or even more) are all valid limits // pool_per_consumer.insert("RepartitionExec[0]".to_string(), 10_000_000); // pool_per_consumer.insert("RepartitionExec[1]".to_string(), 10_000_000); let runtime = RuntimeEnvBuilder::new() // Random large number for total mem limit, we only care about RepartitionExec only .with_memory_limit_per_consumer(20_000_000, 1.0, pool_per_consumer) .build_arc()?;

test_round_robin_tie_breaker_fail and test_round_robin_tie_breaker_success all passed, the error message for fail case is: Expected error: ResourcesExhausted("Additional allocation failed with top memory consumers (across reservations) as: RepartitionExec[1] consumed 18206496 bytes, SortPreservingMergeExec[0] consumed 1446684 bytes, RepartitionExec[0] consumed 216744 bytes. Error: Failed to allocate additional 216744 bytes for SortPreservingMergeExec[0] with 433488 bytes already allocated for this reservation - 130076 bytes remain available for the total pool")
I think this error message is expected: SPM only reserved constant memory, and RepartitionExec's memory consumption indeed keeps growing.
If that's the case memory pool related changes are not necessary any more, I'd prefer to remove them from this PR.
I think setting per-consumer memory limits by name makes the API less user-friendly and increases the risk of implementation issues with future changes. Perhaps we can explore a better solution later if a more suitable use case arises.

Looks great, thank you. I found it quite hard to find the number so I ends up to measure specific Exec

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

alamb · 2024-10-30T11:07:36Z

If the definition of stable merge is ALWAYS get the first partition regardless of the number (it takes 3 if there are 3 in stream 1) then this PR doesn't follow the definition.

Makes sense -- I will make a PR to clarify the behavior

But I believe this change is much more "stable", each stream is polled evenly, so I would like to set this as default.

I agree it makes sense to use this as the default policy

jayzhan211 · 2024-10-30T11:44:46Z

It seems most of the issue are addressed. I plan to merge this today 🚀

berkaysynnada · 2024-10-30T12:01:20Z

I am taking a final look now

berkaysynnada

I think we are all agreed on the final state—thanks again @jayzhan211! I have just one minor issue with a comment and a potential improvement in the test.

I was considering whether this commit might bring us too close to run on the edge points of the test, but I concluded it shouldn’t be a problem, as our test runs at both ends of the polling spectrum.

cc @ozankabak

berkaysynnada · 2024-10-30T13:37:42Z

datafusion/physical-plan/src/sorts/merge.rs

+            // If the winner doesn't survive in the final match, it means the value has changed.
+            // The polls count are outdated (because the value advanced) but not yet cleaned-up at this point.
+            // Given the value is equal, we choose the smaller index if the value is the same.
+            self.update_winner(cmp_node, winner, challenger);


@jayzhan211 This comment confused me. This line is executed when the challenger index is smaller and poll counts are equal. Does it also pass through here when winner does not survive because its value has changed?

This line is executed when the challenger index is smaller and poll counts are equal

Poll counts may not be the same, when we have new winner, it should have a different value than the previous tie breaker round. If the value is the same, we will find that the winner survives at the final match, then we need to check the poll counts to select the one.

When we reach the code here, the new winner has the same value with the challenger, but it has different value than the original winner (self.loser_tree[0]). In this case, we just need to compare with the index since
this should be the new round of the tie breaker, polls count doesn't change the result.

berkaysynnada · 2024-10-30T13:46:43Z

datafusion/physical-plan/benches/spm.rs

+use criterion::async_executor::FuturesExecutor;
+use criterion::{black_box, criterion_group, criterion_main, Criterion};
+
+fn generate_spm_for_round_robin_tie_breaker(


I have suggested to @jayzhan211 to take the first steps in creating operator-specific benchmarks. I believe there's already a goal for this (I recall an older issue related to it). Perhaps we should extract these benchmarks from core and port them here @alamb ?

berkaysynnada · 2024-10-30T13:50:59Z

datafusion/physical-plan/src/sorts/sort_preserving_merge.rs

+    }
+    fn generate_spm_for_round_robin_tie_breaker() -> Result<Arc<SortPreservingMergeExec>>
+    {
+        let target_batch_size = 12500;


These numbers and the memory limit in these tests are actually correlated constants and can’t be adjusted independently. Could we encapsulate or link them somehow to emphasize this dependency @jayzhan211 ?

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

alamb · 2024-10-30T16:16:50Z

I ran the benchmarks and have basically the same results

--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃  main_base ┃ rrt-spm-upstream ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.41ms │           2.28ms │ +1.06x faster │
│ QQuery 1     │    38.95ms │          42.24ms │  1.08x slower │
│ QQuery 2     │    96.08ms │          97.39ms │     no change │
│ QQuery 3     │   105.04ms │         104.65ms │     no change │
│ QQuery 4     │   894.57ms │         893.18ms │     no change │
│ QQuery 5     │   948.09ms │         935.75ms │     no change │
│ QQuery 6     │    38.00ms │          37.37ms │     no change │
│ QQuery 7     │    42.97ms │          46.96ms │  1.09x slower │
│ QQuery 8     │  1333.55ms │        1346.14ms │     no change │
│ QQuery 9     │  1302.16ms │        1308.23ms │     no change │
│ QQuery 10    │   425.87ms │         424.38ms │     no change │
│ QQuery 11    │   476.02ms │         477.82ms │     no change │
│ QQuery 12    │  1097.99ms │        1098.23ms │     no change │
│ QQuery 13    │  1552.98ms │        1581.55ms │     no change │
│ QQuery 14    │  1199.71ms │        1245.59ms │     no change │
│ QQuery 15    │  1062.34ms │        1070.81ms │     no change │
│ QQuery 16    │  2405.72ms │        2469.86ms │     no change │
│ QQuery 17    │  2272.56ms │        2295.08ms │     no change │
│ QQuery 18    │  4854.53ms │        4897.47ms │     no change │
│ QQuery 19    │    98.10ms │          96.60ms │     no change │
│ QQuery 20    │  1463.17ms │        1481.13ms │     no change │
│ QQuery 21    │  1768.26ms │        1804.28ms │     no change │
│ QQuery 22    │  3079.04ms │        3097.65ms │     no change │
│ QQuery 23    │ 10084.90ms │       10113.16ms │     no change │
│ QQuery 24    │   645.97ms │         646.15ms │     no change │
│ QQuery 25    │   520.81ms │         520.45ms │     no change │
│ QQuery 26    │   713.93ms │         720.45ms │     no change │
│ QQuery 27    │  2224.02ms │        2187.68ms │     no change │
│ QQuery 28    │ 14268.98ms │       14476.15ms │     no change │
│ QQuery 29    │   537.17ms │         541.06ms │     no change │
│ QQuery 30    │  1096.23ms │        1082.53ms │     no change │
│ QQuery 31    │  1156.69ms │        1125.99ms │     no change │
│ QQuery 32    │  4146.83ms │        4105.25ms │     no change │
│ QQuery 33    │  5051.50ms │        5041.39ms │     no change │
│ QQuery 34    │  5033.71ms │        4952.77ms │     no change │
│ QQuery 35    │  1839.17ms │        1881.73ms │     no change │
│ QQuery 36    │   264.60ms │         266.19ms │     no change │
│ QQuery 37    │   126.05ms │         126.40ms │     no change │
│ QQuery 38    │   146.96ms │         138.99ms │ +1.06x faster │
│ QQuery 39    │   735.03ms │         768.46ms │     no change │
│ QQuery 40    │    55.08ms │          61.00ms │  1.11x slower │
│ QQuery 41    │    51.00ms │          47.89ms │ +1.06x faster │
│ QQuery 42    │    64.71ms │          64.28ms │     no change │
└──────────────┴────────────┴──────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary               ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main_base)          │ 75321.43ms │
│ Total Time (rrt-spm-upstream)   │ 75722.58ms │
│ Average Time (main_base)        │  1751.66ms │
│ Average Time (rrt-spm-upstream) │  1760.99ms │
│ Queries Faster                  │          3 │
│ Queries Slower                  │          3 │
│ Queries with No Change          │         37 │
└─────────────────────────────────┴────────────┘

--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃ main_base ┃ rrt-spm-upstream ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  206.89ms │         212.38ms │     no change │
│ QQuery 2     │  119.47ms │         122.53ms │     no change │
│ QQuery 3     │  122.07ms │         122.63ms │     no change │
│ QQuery 4     │   87.31ms │          91.79ms │  1.05x slower │
│ QQuery 5     │  153.79ms │         160.10ms │     no change │
│ QQuery 6     │   55.37ms │          43.20ms │ +1.28x faster │
│ QQuery 7     │  212.52ms │         196.11ms │ +1.08x faster │
│ QQuery 8     │  157.39ms │         161.87ms │     no change │
│ QQuery 9     │  238.97ms │         235.64ms │     no change │
│ QQuery 10    │  228.70ms │         222.31ms │     no change │
│ QQuery 11    │   91.86ms │          91.11ms │     no change │
│ QQuery 12    │  131.02ms │         131.64ms │     no change │
│ QQuery 13    │  214.07ms │         217.36ms │     no change │
│ QQuery 14    │   91.43ms │          76.68ms │ +1.19x faster │
│ QQuery 15    │  113.76ms │         110.83ms │     no change │
│ QQuery 16    │   78.80ms │          78.34ms │     no change │
│ QQuery 17    │  219.77ms │         208.20ms │ +1.06x faster │
│ QQuery 18    │  324.28ms │         314.26ms │     no change │
│ QQuery 19    │  140.83ms │         129.11ms │ +1.09x faster │
│ QQuery 20    │  127.22ms │         134.17ms │  1.05x slower │
│ QQuery 21    │  280.80ms │         260.57ms │ +1.08x faster │
│ QQuery 22    │   67.67ms │          66.44ms │     no change │
└──────────────┴───────────┴──────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary               ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (main_base)          │ 3464.00ms │
│ Total Time (rrt-spm-upstream)   │ 3387.29ms │
│ Average Time (main_base)        │  157.45ms │
│ Average Time (rrt-spm-upstream) │  153.97ms │
│ Queries Faster                  │         6 │
│ Queries Slower                  │         2 │
│ Queries with No Change          │        14 │
└─────────────────────────────────┴───────────┘

alamb · 2024-10-30T16:18:34Z

I also verified that some of these queries that got faster actually use SortPreservingMerge which they do:

andrewlamb@Andrews-MacBook-Pro-2:~/Downloads$ datafusion-cli -f q38.sql
DataFusion CLI v42.1.0
+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| logical_plan  | Limit: skip=1000, fetch=10                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
|               |   Sort: pageviews DESC NULLS FIRST, fetch=1010                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
|               |     Projection: hits.URL, count(*) AS pageviews                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
|               |       Aggregate: groupBy=[[hits.URL]], aggr=[[count(Int64(1)) AS count(*)]]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
|               |         Projection: hits.URL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
|               |           Filter: hits.CounterID = Int32(62) AND CAST(CAST(hits.EventDate AS Int32) AS Date32) >= Date32("2013-07-01") AND CAST(CAST(hits.EventDate AS Int32) AS Date32) <= Date32("2013-07-31") AND hits.IsRefresh = Int16(0) AND hits.IsLink != Int16(0) AND hits.IsDownload = Int16(0)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|               |             TableScan: hits projection=[EventDate, CounterID, URL, IsRefresh, IsLink, IsDownload], partial_filters=[hits.CounterID = Int32(62), CAST(CAST(hits.EventDate AS Int32) AS Date32) >= Date32("2013-07-01"), CAST(CAST(hits.EventDate AS Int32) AS Date32) <= Date32("2013-07-31"), hits.IsRefresh = Int16(0), hits.IsLink != Int16(0), hits.IsDownload = Int16(0)]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| physical_plan | GlobalLimitExec: skip=1000, fetch=10                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|               |   SortPreservingMergeExec: [pageviews@1 DESC], fetch=1010                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|               |     SortExec: TopK(fetch=1010), expr=[pageviews@1 DESC], preserve_partitioning=[true]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
|               |       ProjectionExec: expr=[URL@0 as URL, count(*)@1 as pageviews]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|               |         AggregateExec: mode=FinalPartitioned, gby=[URL@0 as URL], aggr=[count(*)]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|               |           CoalesceBatchesExec: target_batch_size=8192                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
|               |             RepartitionExec: partitioning=Hash([URL@0], 16), input_partitions=16                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|               |               AggregateExec: mode=Partial, gby=[URL@0 as URL], aggr=[count(*)]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
|               |                 CoalesceBatchesExec: target_batch_size=8192                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
|               |                   FilterExec: CounterID@1 = 62 AND CAST(CAST(EventDate@0 AS Int32) AS Date32) >= 2013-07-01 AND CAST(CAST(EventDate@0 AS Int32) AS Date32) <= 2013-07-31 AND IsRefresh@3 = 0 AND IsLink@4 != 0 AND IsDownload@5 = 0, projection=[URL@2]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
|               |                     ParquetExec: file_groups={16 groups: [[Users/andrewlamb/Downloads/hits/hits_0.parquet:0..122446530, Users/andrewlamb/Downloads/hits/hits_1.parquet:0..174965044, Users/andrewlamb/Downloads/hits/hits_10.parquet:0..101513258, Users/andrewlamb/Downloads/hits/hits_11.parquet:0..118419888, Users/andrewlamb/Downloads/hits/hits_12.parquet:0..149514164, ...], [Users/andrewlamb/Downloads/hits/hits_14.parquet:108113265..151121699, Users/andrewlamb/Downloads/hits/hits_15.parquet:0..103098894, Users/andrewlamb/Downloads/hits/hits_16.parquet:0..101067219, Users/andrewlamb/Downloads/hits/hits_17.parquet:0..116867853, Users/andrewlamb/Downloads/hits/hits_18.parquet:0..133119589, ...], [Users/andrewlamb/Downloads/hits/hits_21.parquet:3887560..113455196, Users/andrewlamb/Downloads/hits/hits_22.parquet:0..79775901, Users/andrewlamb/Downloads/hits/hits_23.parquet:0..79631107, Users/andrewlamb/Downloads/hits/hits_24.parquet:0..78257049, Users/andrewlamb/Downloads/hits/hits_25.parquet:0..144169728, ...], [Users/andrewlamb/Downloads/hits/hits_28.parquet:106905624..162772407, Users/andrewlamb/Downloads/hits/hits_29.parquet:0..79213288, Users/andrewlamb/Downloads/hits/hits_3.parquet:0..192507052, Users/andrewlamb/Downloads/hits/hits_30.parquet:0..124187913, Users/andrewlamb/Downloads/hits/hits_31.parquet:0..123065410, ...], [Users/andrewlamb/Downloads/hits/hits_35.parquet:54087340..153632381, Users/andrewlamb/Downloads/hits/hits_36.parquet:0..92487304, Users/andrewlamb/Downloads/hits/hits_37.parquet:0..108247781, Users/andrewlamb/Downloads/hits/hits_38.parquet:0..132005180, Users/andrewlamb/Downloads/hits/hits_39.parquet:0..103522954, ...], ...]}, projection=[EventDate, CounterID, URL, IsRefresh, IsLink, IsDownload], predicate=CounterID@6 = 62 AND CAST(CAST(EventDate@5 AS Int32) AS Date32) >= 2013-07-01 AND CAST(CAST(EventDate@5 AS Int32) AS Date32) <= 2013-07-31 AND IsRefresh@15 = 0 AND IsLink@52 != 0 AND IsDownload@53 = 0, pruning_predicate=CASE WHEN CounterID_null_count@2 = CounterID_row_count@3 THEN false ELSE CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 END AND CASE WHEN IsRefresh_null_count@6 = IsRefresh_row_count@7 THEN false ELSE IsRefresh_min@4 <= 0 AND 0 <= IsRefresh_max@5 END AND CASE WHEN IsLink_null_count@10 = IsLink_row_count@11 THEN false ELSE IsLink_min@8 != 0 OR 0 != IsLink_max@9 END AND CASE WHEN IsDownload_null_count@14 = IsDownload_row_count@15 THEN false ELSE IsDownload_min@12 <= 0 AND 0 <= IsDownload_max@13 END, required_guarantees=[CounterID in (62), IsDownload in (0), IsLink not in (0), IsRefresh in (0)] |
|               |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 row(s) fetched.
Elapsed 0.090 seconds.

alamb · 2024-10-30T16:22:41Z

I am just running the sort benchmark too just to verify it didn't get slower

alamb · 2024-10-30T17:22:05Z

Not sure what to make of the sort benchmark results (looks like this PR would slow down in some cases) I will also try to run this to compare main/main to see how noisy it is:

++ critcmp main rrt-spm-upstream
group                                      main                                   rrt-spm-upstream
-----                                      ----                                   ----------------
merge sorted f64                           1.00      4.8±0.02ms        ? ?/sec    1.04      5.0±0.03ms        ? ?/sec
merge sorted i64                           1.00      4.4±0.03ms        ? ?/sec    1.08      4.8±0.02ms        ? ?/sec
merge sorted mixed dictionary tuple        1.00     12.4±0.11ms        ? ?/sec    1.06     13.2±0.07ms        ? ?/sec
merge sorted mixed tuple                   1.03     11.7±0.05ms        ? ?/sec    1.00     11.4±0.07ms        ? ?/sec
merge sorted utf8 dictionary               1.00      4.2±0.04ms        ? ?/sec    1.23      5.2±0.03ms        ? ?/sec
merge sorted utf8 dictionary tuple         1.00      6.1±0.06ms        ? ?/sec    1.23      7.5±0.05ms        ? ?/sec
merge sorted utf8 high cardinality         1.00      6.8±0.04ms        ? ?/sec    1.04      7.1±0.04ms        ? ?/sec
merge sorted utf8 low cardinality          1.00      4.4±0.02ms        ? ?/sec    1.15      5.0±0.04ms        ? ?/sec
merge sorted utf8 tuple                    1.06     12.6±0.08ms        ? ?/sec    1.00     11.8±0.06ms        ? ?/sec
sort f64                                   1.01      3.4±0.01ms        ? ?/sec    1.00      3.4±0.02ms        ? ?/sec
sort i64                                   1.00      2.9±0.01ms        ? ?/sec    1.10      3.2±0.01ms        ? ?/sec
sort merge f64                             1.00      4.8±0.03ms        ? ?/sec    1.04      5.0±0.03ms        ? ?/sec
sort merge i64                             1.00      4.5±0.03ms        ? ?/sec    1.09      4.9±0.02ms        ? ?/sec
sort merge mixed dictionary tuple          1.00     12.2±0.09ms        ? ?/sec    1.07     13.1±0.10ms        ? ?/sec
sort merge mixed tuple                     1.03     11.8±0.09ms        ? ?/sec    1.00     11.5±0.07ms        ? ?/sec
sort merge utf8 dictionary                 1.00      4.0±0.02ms        ? ?/sec    1.24      4.9±0.02ms        ? ?/sec
sort merge utf8 dictionary tuple           1.00      6.0±0.06ms        ? ?/sec    1.22      7.3±0.06ms        ? ?/sec
sort merge utf8 high cardinality           1.00      7.0±0.03ms        ? ?/sec    1.04      7.2±0.03ms        ? ?/sec
sort merge utf8 low cardinality            1.00      4.5±0.02ms        ? ?/sec    1.15      5.1±0.04ms        ? ?/sec
sort merge utf8 tuple                      1.07     13.2±0.13ms        ? ?/sec    1.00     12.3±0.10ms        ? ?/sec
sort mixed dictionary tuple                1.00     18.4±0.21ms        ? ?/sec    1.03     19.0±0.16ms        ? ?/sec
sort mixed tuple                           1.03     15.2±0.14ms        ? ?/sec    1.00     14.8±0.13ms        ? ?/sec
sort partitioned f64                       1.11  220.0±458.22µs        ? ?/sec    1.00  199.0±309.11µs        ? ?/sec
sort partitioned i64                       1.00  201.0±421.49µs        ? ?/sec    1.01  202.9±429.37µs        ? ?/sec
sort partitioned mixed dictionary tuple    1.00  679.8±346.63µs        ? ?/sec    1.04  704.1±262.66µs        ? ?/sec
sort partitioned mixed tuple               1.00  547.8±146.14µs        ? ?/sec    1.02  557.6±149.19µs        ? ?/sec
sort partitioned utf8 dictionary           1.00  194.8±250.40µs        ? ?/sec    1.09  211.6±418.05µs        ? ?/sec
sort partitioned utf8 dictionary tuple     1.00  601.5±248.90µs        ? ?/sec    1.03  622.0±241.48µs        ? ?/sec
sort partitioned utf8 high cardinality     1.03  402.4±279.21µs        ? ?/sec    1.00  390.0±180.70µs        ? ?/sec
sort partitioned utf8 low cardinality      1.03  364.2±307.28µs        ? ?/sec    1.00  353.7±264.51µs        ? ?/sec
sort partitioned utf8 tuple                1.09  962.8±314.15µs        ? ?/sec    1.00  886.0±186.74µs        ? ?/sec
sort utf8 dictionary                       1.00  1094.0±17.08µs        ? ?/sec    1.00  1094.3±16.54µs        ? ?/sec
sort utf8 dictionary tuple                 1.00     11.4±0.09ms        ? ?/sec    1.13     12.9±0.10ms        ? ?/sec
sort utf8 high cardinality                 1.00     10.2±0.07ms        ? ?/sec    1.02     10.5±0.07ms        ? ?/sec
sort utf8 low cardinality                  1.00      7.3±0.03ms        ? ?/sec    1.10      8.0±0.03ms        ? ?/sec
sort utf8 tuple                            1.05     16.1±0.15ms        ? ?/sec    1.00     15.4±0.11ms        ? ?/sec

ozankabak · 2024-10-30T19:12:19Z

I suspect they are noisy but let's make sure we don't hurt anything. I think we can merge after we clarify that

alamb · 2024-10-30T20:19:12Z

Here is the same benchmark run against a copy of main.

++ critcmp main alamb_main_copy
group                                      alamb_main_copy                        main
-----                                      ---------------                        ----
merge sorted f64                           1.00      4.8±0.02ms        ? ?/sec    1.00      4.8±0.02ms        ? ?/sec
merge sorted i64                           1.00      4.4±0.02ms        ? ?/sec    1.00      4.4±0.04ms        ? ?/sec
merge sorted mixed dictionary tuple        1.01     12.3±0.10ms        ? ?/sec    1.00     12.2±0.06ms        ? ?/sec
merge sorted mixed tuple                   1.00     11.7±0.11ms        ? ?/sec    1.00     11.6±0.04ms        ? ?/sec
merge sorted utf8 dictionary               1.00      4.2±0.02ms        ? ?/sec    1.00      4.2±0.02ms        ? ?/sec
merge sorted utf8 dictionary tuple         1.00      6.0±0.05ms        ? ?/sec    1.00      6.1±0.20ms        ? ?/sec
merge sorted utf8 high cardinality         1.00      6.8±0.02ms        ? ?/sec    1.00      6.8±0.06ms        ? ?/sec
merge sorted utf8 low cardinality          1.00      4.3±0.02ms        ? ?/sec    1.00      4.3±0.03ms        ? ?/sec
merge sorted utf8 tuple                    1.00     12.5±0.07ms        ? ?/sec    1.00     12.5±0.09ms        ? ?/sec
sort f64                                   1.00      3.4±0.02ms        ? ?/sec    1.00      3.4±0.02ms        ? ?/sec
sort i64                                   1.00      2.9±0.01ms        ? ?/sec    1.00      2.9±0.01ms        ? ?/sec
sort merge f64                             1.00      4.8±0.04ms        ? ?/sec    1.01      4.8±0.03ms        ? ?/sec
sort merge i64                             1.00      4.5±0.02ms        ? ?/sec    1.00      4.5±0.02ms        ? ?/sec
sort merge mixed dictionary tuple          1.00     12.2±0.08ms        ? ?/sec    1.00     12.2±0.09ms        ? ?/sec
sort merge mixed tuple                     1.00     11.8±0.07ms        ? ?/sec    1.00     11.8±0.08ms        ? ?/sec
sort merge utf8 dictionary                 1.00      3.9±0.02ms        ? ?/sec    1.00      3.9±0.02ms        ? ?/sec
sort merge utf8 dictionary tuple           1.00      5.9±0.04ms        ? ?/sec    1.00      5.9±0.04ms        ? ?/sec
sort merge utf8 high cardinality           1.00      7.0±0.03ms        ? ?/sec    1.01      7.0±0.03ms        ? ?/sec
sort merge utf8 low cardinality            1.00      4.4±0.03ms        ? ?/sec    1.01      4.5±0.03ms        ? ?/sec
sort merge utf8 tuple                      1.00     13.0±0.09ms        ? ?/sec    1.00     13.0±0.11ms        ? ?/sec
sort mixed dictionary tuple                1.00     18.4±0.19ms        ? ?/sec    1.00     18.3±0.15ms        ? ?/sec
sort mixed tuple                           1.00     15.2±0.12ms        ? ?/sec    1.00     15.1±0.10ms        ? ?/sec
sort partitioned f64                       1.04  220.4±525.12µs        ? ?/sec    1.00  211.5±418.36µs        ? ?/sec
sort partitioned i64                       1.00  191.4±368.75µs        ? ?/sec    1.13  217.1±556.72µs        ? ?/sec
sort partitioned mixed dictionary tuple    1.00  682.3±247.57µs        ? ?/sec    1.05  718.9±404.40µs        ? ?/sec
sort partitioned mixed tuple               1.03  563.3±220.61µs        ? ?/sec    1.00  547.2±164.45µs        ? ?/sec
sort partitioned utf8 dictionary           1.00  190.4±261.52µs        ? ?/sec    1.10  209.2±414.25µs        ? ?/sec
sort partitioned utf8 dictionary tuple     1.00  575.2±171.84µs        ? ?/sec    1.08  618.7±257.01µs        ? ?/sec
sort partitioned utf8 high cardinality     1.01  386.7±177.46µs        ? ?/sec    1.00  383.1±137.49µs        ? ?/sec
sort partitioned utf8 low cardinality      1.00  338.7±215.73µs        ? ?/sec    1.13  381.9±485.53µs        ? ?/sec
sort partitioned utf8 tuple                1.02  889.2±209.16µs        ? ?/sec    1.00  870.1±206.92µs        ? ?/sec
sort utf8 dictionary                       1.00  1092.9±14.69µs        ? ?/sec    1.00  1091.9±21.58µs        ? ?/sec
sort utf8 dictionary tuple                 1.00     11.3±0.07ms        ? ?/sec    1.00     11.3±0.07ms        ? ?/sec
sort utf8 high cardinality                 1.00     10.1±0.06ms        ? ?/sec    1.00     10.1±0.06ms        ? ?/sec
sort utf8 low cardinality                  1.00      7.3±0.05ms        ? ?/sec    1.00      7.2±0.03ms        ? ?/sec
sort utf8 tuple                            1.00     16.1±0.15ms        ? ?/sec    1.00     16.0±0.14ms        ? ?/sec

This looks a bit more stable.

Given that this change is to the inner loop of SortPreservingMerge it wouldn't surprise me if it got somewhat slower. That being said I think we could also merge this PR and improve performance as a follow on

alamb

Given all the effort that has gone into this PR I think we can merge it. I will try and do some follow ons to add some tests that mimic what we rely on in InfluxDB to make sure this won't introduce regressions for us.

Thank you @jayzhan211 @berkaysynnada and @Dandandan

jayzhan211 · 2024-10-30T23:37:17Z

Thank you all so much 🚀

…ache#13133) * first draft Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * add data Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * fix benchmark Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * add more bencmark data Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * fix benchmark Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * fmt Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * get max size Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * add license Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * rm code for merge Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * cleanup Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * cleanup Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * update poll count only we have tie Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * upd comment Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * fix logic Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * configurable Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * fmt Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * add mem limit test Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * rm test Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * escape bracket Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * add test Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * rm per consumer record Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * repartition limit Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * add benchmark Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * cleanup Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * benchmark with parameter Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * only calculate consumer pool if the limit is set Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * combine eq and gt Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * review part 1 * Update merge.rs * upd doc Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * no need index comparison Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * combine handle tie and eq check Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * upd doc Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * fmt Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * add more comment Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * remove flag Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * upd comment Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * Revert "remove flag" This reverts commit 8d6c0a6. * Revert "upd comment" This reverts commit a18cba8. * add more comment Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * add more comment Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * fmt Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * simpliy mem pool Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * clippy Signed-off-by: jayzhan211 <jayzhan211@gmail.com> * Update merge.rs * minor * add comment Signed-off-by: jayzhan211 <jayzhan211@gmail.com> --------- Signed-off-by: jayzhan211 <jayzhan211@gmail.com> Co-authored-by: berkaysynnada <berkay.sahin@synnada.ai>

jayzhan211 and others added 30 commits October 17, 2024 20:51

first draft

4926799

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

add data

666f9fe

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

fix benchmark

449c930

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

add more bencmark data

2566bab

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

fix benchmark

6000d87

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

fmt

5cd8733

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

get max size

74d3d9b

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

add license

e628764

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

rm code for merge

4277eec

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

cleanup

625b925

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

cleanup

e4f9bfe

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

update poll count only we have tie

3fa2e32

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

upd comment

e8da793

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

fix logic

d22ba25

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

Merge branch 'rrt-spm' into rrt-spm-for-merge

aadf69c

configurable

4b2a4ac

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

fmt

920fe6a

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

add mem limit test

d80286c

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

rm test

750261a

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

escape bracket

79a6df0

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

add test

afdd981

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

rm per consumer record

6358843

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

repartition limit

c9337c6

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

Merge branch 'main' of https://github.com/apache/datafusion into rrt-…

dbc1037

…spm-upstream

add benchmark

bc81681

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

cleanup

e4970f5

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

benchmark with parameter

e7abf68

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

only calculate consumer pool if the limit is set

7cb1198

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

combine eq and gt

5f4c83e

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

review part 1

e418d6c

jayzhan211 requested a review from alamb October 30, 2024 00:39

2010YOUY01 reviewed Oct 30, 2024

View reviewed changes

jayzhan211 added 2 commits October 30, 2024 15:31

simpliy mem pool

b652802

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

clippy

f68fa9b

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

github-actions bot removed the execution Related to the execution crate label Oct 30, 2024

berkaysynnada added 2 commits October 30, 2024 16:21

Update merge.rs

b63a631

minor

666b3fe

berkaysynnada approved these changes Oct 30, 2024

View reviewed changes

add comment

003fce3

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

alamb approved these changes Oct 30, 2024

View reviewed changes

jayzhan211 merged commit f23360f into apache:main Oct 30, 2024
26 checks passed

jayzhan211 deleted the rrt-spm-upstream branch October 30, 2024 23:37

This was referenced Nov 4, 2024

Support vectorized append and compare for multi group by #12996

Merged

Nov 5. 2024: This week in DataFusion #13265

Closed

Dandandan mentioned this pull request Nov 17, 2024

Improve performance of ClickBench Q18, Q35, #13449

Open

3 tasks

alamb mentioned this pull request Nov 21, 2024

Patched DataFusion, Oct 30, 2024 (almost) influxdata/arrow-datafusion#44

Closed

alamb mentioned this pull request Dec 18, 2024

Improve SortPreservingMerge::enable_round_robin_repartition docs #13826

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Round robin polling between tied winners in sort preserving merge #13133

Round robin polling between tied winners in sort preserving merge #13133

jayzhan211 commented Oct 27, 2024 •

edited

Loading

jayzhan211 commented Oct 30, 2024 •

edited

Loading

2010YOUY01 Oct 30, 2024 •

edited

Loading

jayzhan211 Oct 30, 2024 •

edited

Loading

alamb commented Oct 30, 2024

jayzhan211 commented Oct 30, 2024

berkaysynnada commented Oct 30, 2024

berkaysynnada left a comment •

edited

Loading

berkaysynnada Oct 30, 2024

jayzhan211 Oct 30, 2024 •

edited

Loading

berkaysynnada Oct 30, 2024

berkaysynnada Oct 30, 2024

alamb commented Oct 30, 2024

alamb commented Oct 30, 2024

alamb commented Oct 30, 2024

alamb commented Oct 30, 2024 •

edited

Loading

ozankabak commented Oct 30, 2024

alamb commented Oct 30, 2024

alamb left a comment

jayzhan211 commented Oct 30, 2024

Round robin polling between tied winners in sort preserving merge #13133

Round robin polling between tied winners in sort preserving merge #13133

Conversation

jayzhan211 commented Oct 27, 2024 • edited Loading

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Benmark

Existing benchmark

SPM benchmark

Summary

Performance

Memory usage

jayzhan211 commented Oct 30, 2024 • edited Loading

2010YOUY01 Oct 30, 2024 • edited Loading

Choose a reason for hiding this comment

jayzhan211 Oct 30, 2024 • edited Loading

Choose a reason for hiding this comment

alamb commented Oct 30, 2024

jayzhan211 commented Oct 30, 2024

berkaysynnada commented Oct 30, 2024

berkaysynnada left a comment • edited Loading

Choose a reason for hiding this comment

berkaysynnada Oct 30, 2024

Choose a reason for hiding this comment

jayzhan211 Oct 30, 2024 • edited Loading

Choose a reason for hiding this comment

berkaysynnada Oct 30, 2024

Choose a reason for hiding this comment

berkaysynnada Oct 30, 2024

Choose a reason for hiding this comment

alamb commented Oct 30, 2024

alamb commented Oct 30, 2024

alamb commented Oct 30, 2024

alamb commented Oct 30, 2024 • edited Loading

ozankabak commented Oct 30, 2024

alamb commented Oct 30, 2024

alamb left a comment

Choose a reason for hiding this comment

jayzhan211 commented Oct 30, 2024

jayzhan211 commented Oct 27, 2024 •

edited

Loading

jayzhan211 commented Oct 30, 2024 •

edited

Loading

2010YOUY01 Oct 30, 2024 •

edited

Loading

jayzhan211 Oct 30, 2024 •

edited

Loading

berkaysynnada left a comment •

edited

Loading

jayzhan211 Oct 30, 2024 •

edited

Loading

alamb commented Oct 30, 2024 •

edited

Loading