Commit 93de66a
ARROW-10837: [Rust][DataFusion] Use Vec<u8> for hash keys
This PR is a follow-up to apache/arrow#8765. Here, the values that make up the hash map key are converted to a `Vec<u8>`, which is used as the key instead.
This is a bit faster, as both hashing and cloning a byte vector are cheaper. It also uses less additional memory than the earlier use of dynamic `GroupByScalar` values (for hash join).
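To illustrate the idea, here is a minimal, self-contained sketch of byte-encoded hash keys. It is not the code from this PR: the `KeyValue` enum and the `create_key` and `count_by_key` functions are hypothetical names made up for the example, and the little-endian, length-prefixed encoding is an assumption about one reasonable layout.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one column value of a grouping or join key.
enum KeyValue {
    Int64(i64),
    Utf8(String),
}

/// Writes the byte encoding of all key columns into `buf`, reusing its allocation.
fn create_key(values: &[KeyValue], buf: &mut Vec<u8>) {
    buf.clear();
    for value in values {
        match value {
            // Fixed-width values are appended as their raw little-endian bytes.
            KeyValue::Int64(v) => buf.extend_from_slice(&v.to_le_bytes()),
            KeyValue::Utf8(s) => {
                // A length prefix keeps ("ab", "c") distinct from ("a", "bc").
                buf.extend_from_slice(&(s.len() as u64).to_le_bytes());
                buf.extend_from_slice(s.as_bytes());
            }
        }
    }
}

/// Counts rows per distinct key, using the encoded bytes as the map key.
fn count_by_key(rows: &[Vec<KeyValue>]) -> HashMap<Vec<u8>, usize> {
    let mut counts: HashMap<Vec<u8>, usize> = HashMap::new();
    let mut key = Vec::new();
    for row in rows {
        create_key(row, &mut key);
        // Look up by slice first so the buffer is cloned only for new keys.
        if let Some(n) = counts.get_mut(key.as_slice()) {
            *n += 1;
        } else {
            counts.insert(key.clone(), 1);
        }
    }
    counts
}
```

In this sketch the hasher sees one contiguous byte slice per row instead of walking a vector of enum values, and cloning a key is a single buffer copy, which is the kind of saving the description above refers to.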
[This PR]
```
Query 12 iteration 0 took 1315 ms
Query 12 iteration 1 took 1324 ms
Query 12 iteration 2 took 1329 ms
Query 12 iteration 3 took 1334 ms
Query 12 iteration 4 took 1335 ms
Query 12 iteration 5 took 1338 ms
Query 12 iteration 6 took 1337 ms
Query 12 iteration 7 took 1349 ms
Query 12 iteration 8 took 1348 ms
Query 12 iteration 9 took 1358 ms
```
[Master]
```
Query 12 iteration 0 took 1379 ms
Query 12 iteration 1 took 1383 ms
Query 12 iteration 2 took 1401 ms
Query 12 iteration 3 took 1406 ms
Query 12 iteration 4 took 1420 ms
Query 12 iteration 5 took 1435 ms
Query 12 iteration 6 took 1401 ms
Query 12 iteration 7 took 1404 ms
Query 12 iteration 8 took 1418 ms
Query 12 iteration 9 took 1416 ms
```
[This PR]
```
Query 1 iteration 0 took 871 ms
Query 1 iteration 1 took 866 ms
Query 1 iteration 2 took 869 ms
Query 1 iteration 3 took 869 ms
Query 1 iteration 4 took 867 ms
Query 1 iteration 5 took 874 ms
Query 1 iteration 6 took 870 ms
Query 1 iteration 7 took 875 ms
Query 1 iteration 8 took 871 ms
Query 1 iteration 9 took 869 ms
```
[Master]
```
Query 1 iteration 0 took 1189 ms
Query 1 iteration 1 took 1192 ms
Query 1 iteration 2 took 1189 ms
Query 1 iteration 3 took 1185 ms
Query 1 iteration 4 took 1193 ms
Query 1 iteration 5 took 1202 ms
Query 1 iteration 6 took 1547 ms
Query 1 iteration 7 took 1242 ms
Query 1 iteration 8 took 1202 ms
Query 1 iteration 9 took 1197 ms
```
FWIW, micro benchmark results for aggregate queries:
```
aggregate_query_no_group_by 15 12
time: [538.54 us 541.48 us 544.74 us]
change: [+5.4384% +6.6260% +8.2034%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
7 (7.00%) high mild
1 (1.00%) high severe
aggregate_query_no_group_by_count_distinct_wide 15 12
time: [4.8418 ms 4.8744 ms 4.9076 ms]
change: [-13.890% -12.580% -11.260%] (p = 0.00 < 0.05)
Performance has improved.
aggregate_query_no_group_by_count_distinct_narrow 15 12
time: [2.1910 ms 2.2100 ms 2.2291 ms]
change: [-30.490% -28.886% -27.271%] (p = 0.00 < 0.05)
Performance has improved.
Benchmarking aggregate_query_group_by 15 12: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.1s, enable flat sampling, or reduce sample count to 50.
aggregate_query_group_by 15 12
time: [1.5905 ms 1.5977 ms 1.6054 ms]
change: [-18.271% -16.780% -15.396%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) high mild
5 (5.00%) high severe
aggregate_query_group_by_with_filter 15 12
time: [788.26 us 792.05 us 795.74 us]
change: [-9.8088% -8.5606% -7.4141%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
Benchmarking aggregate_query_group_by_u64 15 12: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.3s, enable flat sampling, or reduce sample count to 50.
aggregate_query_group_by_u64 15 12
time: [1.8502 ms 1.8565 ms 1.8630 ms]
change: [+8.6203% +9.8872% +10.973%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) low mild
2 (2.00%) high mild
3 (3.00%) high severe
aggregate_query_group_by_with_filter_u64 15 12
time: [777.83 us 782.75 us 788.15 us]
change: [-7.5157% -6.6393% -5.6558%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe
```
FYI @jorgecarleitao
Closes #8863 from Dandandan/key_byte_vec
Lead-authored-by: Heres, Daniel <danielheres@gmail.com>
Co-authored-by: Daniël Heres <danielheres@gmail.com>
Signed-off-by: Jorge C. Leitao <jorgecarleitao@gmail.com>
1 parent 8f1931a, commit 93de66a
2 files changed in rust/datafusion/src/physical_plan (+93, -27 lines)