Enable fractional null probability for hashing benchmark #13967

Blonck · 2023-08-25T18:37:20Z

In the past, the HASHING_NVBENCH benchmark treated the nulls parameter as a boolean. Any value other than 0.0 resulted in a null probability of 1.0.

Now, the nulls parameter directly determines the null probability. For instance, a value of 0.1 will generate 10% of the data as null. Moreover, setting nulls to 0.0 produces data without a null bitmask.
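The new semantics can be illustrated with a small standalone sketch. It does not use libcudf or the benchmark's own data generator; `make_validity_mask` and `null_fraction` are hypothetical helpers invented for this illustration. The assumption is only what the description states: a fractional value is used as the per-row null probability, and 0.0 means that no mask is produced at all.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <random>
#include <vector>

// Hypothetical helper (not the benchmark's code): builds a validity mask
// where true = valid row and false = null row.
// Returns std::nullopt for a probability of exactly 0.0, mirroring
// "data without a null bitmask".
std::optional<std::vector<bool>> make_validity_mask(std::size_t num_rows,
                                                    double null_probability,
                                                    unsigned seed = 42)
{
  assert(null_probability >= 0.0 && null_probability <= 1.0);
  if (null_probability == 0.0) { return std::nullopt; }

  std::mt19937 gen(seed);
  std::bernoulli_distribution is_null(null_probability);
  std::vector<bool> mask(num_rows);
  for (std::size_t i = 0; i < num_rows; ++i) {
    mask[i] = !is_null(gen);  // each row is null with the given probability
  }
  return mask;
}

// Fraction of rows marked null in a mask.
double null_fraction(std::vector<bool> const& mask)
{
  if (mask.empty()) { return 0.0; }
  std::size_t nulls = 0;
  for (bool valid : mask) {
    if (!valid) { ++nulls; }
  }
  return static_cast<double>(nulls) / static_cast<double>(mask.size());
}
```

With `null_probability = 0.1` and a large row count, `null_fraction` should land close to 0.1, while a probability of 0.0 yields no mask rather than an all-valid one.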

Additionally, bytes_per_second is now reported by the benchmark.

This patch relates to #13735.

Checklist

I am familiar with the Contributing Guidelines.

rapids-bot · 2023-08-25T18:37:24Z

Pull requests from external contributors require approval from a rapidsai organization member with write permissions or greater before CI can begin.

Blonck · 2023-08-25T18:46:05Z

Hi, I wasn't sure about this one, thus the draft.

For me, it looked like the nulls parameter was accidentally cast to bool. Thus either all or no values of the table were invalid. In particular, the benchmark was configured for nulls = [0, 0.1]. Also if nulls == 0.0, there will no longer be a null bitmask generated.
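The suspected cast can be reduced to a few lines of plain C++. This is a sketch of the mechanism only; the actual code path in hash.cpp is not shown in this thread, and both function names below are hypothetical.

```cpp
#include <cassert>

// Hypothetical reduction of the suspected bug: the fractional value passes
// through bool before it is used as a probability again.
double probability_via_bool(double nulls)
{
  bool const as_flag = static_cast<bool>(nulls);  // any non-zero value becomes true
  return static_cast<double>(as_flag);            // true -> 1.0, false -> 0.0
}

// The intended behaviour: the value is used as the probability unchanged.
double probability_direct(double nulls)
{
  assert(nulls >= 0.0 && nulls <= 1.0);
  return nulls;
}
```

For the configured values nulls = [0, 0.1], the first function maps 0.1 to 1.0, which matches the "either all or no values" behaviour described above, while the second keeps 0.1.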

Hope that makes sense, if not I will just remove my changes and just add the bytes_per_second calculation 😄.

davidwendt · 2023-08-25T18:59:17Z

> Hi, I wasn't sure about this one, thus the draft.
>
> For me, it looked like the nulls parameter was accidentally cast to bool. Thus either all or no values of the table were invalid. In particular, the benchmark was configured for nulls = [0, 0.1]. Also if nulls == 0.0, there will no longer be a null bitmask generated.
>
> Hope that makes sense, if not I will just remove my changes and just add the bytes_per_second calculation 😄.

Yes. It looks like you fixed a bug here.

PointKernel · 2023-08-25T21:46:01Z

/ok to test

PointKernel

Several typos; otherwise looks great. @Blonck Can you please also paste the console output of those benchmarks in the PR just for reference?

cpp/benchmarks/hashing/hash.cpp

Co-authored-by: Yunsong Wang <yunsongw@nvidia.com>

Blonck · 2023-08-26T05:31:20Z

Sure @PointKernel, here is the log. Please note that I'm currently using WSL to compile and run the code. Therefore, the performance metrics might not be fully representative. I'm uncertain about the extent to which WSL/Windows affects performance. At least it significantly impacts compile time :).

# Devices

## [0] `NVIDIA GeForce RTX 4070 Ti`
* SM Version: 890 (PTX Version: 860)
* Number of SMs: 60
* SM Default Clock Rate: 18446744072024 MHz
* Global Memory: 11032 MiB Free / 12281 MiB Total
* Global Memory Bus Peak: 504 GB/sec (192-bit DDR @10501MHz)
* Max Shared Memory: 100 KiB/SM, 48 KiB/Block
* L2 Cache Size: 49152 KiB
* Maximum Active Blocks: 24/SM
* Maximum Active Threads: 1536/SM, 1024/Block
* Available Registers: 65536/SM, 65536/Block
* ECC Enabled: No

# Log

RMM memory resource = pool
Run:  [1/12] hashing [Device=0 num_rows=65536 nulls=0 hash_name=murmurhash3_x86_32]
Pass: Cold: 0.171495ms GPU, 0.204221ms CPU, 0.50s total GPU, 1.16s total wall, 2928x 
Run:  [2/12] hashing [Device=0 num_rows=16777216 nulls=0 hash_name=murmurhash3_x86_32]
Pass: Cold: 1.457714ms GPU, 1.524122ms CPU, 3.76s total GPU, 4.37s total wall, 2576x 
Run:  [3/12] hashing [Device=0 num_rows=65536 nulls=0.1 hash_name=murmurhash3_x86_32]
Pass: Cold: 0.382451ms GPU, 0.410197ms CPU, 0.50s total GPU, 0.79s total wall, 1312x 
Run:  [4/12] hashing [Device=0 num_rows=16777216 nulls=0.1 hash_name=murmurhash3_x86_32]
Pass: Cold: 1.614988ms GPU, 1.670856ms CPU, 0.96s total GPU, 1.09s total wall, 592x 
Run:  [5/12] hashing [Device=0 num_rows=65536 nulls=0 hash_name=md5]
Pass: Cold: 0.394157ms GPU, 0.425295ms CPU, 0.50s total GPU, 0.76s total wall, 1280x 
Run:  [6/12] hashing [Device=0 num_rows=16777216 nulls=0 hash_name=md5]
Warn: Current measurement timed out (15.00s) while over noise threshold (2.04% > 0.50%)
Pass: Cold: 12.810189ms GPU, 12.892377ms CPU, 14.67s total GPU, 15.00s total wall, 1145x 
Run:  [7/12] hashing [Device=0 num_rows=65536 nulls=0.1 hash_name=md5]
Pass: Cold: 0.444732ms GPU, 0.474608ms CPU, 0.51s total GPU, 0.74s total wall, 1136x 
Run:  [8/12] hashing [Device=0 num_rows=16777216 nulls=0.1 hash_name=md5]
Warn: Current measurement timed out (15.00s) while over noise threshold (4.29% > 0.50%)
Pass: Cold: 13.082924ms GPU, 13.164643ms CPU, 14.67s total GPU, 15.00s total wall, 1121x 
Run:  [9/12] hashing [Device=0 num_rows=65536 nulls=0 hash_name=spark_murmurhash3_x86_32]
Pass: Cold: 0.156733ms GPU, 0.191434ms CPU, 0.50s total GPU, 1.21s total wall, 3216x 
Run:  [10/12] hashing [Device=0 num_rows=16777216 nulls=0 hash_name=spark_murmurhash3_x86_32]
Pass: Cold: 1.478015ms GPU, 1.533411ms CPU, 1.77s total GPU, 2.05s total wall, 1200x 
Run:  [11/12] hashing [Device=0 num_rows=65536 nulls=0.1 hash_name=spark_murmurhash3_x86_32]
Pass: Cold: 0.402708ms GPU, 0.437331ms CPU, 0.50s total GPU, 0.79s total wall, 1248x 
Run:  [12/12] hashing [Device=0 num_rows=16777216 nulls=0.1 hash_name=spark_murmurhash3_x86_32]
Pass: Cold: 1.618771ms GPU, 1.678639ms CPU, 1.37s total GPU, 1.56s total wall, 848x 

# Benchmark Results

## hashing

### [0] NVIDIA GeForce RTX 4070 Ti

| num_rows | nulls |        hash_name         | Samples |  CPU Time  | Noise  |  GPU Time  | Noise  | GlobalMem BW | BWUtil |
|----------|-------|--------------------------|---------|------------|--------|------------|--------|--------------|--------|
|    65536 |     0 |       murmurhash3_x86_32 |   2928x | 204.221 us | 88.92% | 171.495 us | 87.16% |  10.697 GB/s |  2.12% |
| 16777216 |     0 |       murmurhash3_x86_32 |   2576x |   1.524 ms | 17.96% |   1.458 ms | 16.59% | 321.973 GB/s | 63.88% |
|    65536 |   0.1 |       murmurhash3_x86_32 |   1312x | 410.197 us | 43.65% | 382.451 us | 45.51% |   4.398 GB/s |  0.87% |
| 16777216 |   0.1 |       murmurhash3_x86_32 |    592x |   1.671 ms | 10.91% |   1.615 ms |  9.95% | 266.156 GB/s | 52.80% |
|    65536 |     0 |                      md5 |   1280x | 425.295 us | 41.19% | 394.157 us | 42.69% |   9.310 GB/s |  1.85% |
| 16777216 |     0 |                      md5 |   1145x |  12.892 ms |  2.18% |  12.810 ms |  2.04% |  73.309 GB/s | 14.54% |
|    65536 |   0.1 |                      md5 |   1136x | 474.608 us | 36.73% | 444.732 us | 38.77% |   7.908 GB/s |  1.57% |
| 16777216 |   0.1 |                      md5 |   1121x |  13.165 ms |  4.33% |  13.083 ms |  4.29% |  68.761 GB/s | 13.64% |
|    65536 |     0 | spark_murmurhash3_x86_32 |   3216x | 191.434 us | 61.58% | 156.733 us | 59.51% |  11.704 GB/s |  2.32% |
| 16777216 |     0 | spark_murmurhash3_x86_32 |   1200x |   1.533 ms | 32.23% |   1.478 ms | 31.72% | 317.550 GB/s | 63.00% |
|    65536 |   0.1 | spark_murmurhash3_x86_32 |   1248x | 437.331 us | 44.13% | 402.708 us | 45.15% |   4.177 GB/s |  0.83% |
| 16777216 |   0.1 | spark_murmurhash3_x86_32 |    848x |   1.679 ms | 10.75% |   1.619 ms |  9.30% | 265.534 GB/s | 52.68% |
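The GlobalMem BW and BWUtil columns can be cross-checked against the device header with a little arithmetic. The sketch below assumes that the peak figure is derived from the reported "192-bit DDR @10501MHz" as two transfers per clock, and that GB means 10^9 bytes; both are assumptions about how the numbers were produced, not something stated in this thread. The function names are made up for the illustration.

```cpp
#include <cassert>

// Peak bandwidth in GB/s for a double-data-rate bus:
// 2 transfers per clock, bus_width_bits / 8 bytes per transfer.
double peak_bandwidth_gbs(double bus_width_bits, double mem_clock_mhz)
{
  assert(bus_width_bits > 0.0 && mem_clock_mhz > 0.0);
  return 2.0 * mem_clock_mhz * 1e6 * (bus_width_bits / 8.0) / 1e9;
}

// Achieved bandwidth as a percentage of the peak bandwidth.
double bw_util_percent(double achieved_gbs, double peak_gbs)
{
  assert(peak_gbs > 0.0);
  return 100.0 * achieved_gbs / peak_gbs;
}

// Number of bytes implied by a bandwidth figure and a GPU time.
double implied_bytes(double achieved_gbs, double gpu_time_ms)
{
  return achieved_gbs * 1e9 * gpu_time_ms * 1e-3;
}
```

Under these assumptions, `peak_bandwidth_gbs(192, 10501)` gives about 504.05 GB/s, and the 321.973 GB/s row works out to roughly 63.88% utilization, in line with the table. Dividing `implied_bytes(321.973, 1.457714)` by 16777216 rows suggests roughly 28 bytes accounted per row for that run.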

davidwendt · 2023-08-27T00:18:05Z

/ok to test

PointKernel

LGTM

PointKernel · 2023-08-29T18:49:57Z

/ok to test

copy-pr-bot · 2023-08-29T19:06:46Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

cpp/benchmarks/hashing/hash.cpp

Code review suggestions. Co-authored-by: David Wendt <45795991+davidwendt@users.noreply.github.com>

davidwendt · 2023-08-30T15:18:32Z

/ok to test

PointKernel · 2023-08-30T18:55:25Z

/ok to test

PointKernel · 2023-08-30T22:29:28Z

/merge

github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Aug 25, 2023

Blonck marked this pull request as ready for review August 25, 2023 19:18

Blonck requested a review from a team as a code owner August 25, 2023 19:18

Blonck requested review from hyperbolic2346 and PointKernel August 25, 2023 19:18

PointKernel added feature request New feature or request non-breaking Non-breaking change labels Aug 25, 2023

Merge branch 'branch-23.10' into processed_bytes_hash_nvbench

2a5c6db

PointKernel requested changes Aug 25, 2023

Fix memory statistics for hashing benchmark

dc2556b

Co-authored-by: Yunsong Wang <yunsongw@nvidia.com>

Add missing include in hashing benmchmark

c229367

PointKernel approved these changes Aug 28, 2023

Blonck and others added 2 commits August 28, 2023 21:59

Merge branch 'branch-23.10' into processed_bytes_hash_nvbench

495f674

Merge branch 'branch-23.10' into processed_bytes_hash_nvbench

9a930cb

Merge branch 'branch-23.10' into processed_bytes_hash_nvbench

41303b3

davidwendt requested changes Aug 30, 2023

Change wording in hashing benchmark

47d8d84

Code review suggestions. Co-authored-by: David Wendt <45795991+davidwendt@users.noreply.github.com>

davidwendt approved these changes Aug 30, 2023

Merge branch 'branch-23.10' into processed_bytes_hash_nvbench

74d798f

rapids-bot bot merged commit c73ff70 into rapidsai:branch-23.10 Aug 30, 2023
54 checks passed

GregoryKimball mentioned this pull request Nov 3, 2023

[FEA] Add bytes_per_second to all libcudf benchmarks #13735

Open
