Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable copy/move ctors and operator= from free_list classes #862

Merged
merged 3 commits into from
Sep 1, 2021

Conversation

harrism
Copy link
Member

@harrism harrism commented Sep 1, 2021

Fixes #861.

An implicit copy of free_list was being used instead of a reference, which led to duplicate allocations. This never manifested until after #851 because previously the locally modified copy of a free list was being merged into an MR-owned free list. When we removed one of the places where we merged free lists, this copy resulted in the changes to free lists being lost. This only manifested in PTDS usage, but likely would also manifest in use cases with multiple non-default streams.

@harrism harrism requested a review from a team as a code owner September 1, 2021 07:35
@harrism harrism requested review from rongou and codereport September 1, 2021 07:35
@github-actions github-actions bot added the cpp Pertains to C++ code label Sep 1, 2021
@harrism harrism requested review from jlowe and jrhemstad September 1, 2021 07:36
@harrism harrism self-assigned this Sep 1, 2021
@harrism harrism added bug Something isn't working non-breaking Non-breaking change labels Sep 1, 2021
@harrism
Copy link
Member Author

harrism commented Sep 1, 2021

I've noticed some impact on the MULTI_STREAM_ALLOCATIONS_BENCH. Sometimes there appears to be some oversynchronization effect on performance. Here I ran it twice in a row, and the first run got the perf I expected, but the 4-stream version ran slower the second time:

(rapids) rapids@compose:~/rmm/build/release$ gbenchmarks/MULTI_STREAM_ALLOCATIONS_BENCH -r pool          
2021-09-01T17:55:24+10:00
Running gbenchmarks/MULTI_STREAM_ALLOCATIONS_BENCH
Run on (16 X 3600 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 16384 KiB (x2)
Load Average: 2.26, 2.67, 2.41
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
--------------------------------------------------------------------------------------------------
Benchmark                                        Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------------------
BM_MultiStreamAllocations/pool_mr/1/4/0       2832 us         2831 us          247 items_per_second=1.41287k/s
BM_MultiStreamAllocations/pool_mr/2/4/0       1447 us         1447 us          475 items_per_second=2.76504k/s
BM_MultiStreamAllocations/pool_mr/4/4/0        734 us          734 us          939 items_per_second=5.45189k/s

(rapids) rapids@compose:~/rmm/build/release$ gbenchmarks/MULTI_STREAM_ALLOCATIONS_BENCH -r pool
2021-09-01T17:55:38+10:00
Running gbenchmarks/MULTI_STREAM_ALLOCATIONS_BENCH
Run on (16 X 3600 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 16384 KiB (x2)
Load Average: 2.12, 2.62, 2.40
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
--------------------------------------------------------------------------------------------------
Benchmark                                        Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------------------
BM_MultiStreamAllocations/pool_mr/1/4/0       2875 us         2874 us          247 items_per_second=1.39185k/s
BM_MultiStreamAllocations/pool_mr/2/4/0       1489 us         1488 us          444 items_per_second=2.6878k/s
BM_MultiStreamAllocations/pool_mr/4/4/0       2165 us         2164 us          941 items_per_second=1.84812k/s

Copy link
Contributor

@codereport codereport left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm 👍

Copy link
Member

@jlowe jlowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent find! I verified the C++ and Java unit tests pass with PTDS enabled after this change.

@jrhemstad
Copy link
Contributor

@gpucibot merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cpp Pertains to C++ code non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] cudf unit tests fail with PTDS enabled
5 participants