[Draft] Refactor GPU reductions and add unsafe atomic tunings #247

MrBurmark · 2022-06-03T05:31:57Z

Refactor GPU reductions and add unsafe atomic tunings

This adds an "unsafeAtomic" tuning to each kernel that has a HIP variant using atomics.
This also refactors the GPU reductions so the implementation is not duplicated in each kernel with a reduction.
This also adds lambda variants of the GPU reduction kernels.

This PR is a refactoring and a feature.
It does the following:
- Refactors GPU reduction code to avoid duplication
- Adds "unsafeAtomic" HIP atomic tunings at the request of myself

Use atomic functions directly in Base and Lambda variants instead of using RAJA atomics. Add a variety of util functions.

Use a single macro for all single reduction kernels instead of duplicating the code.

Now GPU kernels using atomics have a default tuning called atomic, and kernels using unsafe atomics have a tuning called unsafeAtomics.

This makes it clear whether or not atomics are being used as part of a reduction.
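
For readers unfamiliar with the pattern, here is a rough host-side sketch of a reduction that finishes with an atomic add, with the atomic add itself built on a compare-and-swap (CAS) loop. It is plain standard C++, not the HIP or CUDA code from this PR; the names `atomic_add_cas` and `reduce_sum_atomic` are hypothetical, and the hardware-specific "unsafe" atomics are not modeled at all.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Atomic add built on a compare-and-swap (CAS) loop, for types without a
// native atomic add. Returns the value held before the addition.
inline double atomic_add_cas(std::atomic<double>& target, double value)
{
  double expected = target.load(std::memory_order_relaxed);
  // On failure compare_exchange_weak refreshes `expected`, so just retry.
  while (!target.compare_exchange_weak(expected, expected + value,
                                       std::memory_order_relaxed)) {
  }
  return expected;
}

// Hypothetical host analogue of a block-wise sum reduction: each "block"
// sums its own slice, then folds the partial sum into the shared result
// with one atomic add. The blocks run sequentially here for simplicity;
// the atomic is what would keep the result correct if they ran concurrently.
inline double reduce_sum_atomic(const std::vector<double>& x,
                                std::size_t block_size,
                                double sum_init)
{
  assert(block_size > 0);
  std::atomic<double> sum(sum_init);
  for (std::size_t begin = 0; begin < x.size(); begin += block_size) {
    const std::size_t end = std::min(begin + block_size, x.size());
    double partial = 0.0;
    for (std::size_t i = begin; i < end; ++i) {
      partial += x[i];
    }
    atomic_add_cas(sum, partial);
  }
  return sum.load(std::memory_order_relaxed);
}
```

Calling `reduce_sum_atomic(x, 256, 0.0)` on a `std::vector<double>` returns the sum of its elements plus the initial value. Doing one atomic per block rather than one per element keeps contention on the shared result low, which is the usual motivation for combining a block-level reduction with an atomic.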

rhornung67

A lot of changes in this PR. I think I understand them all.

MrBurmark · 2022-06-04T00:10:01Z

I kept finding things to change...
I have some more in mind for a future PR too.

CRobeck · 2022-06-07T20:17:44Z

src/algorithm/REDUCE_SUM-Hip.cpp

-#endif
+template < size_t block_size >
+__launch_bounds__(block_size)
+__global__ void reduce_sum_unsafe(Real_ptr x, Real_ptr dsum, Real_type sum_init,


If we start to add more tests with atomics, we might not want to split these up into separate kernels because of compile-time concerns, but this should be OK for now.

CRobeck · 2022-06-07T20:19:06Z

src/common/HipDataUtils.hpp

+  return devProp.gcnArchName;
+}
+
+#if defined(__gfx90a__)


We will be able to do away with the gfx90a arch check in a future ROCm release.
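
As a side note, the same kind of check could also be made at run time on the architecture name string instead of with the `__gfx90a__` preprocessor guard. The sketch below is only an illustration in plain C++: `is_arch` is a hypothetical helper that is not part of this PR, and it assumes architecture names of the form `gfx90a:sramecc+:xnack-`, where feature flags follow the first colon.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when an architecture name such as
// "gfx90a:sramecc+:xnack-" names the given target (for example "gfx90a").
inline bool is_arch(const std::string& arch_name, const std::string& target)
{
  assert(!target.empty());
  // Compare only the part before the first ':'; if there is no ':',
  // find() returns npos and substr() yields the whole string.
  const std::string base = arch_name.substr(0, arch_name.find(':'));
  return base == target;
}
```

The result of such a check could then decide whether an unsafe atomic tuning is offered on the current device, under the assumption that the tuning is only valid on that architecture.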

MrBurmark · 2023-11-21T20:54:15Z

I'm closing this PR for a number of reasons. It's not worth differentiating between safe and unsafe atomics, as that is a temporary issue. The reducer implementation is difficult to put in a macro and is not identically duplicated, so it's not a huge gain to abstract it. The lambda variants are not necessary.

MrBurmark added 27 commits June 2, 2022 13:37

Add unsafe hip atomic tuning of REDUCE_SUM

9a96216

Use atomic functions directly in Base and Lambda variants instead of using RAJA atomics. Add a variety of util functions.

Add unsafe tunings of multiple kernels

a0f52f5

Add more unsafe tunings of kernels

a0e21c0

Use local pi_init in PI_REDUCE

5bbb053

Move PI_REDUCE cuda impl into macro

20354ee

Add PI_REDUCE gpu lambda variants

2a17be7

Fix guards on CAS based cuda atomics

be2b6fb

Fix typo in REDUCE_STRUCT cuda

de5994a

Use local init in REDUCE_STRUCT

720dac9

Fix atomics used in REDUCE_STRUCT Hip

eea7a96

Add lambda gpu variants of REDUCE_STRUCT

5ec96ed

Refactor indices in base reductions

a05444c

Use local sum init in REDUCE_SUM

233b0a2

Add gpu lambda impl in REDUCE_SUM

9adf731

Fix some var names in NODAL_ACCUMULATION_3D

7e36650

Add gpu lambda impl of NODAL_ACCUMULATION_3D

7c2d2d3

Use local init in REDUCE3_INT

2f7cc2d

Implement gpu lambda variants of REDUCE3_INT

1dde239

Use local init in TRAP_INT

5b321b7

Add gpu lambda variant impl in TRAP_INT

6ffd73f

Use local init in DOT

547282e

Add gpu lambda variant impl in DOT

7464433

Refactor reduce implementation into macro

5ee08c6

Use a single macro for all single reduction kernels instead of duplicating the code.

Refactor gpu reduce 3 into macro

69c7ea5

Refactor gpu reduce 6 into macro

d85bcfe

Move CAS cuda atomic impls into CudaDataUtils

7f271ce

Add documentation to gpu utils

db37b04

MrBurmark requested review from artv3, rhornung67 and CRobeck June 3, 2022 05:31

MrBurmark marked this pull request as ready for review June 3, 2022 05:32

MrBurmark added 5 commits June 3, 2022 08:49

Rename tunings with atomics

7f3afd3

Now GPU kernels using atomics have a default tuning called atomic, and kernels using unsafe atomics have a tuning called unsafeAtomics.

Rename PI_ATOMIC gpu tuning to sumAtomic

0ed29a0

Split reduce tunings into reduce and reduceAtomic

095fdec

This makes it clear whether or not atomics are being used as part of a reduction.

Add raja reduce atomic tuning in REDUCE_SUM

74a4c85

Add more raja reduce atomic tunings

e5c4002

rhornung67 approved these changes Jun 3, 2022

CRobeck reviewed Jun 7, 2022

rhornung67 mentioned this pull request Jul 10, 2023

v2023.06.0 Release #344

Closed

24 tasks

MrBurmark changed the title ~~Add support for hip unsafe atomics~~ [Draft] Refactor GPU reductions and add unsafe atomic tunings Jul 18, 2023

MrBurmark mentioned this pull request Jul 18, 2023

Think about how to handle rarely used tunings #350

Open

MrBurmark closed this Nov 21, 2023
