Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

P2689: Bounded atomic_ref and atomic_accessor LEWG Presentation 04/30/2024 #456

Open
crtrott opened this issue Apr 29, 2024 · 0 comments
Open

Comments

@crtrott
Copy link
Collaborator

crtrott commented Apr 29, 2024

Atomic Refs Bound to Memory Orderings & Atomic Accessors

The mdspan paper P0009 listed atomic accessors as a reason for having accessors in the first place.

One of the use cases is for parallel algorithms updating data in a way which has data races.

Consider the histogram computation:

template<class ExecT>
void compute_histogram(ExecT exec, float bin_size,
               std::mdspan<int, std::dextents<size_t,1>> output,
               std::mdspan<float, std::dextents<size_t,1>> data) {
  static_assert(std::is_execution_policy_v<ExecT>);

  std::for_each(exec, data.data_handle(), data.data_handle()+data.extent(0), [=](float val) {
    int bin = std::abs(val)/bin_size;
    bin = std::clamp(bin, size_t(0), output.extent(0));
    output[bin]++;
  });

Depending on whether ExecT is sequenced_policy or not, the update needs to happen atomically.

With just atomic_ref one could do:

  std::for_each(exec, data.data_handle(), data.data_handle()+data.extent(0), [=](float val) {
    int bin = std::abs(val)/bin_size;
    bin = std::clamp(bin, size_t(0), output.extent(0));
    atomic_ref(output[bin])++;
  });

This paper proposes a way for doing that for general mdspans without writing out atomic_ref everywhere inside a complex algorithm and also fixes the fact that you can't do the simple ++ with relaxed memory order - enough for these kind of accumulation cases.

What this paper proposes

  • "Bounded" versions of atomic_ref with a fixed (bounded) memory order
    • atomic_ref_relaxed, atomic_ref_seq_cst, atomic_ref_acq_rel
    • Conceptually just like atomic_ref with a fixed memory order - all the same members
  • Corresponding atomic accessors which use the various atomic_ref types as reference type
    • atomic_accessor => atomic_ref
    • atomic_accessor_relaxed => atomic_ref_relaxed
    • atomic_accessor_acq_rel => atomic_ref_acq_rel
    • atomic_accessor_seq_cst => atomic_ref_seq_cst

Histogram with atomic accessor

// sequenced_policy does not attach atomic accessor
template<class T, class Extents, class LayoutPolicy>
auto add_atomic_accessor_if_needed(
    std::execution::sequenced_policy, mdspan<T, Extents, LayoutPolicy> m) {
        return m;
 }

// parallel policies attach atomic accessor:
template<class ExecutionPolicy, class T, class Extents, class LayoutPolicy>
auto add_atomic_accessor_if_needed(
    ExecutionPolicy, mdspan<T, Extents, LayoutPolicy> m) {
        return mdspan(m.data_handle(), m.mapping(), atomic_accessor<T>());
}

template<class ExecT>
void compute_histogram(ExecT exec, float bin_size,
               std::mdspan<int, std::dextents<size_t,1>> output,
               std::mdspan<float, std::dextents<size_t,1>> data) {
  static_assert(std::is_execution_policy_v<ExecT>);

  auto accumulator = add_atomic_accessor_if_needed(exec, output);

  std::for_each(exec, data.data_handle(), data.data_handle()+data.extent(0), [=](float val) {
    int bin = std::abs(val)/bin_size;
    bin = std::clamp(bin, size_t(0), output.extent(0));
    // this is now atomic for parallel policies
    accumulator[bin]++;
  });

Previous Question: Why not expose the templated type?

  • There is really no use-case where one would want to write something generic over the memory order!
  • The algorithm determines the memory order you need - the only generic question we identified is whether you need atomics at all (i.e. what execution policy is used)
  • SG1 also felt that its easier to see the memory order with the explicit names
    • We don't feel there is any real risk here and mdspan somewhat hides its accessor anyway

Wording Considerations

  • Used exposition only atomic-ref-bound<T,MemoryOrder> for wording
    • atomic_ref_MEM_ORDER are template aliases - see open question whether this is ok
  • atomic_ref has 4 specializations: general, integral (not bool), floating point and pointer
    • atomic-ref-bound needs only 2 (general and arithmetic/pointer{not-bool} versions)
      • integral, floating point and pointer type versions can be defined in terms of atomic_ref with constraints on member functions
      • the only reason we need general is difference_type member typedef -> doesn't exist in atomic_ref general

Open Questions

  • Is reasoning for having atomic-ref-bound exposition only good enough, or does LEWG want it as a public thing?
    • Please Poll: atomic-ref-bound should be exposition only.
    • Author position: in favor
  • Converting between atomic-ref-bound and basic-atomic-accessors of different memory orders
    • We don't see many good reasons to have them:
      • atomic_ref is meant to be almost used in-place for individual operations e.g. atomic_ref(a[i])++
      • atomic accessors are something you are supposed to add in a fairly local scope
      • we don't expect folks to pass mdspan with atomic accessors around
    • There is a reason not to permit it:
      • Writing functions which take atomic_ref_MEM_ORDER imply ordering behavior: allowing conversions would change that behavior
    • Please Poll: atomic-ref-bound and basic-atomic-accessor should have converting constructors for different memory-orders
      • Author position: against
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant