Add design doc for atomic<T> type. #5101

Merged 5 commits on Sep 19, 2024
104 changes: 104 additions & 0 deletions docs/proposals/003-atomic-t.md
@@ -0,0 +1,104 @@
SP #003 - `Atomic<T>` type
==============


Status
------

Author: Yong He

Status: Design Discussion.

Implementation: N/A

Reviewed by: N/A

Background
----------

HLSL defines atomic intrinsics that work on free references to ordinary values such as `int` and `float`. However, this doesn't translate well to Metal and WebGPU,
which define an `atomic<T>` type and only allow atomic operations to be applied to values of `atomic<T>` type.

Slang's Metal backend follows the same technique as SPIRV-Cross and the DXIL->Metal converter, which relies on C++ undefined behavior: it casts an ordinary `int*` pointer to an `atomic<int>*` pointer
and then calls the atomic intrinsic on the reinterpreted pointer. This is fragile and not guaranteed to work in the future.

To make the situation worse, WebGPU bans all possible ways to cast a normal pointer into an `atomic` pointer. To provide a truly portable way to define
atomic operations that can be translated to all targets, we need an `Atomic<T>` type in Slang that maps to `atomic<T>` in WGSL and Metal, and maps to
`T` for HLSL/SPIRV.
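
To make the fragility concrete, here is a minimal sketch in C++ (not Slang, since the cast in question is a C++/Metal-level idiom). The names `fragileAdd`, `portableAdd`, `Counter`, and `demoCounter` are hypothetical and are not taken from Slang, SPIRV-Cross, or any converter. `fragileAdd` shows the reinterpret-cast pattern described above and is deliberately never called; `portableAdd` shows what becomes possible once the storage itself is declared atomic.

```cpp
#include <atomic>
#include <cassert>

// The pattern described above: treat an ordinary int as if it were an atomic.
// The object behind `p` was never created as std::atomic<int>, so this access
// is undefined behavior in C++ even where it happens to work in practice.
// Shown for illustration only; nothing below calls it.
inline int fragileAdd(int* p, int v) {
    return reinterpret_cast<std::atomic<int>*>(p)->fetch_add(v);
}

// The portable alternative: the field is declared atomic up front,
// so no pointer reinterpretation is needed.
struct Counter {
    int ordinaryValue = 0;
    std::atomic<int> atomicValue{0};
};

// Returns the value held before the addition.
inline int portableAdd(Counter& c, int v) {
    return c.atomicValue.fetch_add(v);
}

// Usage: two additions, then read the final value back.
inline int demoCounter() {
    Counter c;
    int before = portableAdd(c, 5);
    assert(before == 0);  // the counter started at zero
    portableAdd(c, 2);
    return c.atomicValue.load();
}
```

The second form is the shape that WGSL and Metal require, which is why the proposal puts the atomic-ness into the type rather than into the call site.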


Proposed Approach
-----------------

We define an `Atomic<T>` type that functions as a wrapper of `T` and provides atomic operations:
```
interface IAtomicable {}
Contributor

This is not a problem to solve with this proposal, but in the long run we really ought to stop using empty interfaces for stuff like this, and then relying on intrinsics that are constrained on the interface.

Ideally, built-in interfaces like this should actually define their requirements explicitly, like:

interface IAtomicable
{
    This atomicLoad( ref Atomic<This> location );
    This atomicExchange( ref Atomic<This> location, This newValue );
    // ...
}

We would then make the Atomic<T> type have entirely concrete method definitions (with hints to ensure they get inlined whenever possible), and the conformances for concrete types like int and uint would then define the required operations explicitly (whether as intrinsics, or using target_switch, etc.).

Such an approach would not only be more "correct," but it also opens the door to having certain types conform in non-intrinsic ways, or even allowing user-defined conformances (e.g., to allow a user-defined type to opt into Atomic<T> support if its in-memory layout is bit-identical to a built-in atomic type).

extension int : IAtomicable {}
extension uint : IAtomicable {}
Collaborator

We may want to add 64-bit integer since SM 6.6 supports it.

extension float : IAtomicable {}
extension half : IAtomicable {}

struct Atomic<T>
Contributor

Why isn't there a constraint of T : IAtomicable here?

Collaborator Author

Added.

{
T load();

Do you intend to support syntactic sugar where you can use the Atomic as an lvalue/rvalue and assign without using load/store? This is pretty common.

Collaborator Author

Yes we will need that.

[ref] void store(T newValue); // Question: do we really need this?

Yes, you surely want this, atomic stores can often be lowered to a much cheaper hardware instruction than whatever you would replace it with (e.g. atomic exchange).

[ref] T exchange(T newValue); // returns old value
[ref] T compareExchange(T compareValue, T newValue); // returns old value.

There are sometimes two forms of compareExchange (strong vs weak). Should say which this is.

Collaborator Author

According to SPIRV spec, the OpAtomicCompareExchangeWeak is deprecated and we only have OpAtomicCompareExchange, so perhaps it is fine to just assume it means weak here?

[ref] T atomicAdd(T value); // returns original value
[ref] T atomicSub(T value); // returns original value
[ref] T atomicMax(T value); // returns original value
[ref] T atomicMin(T value); // returns original value
Contributor

In principle, the only truly universal operations that an atomicable type would need to support are load/store and exchange. Compare-exchange relies on the type being comparable (and I'd need to double-check how comparisons are done for floating-point atomic compare-exchange), and the add/sub/min/max operations rely on the type supporting the relevant mathematical operations.

It seems like we might actually need a hierarchy of atomic-related interfaces, representing specific subsets of the available functionality.

Collaborator Author

Added IAtomicable hierarchy.

}

extension<T> Atomic<T>
where T : IAtomicable
where T : __BuiltinIntegerType
Contributor

I get what this is doing, but I also worry a little bit when I think about how this would translate to my "move the required operations into the interface" approach. Effectively, this is saying that any type that conforms to IAtomicable & __BuiltinIntegerType has additional requirements beyond those that are stated for either interface alone.

I really do think the right answer here will be a hierarchy of interfaces, along the lines of:

interface IAtomicable { ... } // load/store and exchange
interface IAtomicCompareExchangeable : IAtomicable { ... } // compare-exchange
interface IAtomicNumeric : IAtomicCompareExchangeable { ... } // add/sub/min/max
interface IAtomicLogical : IAtomicCompareExchangeable { ... } // and/or/xor
typealias IAtomicInteger = IAtomicNumeric & IAtomicLogical;

(TBD whether add/sub and min/max should be separated from one another)

(Note: I did not make IAtomicLogical inherit from IAtomicNumeric because it is in principle possible to support Atomic<bool>, which would have logical operations but not add/sub/min/max)

It is likely to be rare for programmers to define their own generics that work with Atomic<T> for various T, so putting the burden on them to specify the range of atomic operations they need to be able to perform doesn't seem like too much.

That said, it is clear that my little hierarchy above closely mirrors the kind of hierarchy we need for builtin scalar types, so it is potentially frustrating to have both IWhatever and IAtomicWhatever as distinct interfaces.

Collaborator Author

Used hierarchy of IAtomic to clean this up.

{
[ref] T atomicAnd(T value); // returns original value
[ref] T atomicOr(T value); // returns original value
[ref] T atomicXor(T value); // returns original value
}
Collaborator

I think we can add atomicIncrement and atomicDecrement for integer types.

Contributor

Increment/decrement are interesting in that we can provide them via an extension by just doing an add/sub of 1, but we might in principle want to allow targets to directly generate a specific increment/decrement op if they support one.

```
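
As a rough CPU-side model of the interface listed above, the following C++ sketch wraps `std::atomic<T>` behind the method names used in this proposal. It is an illustration under assumptions, not the Slang implementation: the class `AtomicModel` and the helper `demoAtomicModel` are hypothetical, the sketch covers integer `T` only, and it uses the default sequentially consistent ordering because the proposal does not specify one. It illustrates three points from the discussion: `compareExchange` returning the old value on both success and failure, `atomicMax` built from a compare-exchange loop, and increment/decrement expressed as an add/sub of 1.

```cpp
#include <atomic>
#include <cassert>
#include <type_traits>

// Hypothetical CPU-side model of the proposed Atomic<T>; integer T only.
template <typename T>
class AtomicModel {
    static_assert(std::is_integral<T>::value, "sketch covers integer types only");
    std::atomic<T> value;

public:
    explicit AtomicModel(T initial = T()) : value(initial) {}

    T load() const { return value.load(); }
    void store(T newValue) { value.store(newValue); }

    // Returns the old value.
    T exchange(T newValue) { return value.exchange(newValue); }

    // Strong compare-exchange. Returns the value observed before the operation:
    // equal to compareValue on success, the current value on failure.
    T compareExchange(T compareValue, T newValue) {
        T expected = compareValue;
        value.compare_exchange_strong(expected, newValue);
        return expected;
    }

    T atomicAdd(T v) { return value.fetch_add(v); }  // returns original value
    T atomicSub(T v) { return value.fetch_sub(v); }  // returns original value
    T atomicAnd(T v) { return value.fetch_and(v); }  // returns original value
    T atomicOr(T v)  { return value.fetch_or(v); }   // returns original value
    T atomicXor(T v) { return value.fetch_xor(v); }  // returns original value

    // Max via a compare-exchange loop; a weak CAS is acceptable inside a retry loop.
    T atomicMax(T v) {
        T old = value.load();
        while (old < v && !value.compare_exchange_weak(old, v)) {
            // On failure `old` is refreshed with the latest value; retry.
        }
        return old;  // returns original value
    }

    // Increment/decrement expressed as add/sub of 1.
    T atomicIncrement() { return atomicAdd(T(1)); }
    T atomicDecrement() { return atomicSub(T(1)); }
};

// Usage walk-through on a single thread.
inline void demoAtomicModel() {
    AtomicModel<int> a(3);
    assert(a.compareExchange(3, 10) == 3);   // success: old value returned
    assert(a.compareExchange(3, 99) == 10);  // failure: current value returned
    assert(a.atomicMax(7) == 10);            // 7 is smaller, value stays 10
    assert(a.atomicIncrement() == 10);       // returns value before the add
    assert(a.load() == 11);
}
```

A target with a native max or increment instruction could replace the loop or the add with that instruction; the sketch only shows that the observable result is the same.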

We allow `Atomic<T>` to be defined anywhere: as struct fields, as array elements, as elements of `RWStructuredBuffer` types,
or as groupshared variable types. For example, in global memory:
Contributor

This conspicuously doesn't list local variables, function parameters, etc.

Collaborator Author

@csyonghe csyonghe Sep 19, 2024

Added mention of local/global variable and function parameter here.


```hlsl
struct MyType
{
    int ordinaryValue;
    Atomic<int> atomicValue;
}

RWStructuredBuffer<MyType> atomicBuffer;

void main()
{
    atomicBuffer[0].atomicValue.atomicAdd(1);
    printf("%d", atomicBuffer[0].atomicValue.load());
}
```

In groupshared memory:

```hlsl
void main()
{
    groupshared Atomic<int> c;
    c.atomicAdd(1);
}
```

When generating WGSL code, where `atomic<T>` isn't allowed on local variables or in other address spaces that prohibit it, we will lower the type
into its underlying type. This should be handled by a legalization pass similar to `lowerBufferElementTypeToStorageType`, but operating
in the opposite direction: the "loaded" value from a buffer is converted into an atomic-free type, and storing a value leads to an
atomic store at the corresponding locations.
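
The load/store direction described above can be pictured with a small C++ analogy. This is only a sketch of the idea, not the actual legalization pass: `MyTypeStorage`, `MyTypeValue`, `loadElement`, `storeElement`, and `demoLegalization` are hypothetical names, and `std::atomic<int>` stands in for the target's `atomic<T>`. The buffer keeps the atomic-typed layout, while any value that lives in a local variable uses an atomic-free mirror type.

```cpp
#include <atomic>
#include <cassert>

// Layout as it lives in the buffer: the field keeps its atomic type.
struct MyTypeStorage {
    int ordinaryValue;
    std::atomic<int> atomicValue;
};

// Atomic-free mirror used for local variables and function parameters.
struct MyTypeValue {
    int ordinaryValue;
    int atomicValue;
};

// "Loading" an element converts it to the atomic-free type,
// reading each atomic field with an atomic load.
inline MyTypeValue loadElement(const MyTypeStorage& s) {
    MyTypeValue v;
    v.ordinaryValue = s.ordinaryValue;
    v.atomicValue = s.atomicValue.load();
    return v;
}

// "Storing" a value writes each atomic field with an atomic store.
inline void storeElement(MyTypeStorage& s, const MyTypeValue& v) {
    s.ordinaryValue = v.ordinaryValue;
    s.atomicValue.store(v.atomicValue);
}

// Round trip through a local copy.
inline int demoLegalization() {
    MyTypeStorage element;
    storeElement(element, MyTypeValue{1, 41});
    MyTypeValue local = loadElement(element);
    local.atomicValue += 1;  // plain arithmetic on the local, atomic-free copy
    storeElement(element, local);
    assert(element.ordinaryValue == 1);
    return element.atomicValue.load();
}
```

Note that the round trip as a whole is not atomic; only the individual field loads and stores are, which matches the per-location wording of the paragraph above.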

For non-WGSL/Metal targets, we can simply lower the type out of existence into its underlying type.

Related Work
------------

An `Atomic<T>` type exists in almost all CPU programming languages and is the proven way to express atomic operations across
architectures that have different memory models. WGSL and Metal follow this trend by requiring atomic operations to be expressed
this way. This proposal makes Slang follow the same trend and makes `Atomic<T>` the recommended way to express atomic operations
going forward.