Add design doc for atomic<T> type. #5101
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,104 @@ | ||
SP #003 - `Atomic<T>` type | ||
============== | ||
|
||
|
||
Status | ||
------ | ||
|
||
Author: Yong He | ||
|
||
Status: Design Discussion. | ||
|
||
Implementation: N/A | ||
|
||
Reviewed by: N/A | ||
|
||
Background | ||
---------- | ||
|
||
HLSL defines its atomic intrinsics to operate on free references to ordinary values such as `int` and `float`. However, this does not translate well to Metal and WebGPU, | ||
which define an `atomic<T>` type and only allow atomic operations to be applied to values of `atomic<T>` type. | ||
|
||
Slang's Metal backend follows the same technique used by SPIRV-Cross and the DXIL->Metal converter, which relies on C++ undefined behavior: it casts an ordinary `int*` pointer to an `atomic<int>*` pointer | ||
and then calls the atomic intrinsic on the reinterpreted pointer. This is fragile and not guaranteed to keep working in the future. | ||
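
To make the contrast concrete, here is a small C++ sketch, offered as an analogy rather than as Slang or Metal code. The commented-out lines show the kind of reinterpreting cast described above. The live code instead declares the field as `std::atomic<int>` from the start, which is the shape an `atomic<T>` type enforces. The names `Counters` and `bumpHits` are hypothetical and exist only for this illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical record mixing an ordinary field and an atomic field.
struct Counters
{
    int plainValue = 0;        // ordinary storage, no atomic access intended
    std::atomic<int> hits{0};  // atomic by declaration
};

// Fragile pattern (deliberately left as a comment, not compiled):
//   auto* p = reinterpret_cast<std::atomic<int>*>(&c.plainValue);
//   p->fetch_add(1);  // plainValue is not an atomic object, so this is not guaranteed to work

// Well-defined pattern: the operation is applied to a value whose type is atomic.
inline int bumpHits(Counters& c)
{
    // fetch_add returns the value held before the addition.
    return c.hits.fetch_add(1);
}
```

With a type-level distinction like this, the compiler can reject atomic operations on `plainValue` instead of relying on the cast happening to work.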
|
||
To make matters worse, WebGPU bans all possible ways to cast a normal pointer into an `atomic` pointer. To provide a truly portable way to define | ||
atomic operations that can be translated to all targets, Slang also needs an `Atomic<T>` type that maps to `atomic<T>` in WGSL and Metal, and to | ||
`T` for HLSL/SPIRV. | ||
|
||
|
||
Proposed Approach | ||
----------------- | ||
|
||
We define an `Atomic<T>` type that functions as a wrapper of `T` and provides atomic operations: | ||
``` | ||
interface IAtomicable {} | ||
extension int : IAtomicable {} | ||
extension uint : IAtomicable {} | ||
Review comment: We may want to add 64-bit integer since SM 6.6 supports it. |
||
extension float : IAtomicable {} | ||
extension half : IAtomicable {} | ||
|
||
struct Atomic<T> | ||
Review comment: Why isn't there a constraint of
Reply: Added. |
||
{ | ||
T load(); | ||
Review comment: Do you intend to support syntactic sugar where you can use the Atomic as an lvalue/rvalue and assign without using load/store? This is pretty common.
Reply: Yes we will need that. |
||
[ref] void store(T newValue); // Question: do we really need this? | ||
Review comment: Yes, you surely want this, atomic stores can often be lowered to a much cheaper hardware instruction than whatever you would replace it with (e.g. atomic exchange). |
||
[ref] T exchange(T newValue); // returns old value | ||
[ref] T compareExchange(T compareValue, T newValue); // returns old value. | ||
Review comment: There are sometimes two forms of compareExchange (strong vs weak). Should say which this is.
Reply: According to SPIRV spec, the OpAtomicCompareExchangeWeak is deprecated and we only have OpAtomicCompareExchange, so perhaps it is fine to just assume it means weak here? |
||
[ref] T atomicAdd(T value); // returns original value | ||
[ref] T atomicSub(T value); // returns original value | ||
[ref] T atomicMax(T value); // returns original value | ||
[ref] T atomicMin(T value); // returns original value | ||
Review comment: In principle, the only truly universal operations that an atomicable type would need to support are load/store and exchange. Compare-exchange relies on the type being comparable (and I'd need to double-check how comparisons are done for floating-point atomic compare-exchange), and the add/sub/min/max operations rely on the type supporting the relevant mathematical operations. It seems like we might actually need a hierarchy of atomic-related interfaces, representing specific subsets of the available functionality.
Reply: Added IAtomicable hierarchy. |
||
} | ||
|
||
extension<T> Atomic<T> | ||
where T : IAtomicable | ||
where T : __BuiltinIntegerType | ||
Review comment: I get what this is doing, but I also worry a little bit, when I think about how this would translate to my "move the required operations into the interface" approach. Effectively, this is saying that any type that conforms I really do think the right answer here will be a hierarchy of interfaces, along the lines of:
(TBD whether add/sub and min/max should be separated from one another) (Note: I did not make It is likely to be rare for programmers to define their own generics that work with That said, it is clear that my little hierarchy above closely mirrors the kind of hierarchy we need for builtin scalar types, so it is potentially frustrating to have both
Reply: Used hierarchy of IAtomic to clean this up. |
||
{ | ||
[ref] T atomicAnd(T value); // returns original value | ||
[ref] T atomicOr(T value); // returns original value | ||
[ref] T atomicXor(T value); // returns original value | ||
} | ||
Review comment: I think we can add atomicIncrement and atomicDecrement for integer types.
Reply: Increment/decrement are interesting in that we can provide them via an |
||
``` | ||
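
The return-value conventions in the listing above ("returns old value") can be illustrated with a C++ analogy built on `std::atomic`. This is only a sketch under stated assumptions: `AtomicSketch` is a hypothetical stand-in, not Slang's implementation, and it uses the strong form of compare-exchange and the default sequentially consistent ordering, neither of which the proposal has settled.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical C++ stand-in for the proposed wrapper (illustration only).
template <typename T>
struct AtomicSketch
{
    std::atomic<T> storage;

    explicit AtomicSketch(T initial) : storage(initial) {}

    T load() const { return storage.load(); }
    void store(T newValue) { storage.store(newValue); }

    // Returns the value held before the exchange.
    T exchange(T newValue) { return storage.exchange(newValue); }

    // Returns the value observed before the operation; the write happens
    // only when that value equals compareValue.
    T compareExchange(T compareValue, T newValue)
    {
        T expected = compareValue;
        // On failure, compare_exchange_strong overwrites `expected` with the
        // observed value; on success, `expected` already equals the old value.
        storage.compare_exchange_strong(expected, newValue);
        return expected;
    }

    T atomicAdd(T value) { return storage.fetch_add(value); }
    T atomicSub(T value) { return storage.fetch_sub(value); }

    // std::atomic has no fetch_max before C++26, so this uses a
    // compare-exchange loop. Returns the original value.
    T atomicMax(T value)
    {
        T observed = storage.load();
        while (observed < value &&
               !storage.compare_exchange_weak(observed, value))
        {
            // `observed` was refreshed by the failed exchange; retry.
        }
        return observed;
    }
};
```

For example, `AtomicSketch<int> a(5); a.atomicAdd(3);` returns 5 and leaves 8 stored, matching the "returns original value" comments in the proposal.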
|
||
We allow `Atomic<T>` to be defined anywhere: as struct fields, as array elements, as elements of `RWStructuredBuffer` types, | ||
or as groupshared variable types. For example, in global memory: | ||
Review comment: This conspicuously doesn't list local variables, function parameters, etc.
Reply: Added mention of local/global variable and function parameter here. |
||
|
||
```hlsl | ||
struct MyType | ||
{ | ||
int ordinaryValue; | ||
Atomic<int> atomicValue; | ||
} | ||
|
||
RWStructuredBuffer<MyType> atomicBuffer; | ||
|
||
void main() | ||
{ | ||
atomicBuffer[0].atomicValue.atomicAdd(1); | ||
printf("%d", atomicBuffer[0].atomicValue.load()); | ||
} | ||
``` | ||
|
||
In groupshared memory: | ||
|
||
```hlsl | ||
void main() | ||
{ | ||
groupshared Atomic<int> c; | ||
c.atomicAdd(1); | ||
} | ||
``` | ||
|
||
When generating WGSL code, where `atomic<T>` isn't allowed on local variables or in other unsupported address spaces, we will lower the type | ||
into its underlying type. This should be handled by a legalization pass similar to `lowerBufferElementTypeToStorageType`, but operating | ||
in the opposite direction: the "loaded" value from a buffer is converted into an atomic-free type, and storing a value leads to an | ||
atomic store at the corresponding locations. | ||
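
One way to picture this legalization, as a C++ analogy only: the buffer-resident type keeps its atomic field, while a separate atomic-free mirror type is used for local copies. Loading converts to the mirror with an atomic load, and storing writes back with an atomic store. `MyTypeInBuffer`, `MyTypeLocal`, `loadElement`, and `storeElement` are hypothetical names chosen for this sketch; the actual pass would operate on Slang IR rather than on source-level structs.

```cpp
#include <atomic>
#include <cassert>

// Layout as it lives in the buffer: the atomic field stays atomic.
struct MyTypeInBuffer
{
    int ordinaryValue = 0;
    std::atomic<int> atomicValue{0};
};

// Atomic-free mirror used for local variables and temporaries.
struct MyTypeLocal
{
    int ordinaryValue = 0;
    int atomicValue = 0;
};

// "Load" direction: the atomic field is read with an atomic load and the
// result is held in the atomic-free mirror.
inline MyTypeLocal loadElement(const MyTypeInBuffer& element)
{
    MyTypeLocal local;
    local.ordinaryValue = element.ordinaryValue;
    local.atomicValue = element.atomicValue.load();
    return local;
}

// "Store" direction: writing the mirror back performs an atomic store on
// the corresponding atomic field.
inline void storeElement(MyTypeInBuffer& element, const MyTypeLocal& local)
{
    element.ordinaryValue = local.ordinaryValue;
    element.atomicValue.store(local.atomicValue);
}
```

Note that in this sketch only the individual field accesses are atomic; copying the whole struct is not a single atomic operation.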
|
||
For non-WGSL/Metal targets, we can simply lower the type out of existence into its underlying type. | ||
|
||
Related Work | ||
------------ | ||
|
||
An `Atomic<T>`-style type exists in many CPU programming languages (for example, `std::atomic<T>` in C++) and is a proven way to express atomic operations across | ||
architectures that have different memory models. WGSL and Metal follow this trend by requiring atomic operations to be expressed | ||
this way. This proposal makes Slang follow the same trend and makes `Atomic<T>` the recommended way to express atomic operations | ||
going forward. |
This is not a problem to solve with this proposal, but in the long run we really ought to stop using empty interfaces for stuff like this, and then relying on intrinsics that are constrained on the interface.
Ideally, built-in interfaces like this should actually define their requirements explicitly, like:
We would then make the `Atomic<T>` type have entirely concrete method definitions (with hints to ensure they get inlined whenever possible), and the conformances for concrete types like `int` and `uint` would then define the required operations explicitly (whether as intrinsics, or using `target_switch`, etc.).
Such an approach would not only be more "correct," but it also opens the door to having certain types conform in non-intrinsic ways, or even allowing user-defined conformances (e.g., to allow a user-defined type to opt into `Atomic<T>` support if its in-memory layout is bit-identical to a built-in atomic type).
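
The idea in this comment can be sketched in C++ with a traits struct playing the role of an interface that states its requirements explicitly. Everything here is hypothetical (`AtomicOps`, `ExplicitAtomic`, `Handle`) and is not the reviewer's elided code or Slang's design. It only illustrates the shape: the wrapper has concrete methods, and each conforming type supplies the operations it supports.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical "interface": no generic definition, so a type must opt in
// by providing a specialization with the operations it supports.
template <typename T>
struct AtomicOps;

// Conformance for a built-in type, defined explicitly.
template <>
struct AtomicOps<int>
{
    using Storage = std::atomic<int>;
    static int load(const Storage& s) { return s.load(); }
    static void store(Storage& s, int v) { s.store(v); }
    static int add(Storage& s, int v) { return s.fetch_add(v); }
};

// Hypothetical user-defined type whose layout matches a built-in type.
struct Handle
{
    unsigned int bits;
};

// User-defined conformance: load/store only, no arithmetic.
template <>
struct AtomicOps<Handle>
{
    using Storage = std::atomic<unsigned int>;
    static Handle load(const Storage& s) { return Handle{s.load()}; }
    static void store(Storage& s, Handle v) { s.store(v.bits); }
};

// The wrapper has concrete method bodies that forward to the conformance.
template <typename T>
struct ExplicitAtomic
{
    typename AtomicOps<T>::Storage storage{};

    T load() const { return AtomicOps<T>::load(storage); }
    void store(T v) { AtomicOps<T>::store(storage, v); }

    // Only compiles when called for a T whose conformance provides `add`.
    T atomicAdd(T v) { return AtomicOps<T>::add(storage, v); }
};
```

In this sketch, `ExplicitAtomic<Handle>` supports `load` and `store`, while calling `atomicAdd` on it would fail to compile, which loosely mirrors the hierarchy of capabilities discussed in the review.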