Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WGSL: Intrinsics: atomics #5084

Closed
aleino-nv opened this issue Sep 17, 2024 · 4 comments · Fixed by #5125
Closed

WGSL: Intrinsics: atomics #5084

aleino-nv opened this issue Sep 17, 2024 · 4 comments · Fixed by #5125
Assignees
Labels
goal:client exploration Speculative work for a potential use case.

Comments

@aleino-nv
Copy link
Collaborator

3 functions

@aleino-nv aleino-nv added kind:enhancement a desirable new feature, option, or behavior goal:forward looking Feature needed at a later date, not connected to a specific use case. labels Sep 17, 2024
@aleino-nv aleino-nv self-assigned this Sep 17, 2024
@jkwak-work
Copy link
Collaborator

This task is related to functions listed in the WGSL spec document,
https://www.w3.org/TR/WGSL/#atomic-builtin-functions

@jkwak-work
Copy link
Collaborator

Atomic operations for the global memory variables can be implemented with a workaround that aliases the memory address.

struct VertexOutput {
    @builtin(position) position: vec4<f32>,
};

@group(0) @binding(0)
var<storage> aa  : array<i32,12>;

@group(0) @binding(0)
var<storage, read_write> aa2  : array<atomic<i32>,12>;

@vertex
fn vertex() -> VertexOutput {
	
    atomicAdd(&(aa2[1]),1);
    return VertexOutput(vec4<f32>(1.0));
}

However, the variables in workgroup scope wouldn't be supported as a known limitation, because there is no way for us to translate it from Slang to WGSL due to the strict limitations on WGSL side.

@csyonghe
Copy link
Collaborator

My thinking is that 1) for global memory atomics originating in RWByteAddressBuffer or RWStructuredBuffer<T>, we should create an aliased resource that is RWStructuredBuffer<Atomic<i32/u32>> but at the same group and binding as the original buffer.
This means that in order to represent such a buffer, you need to add IRAtomicType to our IR. (edited)

Then we derive the offset into the atomic array from the index and offset into the original buffer, and write into that atomic element.

This means that the ByteAddressBuffer::InterlockedXXX() or the global InterlockedXXX() function should be calling __wgslAtomicXXX() function for wgsl.

And the __wgslAtomicXXX function should map to kIROp_WGSLAtomicOperation(opName, dest, operands...) slang IR inst.

then we have a pass that identifies all kIROp_WGSLAtomicOperation insts, look at the dest operand, follow any access chain operations on the dest all the way until we see RWStructuredBufferGetElementPtr to know which buffer the address is coming from (byteaddress buffers should be lowered to structured buffers at this point so we don't need to handle them separately).

then we create an aliased buffer for that buffer.

then we compute the offset from the accesschain operations.

and then we modify the WGSLAtomicOperation inst to modify into the corresponding location in the aliased atomic buffer instead.

along the way you will need to use getStd430Offset/getStd430SizeAndAlignment functions to figure out the offset for a struct field.

this should work for global memory atomics. For groupshared atomics, the current decision is to only support atomics on a int/int[]/int[]...[]

If we see kIROp_AtomicOperation on a int or int array in groupshared memory, we convert the array into atomic<i32> or atomic<i32> array.

If dest is not originating from such a source, then we should issue a code-gen time error saying we can't support that general case.

the question then is what if the user passes the array as argument to another function that expects an ordinary array.

there isn't an easy way around other than just make a copy of the callee function and modify the array type into an atomic array there too. There is also the opposite question that is if a callee function with a groupshared array parameter is using atomics on the array, the callee should be converted into taking a atomic array parameter, and all its callsites must be modified to provide that atomic array. if the array argument at callsites can't be modified, it would be an error. This process will be recursive to handle nested calls.

if the entire array is being stored into another array of ordinary ints, we need to make sure that is handled properly as well.

so that's my thinking and I think this should give us decent coverage on atomics.

I suggest we work on global memory atomics first, and worry about groupshared atomics later. Maybe we decide all this automatic rewrites on groupshared arrays is too messy to worth it and we given in and introduce the atomic<T> type in our standard library and deal with all the type system consequences there.

@jkwak-work jkwak-work assigned jkwak-work and unassigned aleino-nv Sep 17, 2024
@bmillsNV bmillsNV assigned csyonghe and unassigned jkwak-work Sep 17, 2024
@csyonghe
Copy link
Collaborator

Update: after discussion with @tangent-vector , the latest decision is to introduce Atomic<t> type, implement it for all backends and for both global and groupshared memory, and start to advertise this as the recommended way to do atomics in Slang. This is the only way to reliably support atomics in all platforms without relying on UB, and to allow separate compilation of modules.

@jkwak-work jkwak-work added goal:client exploration Speculative work for a potential use case. and removed kind:enhancement a desirable new feature, option, or behavior goal:forward looking Feature needed at a later date, not connected to a specific use case. labels Sep 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
goal:client exploration Speculative work for a potential use case.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants