-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make subgroup ops match glsl more closely #32
Conversation
418e290
to
64e2c79
Compare
Is there a way with traits we can flip it? Like adding extension traits: use spirv_std::arch::SubgroupOperations as _;
let float_result = float_val.subgroup_add();
let float_inclusive = float_val.inclusive_subgroup_add();
let float_exclusive = float_val.exclusive_subgroup_add(); or something like: use spirv_std::arch::ops::SubgroupAdd;
let float_result = <f32 as SubgroupAdd>::add(float_val);
let float_inclusive = <f32 as SubgroupAdd>::inclusive_add(float_val);
let float_exclusive = <f32 as SubgroupAdd>::exclusive_add(float_val); Not sure if those are more ergonomic, but I think they can be implemented? |
I'll be honest, I'm not a fan of it ergonomic-wise. Also current atomics are named simiarly, eg. Here's also a small example of shader code using the renamed pub fn subgroup_write_non_uniform(&mut self, t: T) {
let index = unsafe {
let ballot = subgroup_ballot(true);
let count = subgroup_ballot_bit_count(ballot);
let base_index = if subgroup_elect() {
atomic_i_add::<_, { Scope::Device as u32 }, { Semantics::NONE.bits() }>(self.atomic_counter, count)
} else {
0
};
let base_index = subgroup_broadcast_first(base_index);
let inv_index = subgroup_ballot_exclusive_bit_count(ballot);
base_index + inv_index
};
// self.buffer[index] = T;
} |
I agree that is probably better, but bare methods still don't feel very idiomatic to me coming from rust with very little graphics background. Can we do something like a Subgroup struct you can initialize and then call functions on it? I'm thinking how Mutex gives you a guard or how threads can be scoped or not scoped and then joined or how tokio gives you a handle you can later poke at. Subgroups are a stateful "thing", so it feels weird to me to call bare functions to manipulate the thing rather than having the handle to the thing. |
Subgroup intrinsics are not stateful, they are just plain functions that do special stuff. And in theory come with a control barrier across the subgroup, though in practice that barrier just disappears cause invocations within a subgroup should always progress at the same rate (at least on AMD and most other archs, not sure what the new Nvidia cards do). The only state there is is #[derive(Copy, Clone, Default, Eq, PartialEq)]
pub struct SubgroupMask(pub glam::UVec4); |
Subgroups are stateful during a single shader invocation but lose this state once the invocation ends. So for the code running they are stateful, which is what I meant. |
If we have stateful APIs and "Subgroup operations depend on the correct state of invocations... incorrect mask values can result in undefined behavior or synchronization failure" "Ignoring the result of subgroup operations can lead to invalid or non-deterministic outcomes, especially in communication primitives like broadcast." |
I pop'ed off the last commit to make the subgroup functions match the atomic functions more closely ( I don't think it's worth while spending a lot of time on perfecting this API, especially since not a lot of people will be using this extension. I'd rather prioritize getting a nice atomics API going and then adjusting subgroups to match that. And even then I think we got more important things to take care of first. I'd like to get this PR merged in the state it is. But if you have any concrete ideas for a better API, that also doesn't alienate those familiar with the glsl API from the little information out there, feel free to make a new PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can merge this as-is and do some more design in the future.
I wasn't a big fan of how I exposed subgroup ops that require a value from the
GroupOperation
enum. This PR changes all of those functions to be much more like glsl exposes them.These operations:
Got changed to this:
The new function names are almost exactly the same as glsl, except that they have the extra
i
suffix in their name to say that they operate on integers (orf
floats,u
unsigned,s
signed). If anyone got an idea how to unify that, please let me know :D(Note that the asm instruction must be a string literal, so eg. you can't use trait constants to implement it)