Remove unsafe from unconditionally safe subgroup ops #306

Open: wants to merge 12 commits into base: main

@liamwhite liamwhite commented Jun 29, 2025

Fixes #305. Here is what I decided to allow, and the justification for each.

| Change | Justification |
| --- | --- |
| OpControlBarrier | Spec says: "When Execution is Workgroup or larger, behavior is undefined [...]" Execution is Subgroup. |
| OpMemoryBarrier | Memory barriers are always defined. |
| OpGroupNonUniformElect | Subgroup election is always defined. |
| OpGroupNonUniformAll | Subgroup predicate evaluation is always defined. |
| OpGroupNonUniformAny | Subgroup predicate evaluation is always defined. |
| OpGroupNonUniformAllEqual | Subgroup predicate evaluation is always defined. |
| OpGroupNonUniformBroadcastFirst | Subgroup broadcast first is always defined. |
| OpGroupNonUniformBallot | Subgroup ballot is always defined. |
| OpGroupNonUniformBallotBitExtract | Subgroup bit extract is safe. The result of subgroup bit extract is undefined for Index exceeding scope size. |
| OpGroupNonUniformBallotBitCount | Subgroup ballot bit count is always defined. |
| OpGroupNonUniformBallotFindLSB | Subgroup ballot find-LSB is safe. The result of subgroup ballot find-LSB is undefined if no considered bits are set to 1. |
| OpGroupNonUniformBallotFindMSB | Subgroup ballot find-MSB is safe. The result of subgroup ballot find-MSB is undefined if no considered bits are set to 1. |
| OpGroupNonUniformShuffle | Subgroup shuffle is safe. The result of subgroup shuffle is undefined for inactive or exceeding scope size Id. |
| OpGroupNonUniformShuffleXor | Subgroup shuffle XOR is safe. The result of subgroup shuffle XOR is undefined for inactive or exceeding scope size invocation id XOR mask. |
| OpGroupNonUniformShuffleUp | Subgroup shuffle up is safe. The result of subgroup shuffle up is undefined for Delta greater than invocation id. |
| OpGroupNonUniformShuffleDown | Subgroup shuffle down is safe. The result of subgroup shuffle down is undefined for Delta greater than or equal to size of subgroup. |
| OpGroupNonUniformIAdd | Non-clustered subgroup arithmetic is always defined. |
| OpGroupNonUniformFAdd | Non-clustered subgroup arithmetic is always defined. |
| OpGroupNonUniformIMul | Non-clustered subgroup arithmetic is always defined. |
| OpGroupNonUniformFMul | Non-clustered subgroup arithmetic is always defined. |
| OpGroupNonUniformSMin | Non-clustered subgroup arithmetic is always defined. |
| OpGroupNonUniformUMin | Non-clustered subgroup arithmetic is always defined. |
| OpGroupNonUniformFMin | Non-clustered subgroup arithmetic is always defined. |
| OpGroupNonUniformSMax | Non-clustered subgroup arithmetic is always defined. |
| OpGroupNonUniformUMax | Non-clustered subgroup arithmetic is always defined. |
| OpGroupNonUniformFMax | Non-clustered subgroup arithmetic is always defined. |
| OpGroupNonUniformBitwiseAnd | Non-clustered subgroup bit operations are always defined. |
| OpGroupNonUniformBitwiseOr | Non-clustered subgroup bit operations are always defined. |
| OpGroupNonUniformBitwiseXor | Non-clustered subgroup bit operations are always defined. |
| OpGroupNonUniformLogicalAnd | Non-clustered subgroup boolean operations are always defined. |
| OpGroupNonUniformLogicalOr | Non-clustered subgroup boolean operations are always defined. |
| OpGroupNonUniformLogicalXor | Non-clustered subgroup boolean operations are always defined. |
| OpGroupNonUniformQuadSwap | Subgroup quad swap horizontal/vertical/diagonal operations are safe. The result of subgroup quad swap is undefined if an active invocation reads from an inactive invocation. |

@Firestar99
Member

Since I copied over the SPIR-V spec text for each function, it's actually surprisingly simple to just Ctrl+F for "undefined" in the file itself. Then you just need to differentiate between undefined behavior and an undefined result.

I do agree that undefined result should be considered safe.

I'd love to see a # Safety section to each unsafe function remaining, summarizing the unsafety, since it's sort of all over the place in the docs. Don't mind if I just add that myself, as I review your changes.
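
To illustrate the distinction between undefined behavior and an undefined result, here is a minimal CPU-side sketch. It is not the rust-gpu API: `EmulatedSubgroup` and its `shuffle` method are hypothetical names invented for this example. The point is only that an operation whose result may be unspecified can still be exposed as a safe function, as long as it always returns some valid value and nothing else goes wrong.

```rust
/// Hypothetical CPU model of a subgroup: one value and one "active" flag
/// per invocation. Not part of rust-gpu.
struct EmulatedSubgroup {
    values: Vec<u32>,
    active: Vec<bool>,
}

impl EmulatedSubgroup {
    /// Models a shuffle: read the value held by invocation `id`.
    ///
    /// If `id` is out of range or names an inactive invocation, the *result*
    /// is unspecified (this model happens to pick 0), but there is no
    /// out-of-bounds read and no invalid value, so the function stays safe.
    fn shuffle(&self, id: usize) -> u32 {
        match (self.values.get(id).copied(), self.active.get(id).copied()) {
            (Some(value), Some(true)) => value,
            // Any valid `u32` would be an acceptable answer here.
            _ => 0,
        }
    }
}
```

A caller that passes a bad `id` gets a meaningless number, which is a logic bug rather than a soundness problem. By contrast, an operation with undefined behavior gives no such guarantee, which is why it has to stay `unsafe`.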

Member

@Firestar99 Firestar99 left a comment

I cleaned up the docs a bit and added a safety section everywhere except cluster ops. I want to try to make them safe in a followup PR with some const assertions. Added it here anyway

Also discovered that subgroup_quad_broadcast is also safe, just the result may be undefined.

Ready to merge from my side

@Firestar99
Member

I decided to add the cluster ops safety into this PR. They need to remain unsafe due to one precondition that we cannot statically verify, as group size is defined by the device:

```rust
const {
    assert!(CLUSTER_SIZE >= 1, "`ClusterSize` must be at least 1");
    assert!(
        CLUSTER_SIZE.is_power_of_two(),
        "`ClusterSize` must be a power of 2"
    );
    // Cannot be verified with static assertions:
    // `ClusterSize` must not be greater than the size of the group
}
```
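
For readers who want to try the const-assertion pattern outside a shader crate, the following is a standalone sketch. It uses an associated constant instead of an inline `const { }` block so that it also builds on older toolchains. `ClusterSizeCheck`, `validated_cluster_size`, and `fits_in_subgroup` are hypothetical names for illustration only, and the last function simply stands in for the one check that has to happen once the device's subgroup size is known.

```rust
/// Hypothetical carrier type for the compile-time checks on one `CLUSTER_SIZE`.
struct ClusterSizeCheck<const CLUSTER_SIZE: u32>;

impl<const CLUSTER_SIZE: u32> ClusterSizeCheck<CLUSTER_SIZE> {
    /// Evaluated at compile time for every `CLUSTER_SIZE` that is actually used.
    const VALID: () = {
        assert!(CLUSTER_SIZE >= 1, "`ClusterSize` must be at least 1");
        assert!(
            CLUSTER_SIZE.is_power_of_two(),
            "`ClusterSize` must be a power of 2"
        );
    };
}

/// Returns `CLUSTER_SIZE` after forcing the static checks above.
fn validated_cluster_size<const CLUSTER_SIZE: u32>() -> u32 {
    // Referencing the constant forces its evaluation during monomorphization.
    let () = ClusterSizeCheck::<CLUSTER_SIZE>::VALID;
    CLUSTER_SIZE
}

/// The remaining precondition: it depends on a value only the device knows,
/// so it can only be checked at run time (here `subgroup_size` is a plain
/// argument standing in for the device-reported size).
fn fits_in_subgroup<const CLUSTER_SIZE: u32>(subgroup_size: u32) -> bool {
    validated_cluster_size::<CLUSTER_SIZE>() <= subgroup_size
}
```

With this shape, `validated_cluster_size::<3>()` fails to compile because 3 is not a power of two, while `fits_in_subgroup::<64>(32)` compiles and returns `false`.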

@Firestar99
Member

I'd love it if someone else from the team had a quick read over this one. @LegNeato?

@liamwhite
Author

liamwhite commented Jun 30, 2025

@Firestar99 it should be pretty simple to generate safe wrappers from the same macro if you think it's worth including. (checked_subgroup_clustered_i_add -> Option<I>)

Edit: Scratch the idea for safe wrappers. This can definitely be enforced in a way that makes it unconditionally safe, because any cluster size used by the SPIR-V module has to be known at compile time (it's a const generic) alongside the enabling capabilities (GroupNonUniform*). The module can't be loaded if the enabling capabilities aren't supported by the target device. This could be extended to subgroup size constraints too, but I don't know if the infrastructure is there for that.

@Firestar99
Member

First, I'd merge this as is and move any further enhancements to a new PR.

> ClusterSize must not be greater than the size of the group

I'm reading "size of the group" to be equivalent to the subgroupSize property of the device, which is obviously device dependent. So there's no way you can check this at compile time. (There are devices with varying subgroup sizes, so Vulkan 1.3 exposes a min and a max.)

> This could be extended to subgroup size constraints too

I don't know if you'd want to have a fixed subgroup size defined in the shader. That said, I'm not experienced with how people use clustered ops, and I haven't found a good use case for them yet.

@liamwhite
Author

liamwhite commented Jun 30, 2025

> First, I'd merge this as is and move any further enhancements to a new PR.

Agreed

> I don't know if you'd want to have a fixed subgroup size defined in the shader.

It's not that it'd be constant, it'd be "module requires subgroupSize >= X"
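
As a rough host-side sketch of the "module requires subgroupSize >= X" idea: the module's requirement could be derived from the largest cluster size it uses, and compared against the smallest subgroup size the device may pick. Everything here is hypothetical (`ModuleRequirements`, `DeviceSubgroupSizes`, `module_requirements`, `can_load`), no such infrastructure is claimed to exist in rust-gpu, and the sketch assumes the device never runs the shader with a subgroup smaller than the minimum it reports.

```rust
/// Hypothetical host-side record of what a compiled shader module needs.
struct ModuleRequirements {
    /// Smallest subgroup size under which every clustered op is valid.
    min_subgroup_size: u32,
}

/// Hypothetical mirror of the device's reported minimum subgroup size
/// (for example the `minSubgroupSize` limit from Vulkan 1.3).
struct DeviceSubgroupSizes {
    min: u32,
}

/// Derives the requirement from every cluster size the module uses.
/// A module without clustered ops only needs a subgroup of size 1.
fn module_requirements(cluster_sizes: &[u32]) -> ModuleRequirements {
    ModuleRequirements {
        min_subgroup_size: cluster_sizes.iter().copied().max().unwrap_or(1),
    }
}

/// The module is only loadable if even the smallest subgroup the device
/// may use is large enough for the largest cluster.
fn can_load(module: &ModuleRequirements, device: &DeviceSubgroupSizes) -> bool {
    device.min >= module.min_subgroup_size
}
```

Rejecting the module at load time would move the unverifiable precondition out of each call site and into one host-side check.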

@charles-r-earp

https://registry.khronos.org/vulkan/specs/latest/man/html/SubgroupSize.html

The SubgroupSize builtin is required to match the subgroupSize property (on the host), until 1.6 or with additional extensions / options. Intel and Apple can use smaller subgroups than reported (both in the shader and on the host). Intel supports subgroup_size_control, so it can be set to a known value.

The number of active threads within a subgroup can be queried by a subgroup add op. While sound, this isn't zero cost.

I would probably leave clustered ops as unsafe unless there's motivation.

It is difficult to use subgroup ops without a known size (constant or spec constant), just like many algorithms may require a specific workgroup size.
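
The "query the active count with a subgroup add" idea mentioned above can be modeled on the CPU as follows. This is a sketch under assumptions, not shader code: the slice of flags stands in for the invocations of one subgroup, and `active_invocation_count` and `checked_cluster_size` are hypothetical names. It treats the active count as a conservative bound for the cluster size, which is one possible reading of the precondition.

```rust
/// CPU model of "every active invocation contributes 1 to a subgroup add".
fn active_invocation_count(active: &[bool]) -> u32 {
    active.iter().map(|&is_active| u32::from(is_active)).sum()
}

/// Hypothetical run-time check: returns the cluster size only if it does not
/// exceed the number of active invocations, otherwise `None`.
fn checked_cluster_size(cluster_size: u32, active: &[bool]) -> Option<u32> {
    if cluster_size <= active_invocation_count(active) {
        Some(cluster_size)
    } else {
        None
    }
}
```

On real hardware the count would come from an extra subgroup reduction per call, which is the non-zero cost referred to above.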

Successfully merging this pull request may close these issues.

All subgroup operations are unsafe
3 participants