Skip to content

Conversation

@nnethercote
Copy link
Collaborator

NvvmArch:all_target_features has some bugs. This PR fixes them. Details in the individual commits.

- Improve a few comments.
- Combine three `nvvm_arch_target_feature_format_*` tests into one.
- Avoid some unnecessary local variables.
- Rename `nvvm_arch_all_target_features_includes_lower_capabilities`,
  and order things more sensibly within it.
- Remove
  `target_feature_synthesis_supports_conditional_compilation_patterns`
  and `target_feature_synthesis_enables_correct_cfg_patterns` and
  `nvvm_arch_a_suffix_includes_all_available_instructions`. They are all
  testing the same things as `nvvm_arch_all_target_features`. (A couple
  of things from them were moved into `nvvm_arch_all_target_features`,
  to preserve test coverage.)

This all just makes things more streamlined and avoids repetition, to
get ready for subsequent changes.
Currently some `all_target_features` results are compared against a
`Vec` literal, while some are only tested with a smattering of
`contains`/`!contains` calls.

This commit changes the latter one to use `Vec` literals. This is more
thorough, easier to read, and demonstrates two existing bugs with
`all_target_features`:
- The values for the 'f' suffix ones are badly wrong.
- The sort order is lexicographic, which puts `compute_100` before
  `compute_90`.

The next commit will fix these bugs.
It now does a single filter pass over the enum variants, which is
simpler and fixes the sorting issue and the incorrect 'f' suffix
results.

I removed some comments in the `nvvm_arch_all_target_features` test,
because they were low-value. There are now better comments within
`all_target_features` that explain what's happening.

I also remove the comment about PTX forward-compatibility. It was
correct but confusing. This function answers the question "what features
are available if I'm targeting a particular NvvmArch?" (backwards
compatibility). That comment explained "what GPU CCs will this run on?"
(forward compatibility).

Also update the relevant section in the guide, where the 'f' details
were incorrect. And make the terminology more consistent.
Several of them sort or inspect their vectors unnecessarily. We can just
do `Vec` literal equality tests.
It's standard to have `use super::*;` in a test module, and this lets us
remove a bunch of existing `use` items.
@nnethercote
Copy link
Collaborator Author

Like in #305, "no space left on device" for the x86-64/Linux runs. Weird.

@nnethercote nnethercote mentioned this pull request Nov 18, 2025
@nnethercote nnethercote requested a review from LegNeato November 19, 2025 02:24
Copy link
Contributor

@LegNeato LegNeato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think you convinced me this is correct.

@LegNeato LegNeato merged commit 1eca961 into Rust-GPU:main Nov 19, 2025
5 of 8 checks passed
@nnethercote nnethercote deleted the NvvmArch-fixes branch November 19, 2025 08:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants