-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Export kernel descriptor for amdgpu kernels #135909
Conversation
Could not assign reviewer from: |
rustbot has assigned @petrochenkov. Use |
I wonder if we should require I'm not sold on that tho'. Well, in any case, can you add a comment about that? (or implement it, either one) |
@rustbot label +S-waiting-on-author -S-waiting-on-review |
Requiring Independent of that, this PR needs a test once the target is merged. |
cool. |
blocked on #134740 |
Congratulations! It landed! Your reward is more toil. Alas, the futility of humankind. :^) |
Hehe 😄 I added the test that I mentioned in the PR description (and forgot about before). Diff |
This comment has been minimized.
This comment has been minimized.
This PR modifies cc @jieyouxu |
Added a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is fine modulo a thing
And jieyouxu being happy ofc |
Thanks for the quick review, I fixed both comments (diff) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny test nit, test LGTM otherwise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One test nit re. a bit more context then r=me on the test. You can also flip this PR to mark as ready-for-view via @rustbot ready
.
The host runtime (HIP or HSA) expects a kernel descriptor object for each kernel in the ELF file. The amdgpu LLVM backend generates the object. It is created as a symbol with the name of the kernel plus a `.kd` suffix. Add it to the exported symbols in the linker script, so that it can be found.
@workingjubilee to double-check, are you happy w/ the current iteration of changes? |
Yes. @bors r=jieyouxu,workingjubilee |
Rollup of 5 pull requests Successful merges: - rust-lang#135797 (Import initial generated 1.85 relnotes) - rust-lang#135909 (Export kernel descriptor for amdgpu kernels) - rust-lang#136545 (nvptx64: update default alignment to match LLVM 21) - rust-lang#137092 (abi_unsupported_vector_types: say which type is the problem) - rust-lang#137097 (Ignore Self in bounds check for associated types with Self:Sized) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of rust-lang#135909 - Flakebi:amdgpu-kd, r=jieyouxu,workingjubilee Export kernel descriptor for amdgpu kernels The host runtime (HIP or HSA) expects a kernel descriptor object for each kernel in the ELF file. The amdgpu LLVM backend generates the object. It is created as a symbol with the name of the kernel plus a `.kd` suffix. Add it to the exported symbols in the linker script, so that it can be found. For reference, the symbol is created here in LLVM: https://github.com/llvm/llvm-project/blob/d5457e4c1619e5dbeefd49841e284cbc24d35cb4/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp#L966 I wrote [a test](Flakebi@6a9115b) for this as well, I’ll add that once the target is merged and working. With this, all PRs to get working code for amdgpu are open (this + the target + the two patches adding addrspacecasts for alloca and global variables). Tracking issue: rust-lang#135024 r? `@workingjubilee`
The host runtime (HIP or HSA) expects a kernel descriptor object for each kernel in the ELF file. The amdgpu LLVM backend generates the object. It is created as a symbol with the name of the kernel plus a
.kd
suffix.Add it to the exported symbols in the linker script, so that it can be found.
For reference, the symbol is created here in LLVM: https://github.com/llvm/llvm-project/blob/d5457e4c1619e5dbeefd49841e284cbc24d35cb4/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp#L966
I wrote a test for this as well, I’ll add that once the target is merged and working.
With this, all PRs to get working code for amdgpu are open (this + the target + the two patches adding addrspacecasts for alloca and global variables).
Tracking issue: #135024
r? @workingjubilee