enable Gpu address spaces #10884
Tagging @Vexu, who reviewed the previous PR (I'm not able to assign reviewers; I'm not sure whether that's intended).
Looks good to me
One thing you might also want to implement is the logic for obtaining the default address space in a given context for this architecture: Lines 598 to 614 in f506810
Currently this just returns the generic address space, but since this is not allowed for SPIR-V, the idea was to assign a default address space for each context. This would, for example, set const x = y; at global scope to constant, and function-local variables to local. Do you happen to know if there's a big difference between the performance of different address spaces? From the LLVM NVPTX docs it looks like the default address space is generic, and I assume this is what CUDA also uses, but I'm not very sure about that. If CUDA instead makes function-locals local (address space 5), it might be best to replicate that in Zig.
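For illustration, the per-context default described here could be sketched roughly as follows. Every name in it (AddrSpace, Context, defaultAddressSpace, is_gpu) is hypothetical and does not correspond to the compiler's actual types or API. The mapping of global mutable variables to global is an assumption that goes beyond what the comment states.

```zig
const std = @import("std");

// Hypothetical stand-ins for illustration only; not the compiler's real types.
const AddrSpace = enum { generic, global, constant, shared, local };
const Context = enum { function, global_const, global_var, local_var, pointer };

// Pick a default address space for a declaration context.
// Non-GPU targets always get the generic address space.
fn defaultAddressSpace(is_gpu: bool, ctx: Context) AddrSpace {
    if (!is_gpu) return .generic;
    return switch (ctx) {
        .global_const => .constant, // `const x = y;` at global scope
        .global_var => .global, // assumption: mutable globals live in global memory
        .local_var => .local, // function-local variables
        .function, .pointer => .generic,
    };
}

test "default address space per context" {
    try std.testing.expectEqual(AddrSpace.constant, defaultAddressSpace(true, .global_const));
    try std.testing.expectEqual(AddrSpace.local, defaultAddressSpace(true, .local_var));
    try std.testing.expectEqual(AddrSpace.generic, defaultAddressSpace(false, .local_var));
}
```

The sketch only models the lookup itself. It does not show how the result would be used during semantic analysis.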
So when you dereference a generic pointer, the GPU runtime needs to first identify what kind of pointer it is before dereferencing. The slowest memories are local and global, then constant memory is a bit faster, then shared memory is significantly faster, and finally registers are the fastest.
I've tried this based on your recommendation, but I think it will require more work. It tends to generate compile errors when
I've added the commit for reference, but will probably roll it back since it seems mostly useless for now: Lines 604 to 614 in aa7acf0
What would be useful would be to mark kernel parameters that are pointers as pointers to global memory, but I haven't found a way to do so. I'm also not sure if we should expose the |
Hm, maybe need to experiment a bit more with that. Let's address that another time.
Do you mean automatically? Perhaps it could be inferred from the function's calling convention. I am having second thoughts about the whole "automatically inferring address spaces" idea anyway. Perhaps things should remain more explicit? In the explicit (and current) case it would look like:
export fn kernel(ptr: *addrspace(.global) u32) callconv(.PtxKernel) void {
...
}
Hm i see. The AMDGPU backend also exposes a |
06941f6 to 632e147
PTX does also have |
So PTX can initialize .global and .constant. It seems to also be possible with AMDGPU, but it's a bit harder. Relevant PTX documentation.
Thanks for the insightful exchanges, Robin. Time to go back to making more examples of GPU Zig programming :-)
Follow-up on #10189: this enables GPU-specific address spaces, leveraging previous work from @Snektron in #9649.
Right now this maps directly to the LLVM address spaces. Snektron and I didn't find a good reason to deviate from this, and we believe it should be compatible across different architectures.
Note: this is still an experimental feature because Zig doesn't have any officially supported GPU target.