deferred_codegen
Currently deferred_codegen pushes a Job into a runtime dictionary. That runtime dictionary is only valid during the session.
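To make the limitation concrete, here is a minimal sketch of a session-local job registry. `ToyJob`, `toy_jobs`, and `toy_deferred` are hypothetical names used only for illustration; they are not GPUCompiler's actual types or functions.

```julia
# Hypothetical stand-in for a compiler job; not GPUCompiler's actual type.
struct ToyJob
    f::Any
    argtypes::Type
end

# Session-local registry: it lives only in this process's memory.
const toy_jobs = Dict{Int,ToyJob}()

# Register a job and return the integer id that generated code would embed.
function toy_deferred(f, argtypes::Type)
    id = length(toy_jobs) + 1
    toy_jobs[id] = ToyJob(f, argtypes)
    return id
end

id = toy_deferred(sin, Tuple{Float64})
haskey(toy_jobs, id)   # true while this session is alive

# A fresh session starts with an empty registry, so a cached id dangles.
empty!(toy_jobs)
haskey(toy_jobs, id)   # false
```

Any code that baked in `id` is only meaningful as long as the registry still holds the matching entry, which is why the result cannot outlive the session.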
I am wondering if instead we could use a de-virtualization strategy similar to Base.
@noinline function gpuc_deferred(f, args...)::Ptr end
@noinline function gpuc_lookup(mi, f, args)::Ptr end
An abstract interpretation extension would then refine gpuc_deferred into gpuc_lookup by looking up the corresponding mi (the MethodInstance for the call).
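The shape of that refinement can be sketched as a toy rewrite on a surface-level `Expr`. This is not an abstract interpretation pass: `refine` and `resolve_target` are hypothetical helpers, and the sketch resolves a `Method` via the public `which` function, where a real implementation would resolve a `MethodInstance` inside the compiler.

```julia
# Toy model of the refinement step. `gpuc_deferred` and `gpuc_lookup` are the
# names from the proposal; everything else here is hypothetical.

# Resolve a callee and its argument types to a concrete Method using the
# public reflection function `which` (an assumption standing in for the
# MethodInstance lookup a real pass would do).
resolve_target(f, argtypes::Type{<:Tuple}) = which(f, argtypes)

# Rewrite `gpuc_deferred(f, args...)` into `gpuc_lookup(target, f, args...)`.
# Any other expression is returned unchanged.
function refine(ex::Expr, argtypes::Type{<:Tuple})
    if ex.head === :call && ex.args[1] === :gpuc_deferred
        f = ex.args[2]
        target = resolve_target(f, argtypes)
        return Expr(:call, :gpuc_lookup, target, f, ex.args[3:end]...)
    end
    return ex
end

call = Expr(:call, :gpuc_deferred, sin, :x)
refined = refine(call, Tuple{Float64})
```

The point of the rewrite is that the resolved target travels with the call itself, so nothing has to be fetched from a runtime dictionary later.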
After codegen (which we can't customize yet), we would scan the LLVM IR for calls to gpuc_lookup and codegen the corresponding functions into the same module.
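As a rough illustration of the scanning step, the sketch below finds `gpuc_lookup` call sites in a textual IR string. The IR snippet and the `lookup_calls` helper are made up for this example; a real implementation would presumably walk the module through the LLVM API rather than match text.

```julia
# Hypothetical textual IR; real compiler output would look different.
ir = """
define void @kernel() {
entry:
  %0 = call i8* @gpuc_lookup(i8* %mi1, i8* %f1)
  %1 = call i8* @other(i8* %x)
  %2 = call i8* @gpuc_lookup(i8* %mi2, i8* %f2)
  ret void
}
"""

# Collect every line that calls the `gpuc_lookup` marker function.
function lookup_calls(ir::AbstractString)
    return [strip(line) for line in split(ir, '\n')
            if occursin("call", line) && occursin("@gpuc_lookup(", line)]
end

sites = lookup_calls(ir)
length(sites)   # 2
```

Each site found this way identifies one function that still has to be compiled into the same module before the marker call can be replaced.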
CUDA.jl wants to obtain a function pointer and then wrap it in a CuDeviceFunction: https://github.com/JuliaGPU/CUDA.jl/blob/e9928ca84509d7c686ea7ec413e1ad2d8176b987/src/compiler/execution.jl#L417