Relocatability of deferred_codegen #581

Open
vchuravy opened this issue May 13, 2024 · 0 comments

Currently, deferred_codegen pushes a job into a runtime dictionary. That dictionary is only valid for the current session, so any code that refers to an entry in it is not relocatable.
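To make the problem concrete, here is a minimal sketch of a session-local registry. It is a simplified model, not the actual GPUCompiler.jl implementation: `JOBS`, `register_job!`, and `lookup_job` are hypothetical names, and the job is represented by a plain named tuple.

```julia
# Hypothetical, simplified model of a session-local job registry.
const JOBS = Dict{Int,Any}()   # exists only in the current process

# Store a job and return the integer id that generated code would refer to.
function register_job!(job)
    id = length(JOBS) + 1
    JOBS[id] = job
    return id
end

# Resolve an id found in generated code back to its job (or `nothing`).
lookup_job(id::Int) = get(JOBS, id, nothing)

id = register_job!((f = sin, argtypes = Tuple{Float32}))
@assert lookup_job(id) !== nothing

# A new session starts with an empty registry. Emptying it here simulates
# that: an id baked into previously generated code no longer resolves.
empty!(JOBS)
@assert lookup_job(id) === nothing
```

The id is only meaningful together with the dictionary contents, which is why code that embeds it cannot be moved to another session.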

I am wondering if instead we could use a de-virtualization strategy similar to Base.

@noinline function gpuc_deferred(f, args...)::Ptr end
@noinline function gpuc_lookup(mi, f, args)::Ptr end

We would add an abstract interpretation extension that refines gpuc_deferred into gpuc_lookup by looking up the corresponding method instance (mi).

After codegen (since we can't customize codegen itself yet), we would scan the LLVM IR for calls to gpuc_lookup and compile the corresponding functions into the same module.
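The post-codegen scan can be pictured as a worklist loop. The sketch below is a toy model under loud assumptions: the "module" is a `Dict` from function name to textual IR, the target of each `gpuc_lookup` call is identified by a symbol name instead of a real method instance, and `link_deferred!` and the `codegen` callback are hypothetical. A real implementation would walk the uses of the `gpuc_lookup` declaration in an actual LLVM module instead of matching text.

```julia
# Toy pattern for a call such as: %p = call ptr @gpuc_lookup(ptr @child)
const LOOKUP_CALL = r"call\s+ptr\s+@gpuc_lookup\(ptr\s+@(\w+)"

# Scan every function body for gpuc_lookup calls and add the referenced
# functions to the same module until no unresolved targets remain.
function link_deferred!(mod::Dict{String,String}, codegen)
    worklist = collect(keys(mod))
    while !isempty(worklist)
        name = pop!(worklist)
        for m in eachmatch(LOOKUP_CALL, mod[name])
            target = String(m.captures[1])
            haskey(mod, target) && continue   # already compiled into the module
            mod[target] = codegen(target)     # hypothetical compile step
            push!(worklist, target)           # new code may contain more lookups
        end
    end
    return mod
end

# Usage: the kernel refers to `child`, whose body refers to `grandchild`.
toy_codegen(name) = name == "child" ?
    "%q = call ptr @gpuc_lookup(ptr @grandchild)\nret void" : "ret void"

mod = Dict("kernel" => "%p = call ptr @gpuc_lookup(ptr @child)\nret void")
link_deferred!(mod, toy_codegen)
@assert sort(collect(keys(mod))) == ["child", "grandchild", "kernel"]
```

Re-queuing newly added functions matters because deferred code can itself contain deferred calls.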

CUDA.jl wants to get a function pointer and then wrap it in a CuDeviceFunction: https://github.com/JuliaGPU/CUDA.jl/blob/e9928ca84509d7c686ea7ec413e1ad2d8176b987/src/compiler/execution.jl#L417
