Relocatability of `deferred_codegen` #581

Open

vchuravy opened this issue May 13, 2024 · 0 comments

Currently, `deferred_codegen` pushes a Job into a runtime dictionary. That dictionary is only valid for the current session, so code that refers to it is not relocatable.
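To make the relocatability problem concrete, here is a minimal toy model of a session-local job registry. It is a sketch, not the actual implementation: `ToyJob`, `TOY_JOBS`, `toy_deferred` and `toy_resolve` are hypothetical names invented for this illustration, and the counter-based id is an assumption.

```julia
# Hypothetical stand-in for a compiler job; not GPUCompiler's actual type.
struct ToyJob
    f::Any
    argtypes::Type
end

# Session-local registry: an id is only meaningful while this process is alive.
const TOY_JOBS = Dict{Int, ToyJob}()

# Register a job and return the id that would get baked into generated code.
function toy_deferred(f, argtypes::Type)
    id = length(TOY_JOBS) + 1
    TOY_JOBS[id] = ToyJob(f, argtypes)
    return id
end

# Resolve an id back to its job; throws a KeyError if the registry lacks it.
toy_resolve(id::Int) = TOY_JOBS[id]

id = toy_deferred(sin, Tuple{Float64})
toy_resolve(id).f === sin    # holds within the same session

# Simulate a fresh session: the registry starts out empty, but cached code
# would still carry the old id.
empty!(TOY_JOBS)
haskey(TOY_JOBS, id)         # the baked-in id no longer resolves
```

In this toy, a later registration in the "new session" could even reuse the same id for a different job, which is why anything that embeds the id cannot safely outlive the session.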

I am wondering if instead we could use a de-virtualization strategy similar to Base.

@noinline function gpuc_deferred(f, args...)::Ptr end
@noinline function gpuc_lookup(mi, f, args)::Ptr end

We would then add an abstract interpretation extension that refines `gpuc_deferred` into `gpuc_lookup` by looking up the corresponding method instance (`mi`).

After codegen (since we can't customize that yet), we would scan the LLVM IR for calls to `gpuc_lookup` and codegen the corresponding functions into the same module.
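The scanning step could look roughly like the sketch below. It is only a toy: it matches on textual IR with a regex, whereas a real pass would walk the LLVM module. The IR snippet, the `@mi_1`/`@mi_2` symbol names and the helper `pending_lookups` are all made up for illustration.

```julia
# Toy textual IR; the call shape and the @mi_* names are assumptions.
ir = """
define void @kernel() {
  %p = call ptr @gpuc_lookup(ptr @mi_1, ptr @f, ptr %args)
  %q = call ptr @gpuc_lookup(ptr @mi_2, ptr @g, ptr %args)
  %r = call ptr @gpuc_lookup(ptr @mi_1, ptr @f, ptr %args)
  ret void
}
"""

# Collect the distinct first operands of gpuc_lookup calls, in order of
# first appearance. Each one stands for a function that would still need
# to be compiled into the same module.
function pending_lookups(ir::AbstractString)
    seen = String[]
    for m in eachmatch(r"call ptr @gpuc_lookup\(ptr @(\w+)", ir)
        name = String(m.captures[1])
        name in seen || push!(seen, name)
    end
    return seen
end

pending = pending_lookups(ir)
```

Deduplicating matters here because the same method instance may be looked up from several call sites but should only be compiled once per module.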

CUDA.jl wants to get a function pointer and then wrap it in a `CuDeviceFunction`: https://github.com/JuliaGPU/CUDA.jl/blob/e9928ca84509d7c686ea7ec413e1ad2d8176b987/src/compiler/execution.jl#L417

vchuravy mentioned this issue

New deferred_codegen implementation #582

Open
