Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include GPU source kernels in Stmt and StmtHtml file. #6444

Merged
merged 1 commit into from
Nov 24, 2021

Conversation

mcourteaux
Copy link
Contributor

This works for me nicely with CUDA, OpenCL, OpenGLCompute, Direct3D12Compute, and Metal as backends. I tried collapsing the code for all of them and it works nicely. I put everything in a <pre> for the HTML output. Not sure if any of the backends produce non-compatible characters that might break the HTML, but for my kernels, it didn't.

@mcourteaux
Copy link
Contributor Author

This mostly fixes #6410.

@abadams
Copy link
Member

abadams commented Nov 24, 2021

Thanks! To me this seems like a reasonably elegant solution. It might be better if we tracked embedded data buffers and embedded kernel source buffers in separate vectors in the Module class so that we didn't print based on a suffix of the name, but I'm happy with the PR as-is if you don't want to do that.

Also looks like it needs a clang-format run.

@mcourteaux
Copy link
Contributor Author

mcourteaux commented Nov 24, 2021

I don't think it's easy to do, as the Buffers get added to the Module here:

Halide/src/Lower.cpp

Lines 467 to 471 in 8b68f85

if (arg.buffer.defined() && !found) {
// It's a raw Buffer used that isn't in the args
// list. Embed it in the output instead.
debug(1) << "Embedding image " << arg.buffer.name() << "\n";
result_module.append(arg.buffer);

And the only thing we have left is the fact that it's a Buffer, and not what it's used for, as far as I can tell at least.

For the clang-format, I ran clang-format, and force pushed. Not sure if that's the way things go here (wrt to the CI), but I'd liked to see one commit for this instead of two 😋

@abadams
Copy link
Member

abadams commented Nov 24, 2021

Oh right, OffloadGPULoops just adds a reference to it and assumes it'll get picked up. I guess it would have to be a flag on the buffer, which would be a little gross.

Probably inject_gpu_offload should take the module and add the kernel source buffers to it directly, but then you'd have to make sure they don't redundantly get added again...

Let's not deal with it in this PR.

@abadams
Copy link
Member

abadams commented Nov 24, 2021

Oh, and for clang-format we normally push a second commit rather than force-pushing, and then squash the commits into one when we merge.

@abadams abadams self-requested a review November 24, 2021 20:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants