Include GPU source kernels in Stmt and StmtHtml file. #6444

mcourteaux · 2021-11-24T10:44:05Z

This works nicely for me with CUDA, OpenCL, OpenGLCompute, Direct3D12Compute, and Metal as backends. I tried collapsing the code for all of them and it works nicely. I put everything in a <pre> for the HTML output. I'm not sure whether any of the backends emit characters that would break the HTML (for example, an unescaped < or &), but my kernels didn't cause any problems.
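To illustrate the concern about characters breaking the HTML, here is a minimal sketch of escaping kernel source before placing it in a <pre>. The helpers `html_escape` and `kernel_source_to_pre` are hypothetical names made up for this example; they are not taken from the PR or from Halide, and the PR as described does not necessarily do this.

```cpp
#include <string>

// Hypothetical helper (not part of Halide): replace the characters that are
// significant in HTML with their entity equivalents.
std::string html_escape(const std::string &src) {
    std::string out;
    out.reserve(src.size());
    for (char c : src) {
        switch (c) {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        case '"': out += "&quot;"; break;
        default: out += c; break;
        }
    }
    return out;
}

// Hypothetical helper: wrap escaped kernel source in a <pre> element.
std::string kernel_source_to_pre(const std::string &src) {
    return "<pre>" + html_escape(src) + "</pre>";
}
```

Kernel code routinely contains `<`, `>`, and `&` (comparisons, shifts, address-of), so escaping before embedding is the usual precaution.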

mcourteaux · 2021-11-24T10:50:12Z

This mostly fixes #6410.

abadams · 2021-11-24T16:04:04Z

Thanks! To me this seems like a reasonably elegant solution. It might be better if we tracked embedded data buffers and embedded kernel source buffers in separate vectors in the Module class so that we didn't print based on a suffix of the name, but I'm happy with the PR as-is if you don't want to do that.
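As a rough sketch of the two approaches contrasted above: classifying a buffer by a suffix of its name versus keeping two separate vectors. Every name here (`EmbeddedBuffer`, `ModuleSketch`, `looks_like_kernel_source`, and the suffix passed in) is hypothetical and chosen for illustration; none of it is Halide's actual Module API.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an embedded buffer.
struct EmbeddedBuffer {
    std::string name;
    std::string contents;
};

// True if `s` ends with `suffix`.
bool ends_with(const std::string &s, const std::string &suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Approach A: infer the buffer's purpose from its name. The suffix is an
// assumption supplied by the caller, not a value taken from the PR.
bool looks_like_kernel_source(const EmbeddedBuffer &b, const std::string &suffix) {
    return ends_with(b.name, suffix);
}

// Approach B: record the purpose structurally, so printing never has to
// inspect names at all.
struct ModuleSketch {
    std::vector<EmbeddedBuffer> data_buffers;
    std::vector<EmbeddedBuffer> kernel_source_buffers;
};
```

Approach B avoids misclassifying a user buffer that happens to share the suffix, at the cost of threading the distinction through to wherever buffers are appended.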

Also looks like it needs a clang-format run.

mcourteaux · 2021-11-24T16:23:37Z

I don't think it's easy to do, as the Buffers get added to the Module here:

Halide/src/Lower.cpp, lines 467 to 471 in 8b68f85:

    if (arg.buffer.defined() && !found) {
        // It's a raw Buffer used that isn't in the args
        // list. Embed it in the output instead.
        debug(1) << "Embedding image " << arg.buffer.name() << "\n";
        result_module.append(arg.buffer);

And the only thing we have left is the fact that it's a Buffer, and not what it's used for, as far as I can tell at least.

For the clang-format, I ran clang-format and force-pushed. Not sure if that's the way things go here (with respect to the CI), but I'd like to see one commit for this instead of two 😋

abadams · 2021-11-24T16:59:18Z

Oh right, OffloadGPULoops just adds a reference to it and assumes it'll get picked up. I guess it would have to be a flag on the buffer, which would be a little gross.

Probably inject_gpu_offload should take the module and add the kernel source buffers to it directly, but then you'd have to make sure they don't redundantly get added again...
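A minimal sketch of the "add directly, but not redundantly" idea, assuming a container that refuses to add a kernel source buffer whose name is already present. `KernelSourceBuffer`, `ModuleSketch`, and `append_kernel_source` are hypothetical names for illustration only; this is not how Halide's Module or inject_gpu_offload is implemented.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a buffer holding GPU kernel source.
struct KernelSourceBuffer {
    std::string name;
    std::string source;
};

// Hypothetical module-like container that owns kernel source buffers.
struct ModuleSketch {
    std::vector<KernelSourceBuffer> kernel_sources;

    // Adds `buf` unless a buffer with the same name is already present.
    // Returns true if the buffer was added, false if it was a duplicate.
    bool append_kernel_source(const KernelSourceBuffer &buf) {
        auto same_name = [&](const KernelSourceBuffer &b) {
            return b.name == buf.name;
        };
        if (std::any_of(kernel_sources.begin(), kernel_sources.end(), same_name)) {
            return false;
        }
        kernel_sources.push_back(buf);
        return true;
    }
};
```

Deduplicating by name is only one possible policy; it assumes buffer names are unique per kernel source, which this thread does not confirm.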

Let's not deal with it in this PR.

abadams · 2021-11-24T17:01:10Z

Oh, and for clang-format we normally push a second commit rather than force-pushing, and then squash the commits into one when we merge.

abadams requested a review from halidebuildbots November 24, 2021 16:01

Include GPU source kernels in Stmt and StmtHtml file.

54f04f7

mcourteaux force-pushed the master branch from 3b3ef6e to 54f04f7 Compare November 24, 2021 16:22

abadams self-requested a review November 24, 2021 20:59

abadams approved these changes Nov 24, 2021

abadams merged commit 3bde22a into halide:master Nov 24, 2021

abadams mentioned this pull request Nov 24, 2021

How to output NVPTX assembly/IR/bytecode? #6410

Closed

mcourteaux mentioned this pull request Aug 21, 2023

stmt and stmt_html output are too low level #7519

Closed
