Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicitly name the allocgroups on GPU schedules "allocgroup__..." #7883

Merged
merged 2 commits into from
Oct 12, 2023

Conversation

mcourteaux
Copy link
Contributor

I was flabbergasted when inspecting the schedule using the conceptual_stmt of a GPU schedule. The loads and stores didn't make any sense to me. After digging through the lowering passes, I found out about the AllocGroups and the clustering in the FuseGPUThreadLoops pass.

I prepended "allocgroup_" to the name of the Allocate node now, to make that clearer in the Stmt.

I still have another concern with this, and don't know what to do about it. After working on a schedule for a while, the names of these AllocGroups get really long, like:

allocate allocgroup__blurring_filters_noisy_im_in_denoise_conv_noisy$0.0_exp_logit.4__clamped_noisy_in_denoise_conv_noisy$0.1_filter_response_noisy_global_wrapper$0.3_sum_exp_logit_global_wrapper$0.6__denoise_conv_noisy_global_wrapper$0.2[float32 * (t1386 + 768)] in GPUShared

Which is anything but readable, especially in code like this:
image

I was thinking that it might be useful to just rename the allocations to allocgroup_0, allocgroup_1, etc... and somehow keep the list of shared allocations in this allocation group as meta information of the Allocate IR node, which can then be used to produce Stmt code with comments on what is contained within this grouped alloc.

@abadams
Copy link
Member

abadams commented Oct 9, 2023

Is this ready to merge? (It's still marked as a draft). IMO even if a better solution comes along for very long merged allocation names we should merge it because it's a small PR that makes a positive change.

@mcourteaux mcourteaux marked this pull request as ready for review October 9, 2023 19:07
@mcourteaux
Copy link
Contributor Author

Okay, go ahead! Marked as ready.

@mcourteaux mcourteaux marked this pull request as draft October 9, 2023 19:10
@mcourteaux
Copy link
Contributor Author

Wait, actually... I discovered today that it still prepends "allocgroup_" even if the group is of size 1. I'll see if I can squeeze that out.

@mcourteaux mcourteaux marked this pull request as ready for review October 9, 2023 19:38
@mcourteaux
Copy link
Contributor Author

mcourteaux commented Oct 9, 2023

Okay, I did it. I was a little surprised by the double layered hierarchy of grouping: SharedAllocations go into a AllocGroup, which are then again clusterd into the clustered_allocs std::map. However, I think I did it right.

@abadams
Copy link
Member

abadams commented Oct 9, 2023

Here's the reason for the double-layered hierarchy (IIRC): Sets of allocations with non-overlapping lifetime go into an AllocGroup, the size of which is the max across all the allocations in the group. Then those groups are concatenated into a single big allocation, the size of which is the sum across all the groups.

@mcourteaux
Copy link
Contributor Author

Are we merging this? It's ready, if you ask me. 😄

@abadams abadams merged commit a3911bb into halide:main Oct 12, 2023
17 of 19 checks passed
ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024
…alide#7883)

* 50cents readibility improvement to allocgroups on GPU schedules.

* Improve allocation group prefix: only if the alloc group cluster contains more than 1 allocation prepend the prefix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants