Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GPU] Add option for no gpu.block_dim in GPUDistributeScfFor #17214

Merged
merged 3 commits into from
May 1, 2024

Conversation

Max191
Copy link
Contributor

@Max191 Max191 commented Apr 29, 2024

This PR adds an option to generate arith.constant ops with workgroup sizes in place of gpu.block_dim ops in GPUDistributeScfFor. The gpu.block_dim ops are not currently handled properly on ROCM backends, so this option is needed to support GPUDistributeScfFor on the LLVMGPU path.

@Max191
Copy link
Contributor Author

Max191 commented Apr 29, 2024

I will create an issue later about the gpu.block_dim on ROCM. It will be easier to create a repro once the LLVMGPU winograd pipeline is landed.

@@ -38,13 +42,30 @@ struct DistributeLoop final : public OpRewritePattern<scf::ForOp> {
if (!numDimAttr)
return failure();

auto funcOp = forOp->getParentOfType<FunctionOpInterface>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These checks are only neccessary and to fail when useBlockDims is false?

// CHECK: %[[YLB:.+]] = affine.apply affine_map<()[s0, s1, s2] -> (s0 * s1 + s2)>()[%[[ID]], %[[STEP]], %[[LB]]]
// CHECK: %[[YSTEP:.+]] = affine.apply affine_map<()[s0, s1] -> (s0 * s1)>()[%[[DIM]], %[[STEP]]]
// CHECK: scf.for %[[IV:.+]] = %[[YLB]] to %[[UB]] step %[[YSTEP]] {
// CHECK: memref.store %{{.+}}, %{{.+}}[%[[IV]]] : memref<?xf32>

// NO-BLOCK-DIM-LABEL: func.func @distribute_to_y
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are duplicated with the above in the sense they don't check additonal stuff? It's fine to just have one of them and remove the others--saves us efforts to update all of them later.

@Max191 Max191 force-pushed the gpu-distribute-no-block-dim branch from 9075fed to 3d10c31 Compare May 1, 2024 13:43
@Max191 Max191 requested a review from antiagainst May 1, 2024 13:43
@Max191 Max191 merged commit e633d07 into iree-org:main May 1, 2024
55 checks passed
LLITCHEV pushed a commit to LLITCHEV/iree that referenced this pull request Jul 30, 2024
…g#17214)

This PR adds an option to generate `arith.constant` ops with workgroup
sizes in place of `gpu.block_dim` ops in `GPUDistributeScfFor`. The
`gpu.block_dim` ops are not currently handled properly on ROCM backends,
so this option is needed to support `GPUDistributeScfFor` on the LLVMGPU
path.

Signed-off-by: Lubo Litchev <lubol@google.com>
@Max191 Max191 deleted the gpu-distribute-no-block-dim branch October 25, 2024 14:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants