[ENHANCEMENT] Active block check in -fblockfusion_level=2 #50

Open
xysmlx opened this issue Sep 29, 2020 · 1 comment

xysmlx commented Sep 29, 2020

🚀 Feature
Check GridDim under -fblockfusion_level=2 so that it stays within the active block limitation in CUDA.

Motivation
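BlockFusion with -fblockfusion_level=2 uses inter-block synchronization primitives. These only make progress if every block in the grid is resident on the GPU at the same time, so an improper number of BEs (vEUs) may lead to deadlock once GridDim exceeds the active block limitation in CUDA.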

Pitch
We can use nvcc to check the GridDim after BlockFusion codegen and adaptively change the number of BEs (vEUs) so that it satisfies the active block limitation in CUDA.
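
As a rough illustration of the adaptive step, here is a minimal host-side sketch in C++. It assumes one BE maps to one thread block (so GridDim equals the number of BEs), which the issue does not state explicitly. The helper names `max_coresident_blocks` and `clamp_num_bes` are hypothetical and not part of the project. In a real build, the per-SM limit would typically be queried with the CUDA runtime call `cudaOccupancyMaxActiveBlocksPerMultiprocessor` for the generated kernel, and the SM count read from `cudaDeviceProp::multiProcessorCount`; here both are plain inputs so the sketch runs without a GPU.

```cpp
#include <algorithm>
#include <stdexcept>

// Hypothetical helper: the largest grid size at which all blocks can be
// resident on the device at once.
// max_blocks_per_sm: assumed to come from
//   cudaOccupancyMaxActiveBlocksPerMultiprocessor for the fused kernel.
// num_sms: assumed to come from cudaDeviceProp::multiProcessorCount.
int max_coresident_blocks(int max_blocks_per_sm, int num_sms) {
    if (max_blocks_per_sm < 0 || num_sms < 0) {
        throw std::invalid_argument("limits must be non-negative");
    }
    return max_blocks_per_sm * num_sms;
}

// Hypothetical helper: clamp the requested number of BEs (vEUs) so that
// GridDim never exceeds the co-residency limit.
// Assumption: one BE corresponds to one thread block.
int clamp_num_bes(int requested_bes, int max_blocks_per_sm, int num_sms) {
    const int limit = max_coresident_blocks(max_blocks_per_sm, num_sms);
    return std::min(requested_bes, limit);
}
```

For example, with 80 SMs and 2 resident blocks per SM, a request for 512 BEs would be reduced to 160, while a request for 64 BEs would be left unchanged.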

Alternatives
Fall back to -fblockfusion_level=1 when the GridDim exceeds the active block limitation. The overhead of inter-block synchronization also grows as the number of blocks increases.
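
The fallback rule could look roughly like the following C++ sketch. The function name `choose_blockfusion_level` is hypothetical, and the co-residency limit is passed in as a plain integer; how that limit is obtained (for example from a CUDA occupancy query) is left as an assumption.

```cpp
// Hypothetical helper: pick the BlockFusion level for a generated kernel.
// grid_dim: number of thread blocks the fused kernel would launch.
// coresident_limit: assumed maximum number of blocks that can be active
//   on the device at the same time.
// Returns 2 when inter-block synchronization is safe, otherwise 1.
int choose_blockfusion_level(int grid_dim, int coresident_limit) {
    return (grid_dim <= coresident_limit) ? 2 : 1;
}
```

This keeps level 2 only when every block can be active simultaneously, which is the condition the inter-block synchronization primitives rely on.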

Additional context

@xysmlx xysmlx added the enhancement New feature or request label Sep 29, 2020
@xysmlx xysmlx self-assigned this Sep 29, 2020

nnfbot commented Sep 29, 2020

Thanks for the report @xysmlx! I will look into it ASAP! (I'm a bot).
