Support new warp shuffle intrinsics after CUDA Volta architecture #6505

Merged (6 commits) on Dec 23, 2021


jinderek (Contributor) commented Dec 19, 2021

This PR fixes #5630.
Warp shuffle intrinsics need the ".sync" suffix on Volta and later architectures:
https://docs.nvidia.com/cuda/volta-tuning-guide/index.html
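
To illustrate the naming change, here is a minimal sketch of how a code generator might pick the PTX shuffle mnemonic for a target. The helper `shuffle_mnemonic` and its parameters are hypothetical and are not taken from `src/LowerWarpShuffles.cpp`. It assumes the compute capability is encoded as major * 10 + minor, so Volta is 70.

```cpp
#include <string>

// Hypothetical helper, not Halide's actual code.
// mode: one of the PTX shuffle modes "up", "down", "bfly", "idx".
// compute_capability: major * 10 + minor (assumed encoding), e.g. 70 for Volta.
std::string shuffle_mnemonic(const std::string &mode, int compute_capability) {
    // Volta (7.0) and later require the synchronizing form of the instruction.
    const bool needs_sync = compute_capability >= 70;
    return std::string("shfl") + (needs_sync ? ".sync" : "") + "." + mode + ".b32";
}
```

The name is only part of the change: the `.sync` form also takes an extra member-mask operand naming the participating threads, so a real lowering pass would have to add that argument as well.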

jinderek changed the title from "Support new warp shuffle intrinsics after CUDA volta architecture" to "Support new warp shuffle intrinsics after CUDA Volta architecture" on Dec 19, 2021
abadams (Member) commented Dec 19, 2021

Thanks for this, I just had one very minor comment.

jinderek (Contributor Author) commented Dec 21, 2021

I think the one failing buildbot check is unrelated to this PR. Could you take a look? @abadams @jrk

Review comment on src/LowerWarpShuffles.cpp (outdated, resolved)
jinderek (Contributor Author) commented

Hi @abadams, the code is updated. Could you take another look? (BTW, I think the one failing buildbot check is unrelated to this PR.)

abadams (Member) commented Dec 23, 2021

Thanks for the PR! Merging.

abadams merged commit 1d1f06a into halide:master on Dec 23, 2021
Linked issue: warp shuffle intrinsics no longer work with cuda compute capability 8.0