GPU Shared AtPoints Bases #1711

Open
wants to merge 3 commits into
base: main

Member

@jeremylt jeremylt commented Nov 15, 2024

This PR adds AtPoints to /gpu/[cuda,hip]/shared, which is a blocker for a /gpu/[cuda,hip]/gen AtPoints capability.

Work in progress. The t35* series tests pass, but not all of the t59* series tests pass yet.

There seems to be an issue with the 2D/3D transpose interp and grad with multiple elements.

The libCEED t* tests now pass, but petsc-bpsswarm isn't passing yet. A few Ratel tests (the MMS tests, it seems) also get stuck, probably on a __syncthreads().
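For context on the suspected hang: one common way a kernel gets stuck on a barrier is when threads in the same block run an element loop a different number of times and the barrier sits inside that loop. The sketch below is a host-side C++ model of that situation, not code from this PR. The loop shape, the helper names (`TripCount`, `BlockIsBarrierSafe`, `PaddedBound`), and the assumption that elements are distributed over the z-dimension of the thread block are all illustrative, and this may not be the cause of the Ratel hang.

```cpp
// Host-side model (assumption, not the PR's code) of an element loop of the form
//   for (e = block_x * block_dim_z + thread_z; e < bound; e += grid_dim_x * block_dim_z)
// with a block-wide barrier inside the loop body.

// Number of loop iterations run by the thread slice with z-index thread_z.
int TripCount(int block_x, int thread_z, int block_dim_z, int grid_dim_x, int bound) {
  int count = 0;
  for (int e = block_x * block_dim_z + thread_z; e < bound; e += grid_dim_x * block_dim_z) {
    count++;
  }
  return count;
}

// True when every z-slice of the block runs the same number of iterations,
// so every thread reaches each in-loop barrier the same number of times.
bool BlockIsBarrierSafe(int block_x, int block_dim_z, int grid_dim_x, int bound) {
  const int first = TripCount(block_x, 0, block_dim_z, grid_dim_x, bound);
  for (int z = 1; z < block_dim_z; z++) {
    if (TripCount(block_x, z, block_dim_z, grid_dim_x, bound) != first) return false;
  }
  return true;
}

// Round the loop bound up to a multiple of block_dim_z so trip counts match
// within a block; a kernel using this bound would then guard its loads and
// stores with `if (e < num_elem)` instead of skipping the iteration.
int PaddedBound(int num_elem, int block_dim_z) {
  return ((num_elem + block_dim_z - 1) / block_dim_z) * block_dim_z;
}
```

With 5 elements, `block_dim_z = 2`, and `grid_dim_x = 2`, block 0 is not barrier-safe when the loop bound is 5, because one z-slice runs two iterations and the other runs one. Using `PaddedBound(5, 2)`, which is 6, makes both slices run two iterations.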

@jeremylt jeremylt self-assigned this Nov 15, 2024
@jeremylt jeremylt force-pushed the jeremy/shared-at-points branch 6 times, most recently from 37e9e49 to 514184d Compare November 18, 2024 20:28
@jeremylt
Member Author

Yay - it works locally but fails on Noether. Those are always fun to debug.

@jeremylt
Member Author

CUDA is more betterer now. Still need to check the thread block size setup for HIP in 3D.

@jeremylt jeremylt force-pushed the jeremy/shared-at-points branch 7 times, most recently from dcfc06f to ff3d054 Compare November 19, 2024 23:24
@jeremylt
Member Author

Ok, now every element other than the first in 3D is wrong for HIP. Progress, but it's super bizarre that it doesn't behave as I expect. The logic is almost the same as for CUDA, so I must be missing some subtle difference between HIP shared and CUDA shared, as those two have slightly diverged.

@jeremylt
Member Author

Ok, the kernels work now. We can optimize them in the future.

@jeremylt
Member Author

Local testing for Ratel passes
