Group k for Unweighted on GPU #68

Draft: wants to merge 37 commits into main

sfiligoi (Collaborator)

Grouping the indexes that have a zero sum minimizes divergence within GPU warps.
This gives a noticeable speedup in Unweighted, where some paths require an expensive pseudo-random lookup.
This is not done for Weighted, which uses a more regular memory access pattern that grouping would disrupt.
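
To illustrate the idea, here is a minimal host-side sketch in Python, not the code in this PR. It assumes a per-index sum is already available; the names `group_zero_sum` and `divergent_warps`, and the warp size used in the example, are hypothetical. It reorders indexes so that zero-sum ones are contiguous, then counts how many warp-sized chunks would mix both kinds of index (and so would diverge on a zero-sum branch).

```python
from typing import List, Sequence


def group_zero_sum(sums: Sequence[float]) -> List[int]:
    """Return a permutation of the indexes with all zero-sum ones first.

    The relative order inside each group is preserved (a stable partition).
    """
    zero = [k for k, s in enumerate(sums) if s == 0]
    nonzero = [k for k, s in enumerate(sums) if s != 0]
    return zero + nonzero


def divergent_warps(order: Sequence[int], sums: Sequence[float],
                    warp_size: int = 32) -> int:
    """Count warp-sized chunks of `order` that mix zero and non-zero sums.

    A mixed chunk models a warp whose threads would take different
    branches; this is only a model, not a measurement.
    """
    count = 0
    for start in range(0, len(order), warp_size):
        chunk = order[start:start + warp_size]
        kinds = {sums[k] == 0 for k in chunk}
        if len(kinds) > 1:
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical sums that alternate zero / non-zero (worst case).
    sums = [0, 1, 0, 1, 0, 1, 0, 1]
    natural = list(range(len(sums)))
    grouped = group_zero_sum(sums)
    print(divergent_warps(natural, sums, warp_size=4))  # 2
    print(divergent_warps(grouped, sums, warp_size=4))  # 0
```

In this toy model at most one chunk, the one straddling the boundary between the two groups, can remain mixed after grouping. The cost is that work items are no longer visited in their natural order, which is consistent with the reason given above for leaving Weighted unchanged.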

sfiligoi marked this pull request as draft February 22, 2025 01:42

sfiligoi (Collaborator, Author)

On an RTX 4060:
Unweighted EMP goes from 36s to 29s.
Unweighted AG goes from 116s to 96s.

sfiligoi (Collaborator, Author)

On the 780M iGPU (part of the AMD Ryzen 9 7940HS):
Unweighted EMP goes from 129s to 115s.
