[chore] use rayon work-stealing to improve evaluate_h #28

Merged 6 commits on Nov 23, 2023

jonathanpwang (Collaborator) commented Nov 23, 2023

I don't know the internals of rayon's work-stealing scheduler, but adding some par_iters around the multiple FFTs in evaluate_h substantially improves performance on machines with many cores.
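For readers who want to see the shape of the change, here is a minimal sketch of running several independent transforms concurrently. It is not the code from this PR: `toy_dft`, `pow_mod`, `transform_all_sequential` and `transform_all_parallel` are hypothetical names, the transform is a naive DFT over F_17 standing in for the real FFT, and `std::thread::scope` is swapped in for rayon so the snippet needs no external crate. Scoped threads spawn one thread per input and do no work stealing; with rayon the loop would instead be roughly `polys.par_iter_mut().for_each(|p| toy_dft(p))`.

```rust
use std::thread;

// Toy parameters (assumptions for illustration only).
const P: u64 = 17; // small prime modulus
const OMEGA: u64 = 4; // primitive 4th root of unity mod 17 (4^2 = 16, 4^4 = 1)

/// Modular exponentiation by squaring: base^exp mod P.
fn pow_mod(mut base: u64, mut exp: u64) -> u64 {
    let mut result = 1;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            result = result * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    result
}

/// Naive O(n^2) DFT over F_17, in place. Hypothetical stand-in for a real FFT.
/// Inputs are expected to have length 4 and entries below P.
fn toy_dft(values: &mut [u64]) {
    let input = values.to_vec();
    for (k, out) in values.iter_mut().enumerate() {
        let mut acc = 0u64;
        for (j, x) in input.iter().enumerate() {
            acc = (acc + x * pow_mod(OMEGA, (j * k) as u64)) % P;
        }
        *out = acc;
    }
}

/// Baseline: transform each vector one after another.
fn transform_all_sequential(polys: &mut [Vec<u64>]) {
    for p in polys.iter_mut() {
        toy_dft(p);
    }
}

/// Transform each vector on its own scoped thread.
/// The transforms are independent, so each thread gets exclusive access
/// to exactly one vector and no locking is needed.
fn transform_all_parallel(polys: &mut [Vec<u64>]) {
    thread::scope(|s| {
        for p in polys.iter_mut() {
            s.spawn(move || toy_dft(p));
        }
    });
}
```

Both functions produce the same output for the same input. Only the scheduling differs, which is why a change like this can be made without touching the transform itself.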

We also found that Scroll's parallel FFT outperforms the recursive FFT on c7a machines (8xl up to 48xl), so we now use cfg(target_arch = "x86_64") to select which implementation is used.
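A minimal sketch of compile-time selection with `cfg(target_arch = "x86_64")` follows. The function names `recursive_fft`, `parallel_fft` and `fft` are hypothetical, and the bodies are placeholders that only report which variant was compiled in. The sketch also assumes x86_64 maps to the parallel variant, which is inferred from the c7a benchmark note rather than taken from the diff.

```rust
/// Hypothetical placeholder for the recursive FFT; the real transform is elided.
#[allow(dead_code)] // unused on targets where the other variant is selected
fn recursive_fft(_values: &mut [u64]) -> &'static str {
    "recursive"
}

/// Hypothetical placeholder for the parallel FFT; the real transform is elided.
#[allow(dead_code)]
fn parallel_fft(_values: &mut [u64]) -> &'static str {
    "parallel"
}

/// On x86_64 targets, dispatch to the parallel variant.
#[cfg(target_arch = "x86_64")]
pub fn fft(values: &mut [u64]) -> &'static str {
    parallel_fft(values)
}

/// On every other target, fall back to the recursive variant.
#[cfg(not(target_arch = "x86_64"))]
pub fn fft(values: &mut [u64]) -> &'static str {
    recursive_fft(values)
}
```

Because the choice is made at compile time, exactly one `fft` exists in a given build and callers pay no runtime dispatch cost. The trade-off is that the selection follows the build target, not the machine the binary actually runs on.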

jonathanpwang merged commit f335ffc into main on Nov 23, 2023
jonathanpwang deleted the chore/evaluate_h_par_iter branch on November 23, 2023 16:47