Is there any way to parallelize the computation of multiple convolutions? #683
Comments
Thank you for your nice reply, I will try what you said.
Do you know which part of your program is taking the most time? If your application is already bound by IO, then the speed will depend on how much bandwidth the file system can provide. Adding some prints of the elapsed time can help us find out.

Less critical, but also worth checking, is making sure the CPU cores are not oversubscribed, i.e. that each task is not launching too many MKL or OpenMP threads. A small illustrative sketch follows.
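Not part of the original reply, but as a rough sketch of both suggestions (the path, dataset name, and the stages being timed are placeholders): pin the per-task thread count and print how long each stage takes.

```python
# Illustrative sketch (placeholder path/dataset): limit MKL/OpenMP to one thread
# per MPI task to avoid oversubscription, and time the main stages.
import os
os.environ.setdefault("OMP_NUM_THREADS", "1")   # set before numpy/nbodykit are imported
os.environ.setdefault("MKL_NUM_THREADS", "1")

import time
from nbodykit.lab import BigFileMesh

t0 = time.time()
mesh = BigFileMesh("density_mesh_bigfile", dataset="Field")  # placeholder path/dataset
dens_m = mesh.paint(mode="real")                             # read + paint the density field
if mesh.comm.rank == 0:
    print("read + paint: %.1f s" % (time.time() - t0))

t0 = time.time()
# ... per-scale convolutions and binned statistics go here ...
if mesh.comm.rank == 0:
    print("filter loop: %.1f s" % (time.time() - t0))
```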
On Tue, Sep 19, 2023, WangYun1995 wrote:

Hi Yu,
The method you suggested does work. However, the parallelization is not very efficient. For example, I use a mesh with Nmesh = 768**3 as a test. When using only one core, the time taken is 1407 seconds. When using 64 cores, the time taken is 722 seconds. When using 96 cores, the time taken is 611 seconds. Obviously, turning on the parallelization only saves about half the time, rather than scaling like 1407/64 or 1407/96 seconds. Why is this the case?
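For reference, a small worked calculation of the quoted timings (not part of the original exchange):

```python
# Speedup and parallel efficiency implied by the timings reported above.
timings = {1: 1407.0, 64: 722.0, 96: 611.0}   # cores -> seconds, as reported
for cores, t in timings.items():
    speedup = timings[1] / t
    efficiency = speedup / cores
    print("%3d cores: speedup %.2fx, efficiency %.1f%%" % (cores, speedup, 100 * efficiency))
# 64 cores: speedup 1.95x, efficiency 3.0%; 96 cores: speedup 2.30x, efficiency 2.4%
```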
The operations inside the for loop consume the most time.
Perhaps it is because the computation and disk IO that produce dens_m are repeated in every iteration? I recall there is a way to create a Mesh object directly from dens_m (FieldMesh? ArrayMesh? I cannot recall the exact name right away). Perhaps replace the real density Mesh object used in the loop with that, so the field only has to be read and painted once; see the sketch below.
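A minimal sketch of this idea, assuming the density mesh lives in a bigfile (the path, dataset name, and filter are placeholders, not the code from this issue):

```python
# Read and paint the bigfile mesh once, then wrap the resulting RealField in a
# FieldMesh so the loop over scales reuses it instead of repeating the disk IO.
from nbodykit.lab import BigFileMesh, FieldMesh

mesh = BigFileMesh("density_mesh_bigfile", dataset="Field")  # placeholder path/dataset
dens_field = mesh.paint(mode="real")   # expensive read + paint, done exactly once
dens_mesh = FieldMesh(dens_field)      # cheap in-memory wrapper with the same Mesh API

# Inside the loop, apply the Fourier-space filter to dens_mesh rather than
# re-creating the density mesh from disk on every iteration, e.g.
#     conv = dens_mesh.apply(my_filter, mode="complex", kind="wavenumber").paint(mode="real")
```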
This really improves the efficiency. I will acknowledge you in my new paper.
Glad to hear it worked. Thank you!
Hi,
First, I have a 3D density mesh (Nmesh = 1536^3) obtained with nbodykit and stored as a bigfile on my disk. I then want to convolve the mesh with a band-pass filter at multiple scales, which is equivalent to a multiplication in the Fourier domain. Finally, at each scale, I compute some binned statistic (env_WPS) of the convolved mesh; my code snippet is shown below. Since that code uses only one core, it is inefficient for a large Nmesh. How can I parallelize the code in the context of nbodykit?
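As a hypothetical illustration only (the path, filter, scales, and statistic below are placeholders, not the author's snippet), a loop of this kind becomes MPI-parallel in nbodykit simply by launching the script under MPI; the only extra step is reducing the per-rank statistic across ranks.

```python
# Hypothetical sketch: nbodykit distributes the mesh and its FFTs over MPI
# ranks, so this loop runs in parallel when launched with, e.g.,
#     mpirun -n 64 python compute_env_wps.py
import numpy as np
from nbodykit.lab import BigFileMesh

mesh = BigFileMesh("density_mesh_bigfile", dataset="Field")  # placeholder path/dataset
comm = mesh.comm                                             # MPI communicator used by nbodykit

def bandpass(scale):
    """Placeholder band-pass filter, applied as a multiplication in Fourier space."""
    def filt(k, v):
        kk = sum(ki ** 2 for ki in k) ** 0.5
        return v * (kk * scale) * np.exp(-0.5 * (kk * scale) ** 2)
    return filt

for scale in [10.0, 20.0, 40.0]:                             # placeholder scales
    # (As noted in the replies above, wrapping a painted field in FieldMesh
    # avoids re-reading the bigfile for every scale.)
    conv = mesh.apply(bandpass(scale), mode="complex", kind="wavenumber").paint(mode="real")
    # Each rank holds only its local slab of the convolved field, so any binned
    # statistic (env_WPS in the real code) needs an MPI reduction at the end.
    local_sum, local_size = float(np.sum(conv.value ** 2)), conv.value.size
    total, count = comm.allreduce(local_sum), comm.allreduce(local_size)
    if comm.rank == 0:
        print("scale %g: mean square of filtered field = %g" % (scale, total / count))
```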