
Parallelize pixelize_sph_kernel_projection. Fixes #2682 #2683

Merged

Conversation

Xarthisius
Member

PR Summary

This PR parallelizes the SPH kernel projection using OpenMP. On my laptop the speed-up is moderate (parallel efficiency below 0.5 with 4 cores), but it yields a significant improvement on the testing infrastructure:

$ time python doc/source/cookbook/image_resolution.py   # without this patch

real	16m6.577s
user	16m5.294s
sys	0m4.467s

$ time python doc/source/cookbook/image_resolution.py # with this patch, all cores (40)
real	3m11.529s
user	23m2.800s
sys	0m14.970s

$ export OMP_NUM_THREADS=8  # the default setting we use during a test suite build
$ time python doc/source/cookbook/image_resolution.py 

real	4m9.637s
user	14m26.263s
sys	0m2.226s
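For reference, the speed-ups implied by the quoted timings can be computed directly from the `real` (wall-clock) times. This small script only reproduces that arithmetic; the numbers are the ones reported above:

```python
def to_seconds(minutes, seconds):
    """Convert a `real` time of the form XmY.Zs to seconds."""
    return 60 * minutes + seconds

serial = to_seconds(16, 6.577)    # without the patch
cores40 = to_seconds(3, 11.529)   # with the patch, all 40 cores
threads8 = to_seconds(4, 9.637)   # with the patch, OMP_NUM_THREADS=8

speedup40 = serial / cores40      # ~5.0x
speedup8 = serial / threads8      # ~3.9x
efficiency8 = speedup8 / 8        # ~0.48 parallel efficiency

print(f"40 cores: {speedup40:.1f}x, 8 threads: {speedup8:.1f}x "
      f"(efficiency {efficiency8:.2f})")
```

So even at 8 threads the wall-clock win is close to 4x, which explains the large improvement on the test-suite builds.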

@Xarthisius Xarthisius force-pushed the 2682_parallel_sph_pixelization branch 3 times, most recently from e0146b6 to 55af94c Compare June 24, 2020 19:41
@chummels
Member

This is awesome!

@Xarthisius Xarthisius force-pushed the 2682_parallel_sph_pixelization branch from 55af94c to 2febb94 Compare June 24, 2020 19:41
@Xarthisius
Member Author

Xarthisius commented Jun 24, 2020

After switching to an inner loop parallelization:

(blah) fido@c1af1189021c /tmp/yt-4 $ export OMP_NUM_THREADS=8
(blah) fido@c1af1189021c /tmp/yt-4 $ time python doc/source/cookbook/image_resolution.py 

real	2m38.511s
user	17m34.109s
sys	0m2.471s

@Xarthisius Xarthisius force-pushed the 2682_parallel_sph_pixelization branch from 2febb94 to 6c2a48f Compare June 24, 2020 19:56
@munkm munkm added enhancement Making something better parallelism MPI-based parallelism yt core Core components and algorithms in yt labels Jun 24, 2020
@Xarthisius Xarthisius force-pushed the 2682_parallel_sph_pixelization branch from 6c2a48f to 31397fd Compare June 25, 2020 13:50
@matthewturk
Member

I think it's worth leaving a comment in the code noting that there are two different regimes where we could apply parallelization. This was optimized for the regime of a modest number of particles that are large compared to the pixels: each particle covers many yi values within an xi iteration, so there is real work to share in that inner loop. In the case of lots of itty bitty particles, this wouldn't be quite as efficient.
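The two regimes can be made concrete with a back-of-the-envelope model of how many pixels a particle's smoothing kernel spans. This is purely illustrative (the function and numbers are hypothetical, not from the PR):

```python
# Hypothetical model of the two parallelization regimes: a particle
# with smoothing length h on a grid with pixel width dx spans roughly
# 2*h/dx pixels along each axis.

def pixels_per_axis(h, dx):
    """Approximate pixels spanned along one axis by a particle of
    smoothing length h on a grid with pixel width dx."""
    return max(1, int(2 * h / dx))

dx = 1.0 / 512  # pixel width for a 512x512 image on a unit domain

# Regime 1: few large particles -> many yi iterations per xi column,
# so parallelizing the inner (yi) loop keeps all threads busy.
large = pixels_per_axis(h=0.05, dx=dx)   # ~51 pixels per axis

# Regime 2: many tiny particles -> each touches about one pixel, so
# the inner loop has almost no work to split between threads, and
# parallelizing over particles would be the better strategy.
tiny = pixels_per_axis(h=0.0005, dx=dx)  # 1 pixel

print(large, tiny)
```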

Member

@munkm munkm left a comment


These look good to me. I'm on OSX so I can't verify locally.

You mentioned in the PR that one of these changes is the inner loop parallelization? Or did that get overwritten in a push?

@Xarthisius
Member Author

You mentioned in the PR that one of these changes is the inner loop parallelization? Or did that get overwritten in a push?

All three methods are now using parallelization on the inner loop.

@Xarthisius Xarthisius force-pushed the 2682_parallel_sph_pixelization branch from cdb48f2 to 7a1f12b Compare June 26, 2020 17:52
yt/utilities/lib/pixelization_routines.pyx
buff[xi, yi] += prefactor_j * kernel_func(q_ij)
local_buff[xi + yi*xsize] += prefactor_j * kernel_func(q_ij)

with gil:
Member


As long as this is the recommended way to do the reduce, I think it's okay, but I had to check indentation levels to make sure I understood what was happening and which scope this was in.

Member Author


I don't know if it's the recommended way, but it was the only one that worked :)
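The reduction pattern under discussion can be sketched in plain Python/NumPy: each thread accumulates into a private flat buffer indexed as `xi + yi*xsize` (as in the diff above), and the per-thread buffers are then summed into the 2D image, which is what the `with gil:` block does in the Cython/OpenMP code. The loop structure here is a serial stand-in, not the PR's actual implementation:

```python
import numpy as np

xsize, ysize = 4, 3
nthreads = 2
rng = np.random.default_rng(0)
# per-thread contributions to each pixel, shaped (thread, xi, yi)
contrib = rng.random((nthreads, xsize, ysize))

# what each thread holds: a flat private buffer, indexed xi + yi*xsize
local_buffs = np.zeros((nthreads, xsize * ysize))
for tid in range(nthreads):
    for xi in range(xsize):
        for yi in range(ysize):
            local_buffs[tid][xi + yi * xsize] += contrib[tid, xi, yi]

# the final reduce: sum every thread's buffer into buff[xi, yi]
buff = np.zeros((xsize, ysize))
for tid in range(nthreads):
    for xi in range(xsize):
        for yi in range(ysize):
            buff[xi, yi] += local_buffs[tid][xi + yi * xsize]

# the reduced image equals the total contribution per pixel
assert np.allclose(buff, contrib.sum(axis=0))
```

Private per-thread buffers avoid data races on `buff` during the parallel loop, at the cost of one extra image-sized array per thread.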

@Xarthisius Xarthisius force-pushed the 2682_parallel_sph_pixelization branch from 2909cef to d19846d Compare June 30, 2020 17:34
@Xarthisius Xarthisius force-pushed the 2682_parallel_sph_pixelization branch 3 times, most recently from c88cc4c to 3e21909 Compare July 23, 2020 00:53
@Xarthisius
Member Author

@yt-fido test this please

@matthewturk
Member

@Xarthisius I'd like this to go in, but the conflicts I see are not immediately obvious to me. Can you take another shot?

@Xarthisius Xarthisius force-pushed the 2682_parallel_sph_pixelization branch from f220db8 to f323b0a Compare September 24, 2020 14:53
Co-authored-by: Clément Robert <cr52@protonmail.com>
Member

@neutrinoceros neutrinoceros left a comment


This looks good to me. My expertise in Cython is still quite limited, but I think I can approve the changes presented here.

@@ -1447,6 +1500,7 @@ def pixelize_sph_kernel_arbitrary_grid(np.float64_t[:, :, :] buff,
kernel_func = get_kernel_func(kernel_name)

with nogil:
# TODO make this parallel without using too much memory
Member


Just to be clear, is this a leftover that you forgot to address, or are you planning to address it in the future?

Member Author


I plan to address it in the future, provided I can figure out how to do it...

Member


Alright, just checking!

@neutrinoceros neutrinoceros merged commit 4d9ab24 into yt-project:master Sep 24, 2020
@Xarthisius Xarthisius deleted the 2682_parallel_sph_pixelization branch September 24, 2020 20:27