
Parallelise probabilities #161

Merged
merged 19 commits into master from parallelise_probabilities on May 23, 2020

Conversation


@thisac thisac commented Apr 21, 2020

Context:
The probabilities function could be easily parallelised, potentially providing nice speed-ups.

Description of the Change:
Adds a parallelisation option to the probabilities function so that individual probabilities can be computed in parallel. Tests are also adapted to include running the probabilities function with both parallel=True and parallel=False.

Benefits:
The probability calculations can be parallelised and thus potentially completed much more quickly.

Possible Drawbacks:
Since OpenMP already utilises parallelisation in the background, the implementation in this PR does not necessarily provide any speedups, but rather gives the option to parallelise differently.

Related GitHub Issues:
N/A
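
For reference, a minimal usage sketch of the new option (the probabilities signature with mu, cov, cutoff and parallel is taken from this PR; the vacuum-state inputs below are only illustrative):

import numpy as np
from thewalrus.quantum import probabilities

n = 2
mu = np.zeros(2 * n)        # vacuum means vector
cov = np.identity(2 * n)    # vacuum covariance matrix (hbar = 2 convention)

# parallel=True dispatches the individual Fock-basis elements to Dask workers
probs = probabilities(mu, cov, cutoff=4, parallel=True)
print(probs.shape)  # (4, 4): one axis per mode, each of length cutoff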

@thisac thisac requested a review from nquesada April 21, 2020 17:23
@thisac thisac self-assigned this Apr 21, 2020
Comment on lines 1098 to 1100
probs = np.maximum(
0.0, np.real_if_close(dask.compute(*compute_list, scheduler="threads"))
).reshape([cutoff] * num_modes)
Member

Would be curious if this is faster in practice! Partly because density_matrix_element is already parallelized, right? IIRC it uses OpenMP to compute the hafnian corresponding to a single matrix element, so there is almost a double-parallelization happening.

Contributor Author

This might be correct. I noticed that it didn't seem to provide any speedups (but only after I pushed it to this PR), but I haven't really benchmarked it yet.


josh146 commented Apr 22, 2020

@thisac, @nquesada: It looks like the build is failing because llvmlite no longer provides wheels for Python 3.5 on the latest version.

You could try pinning llvmlite and numba to the previous versions that supported Python 3.5; however, it probably makes sense to deprecate Python 3.5 support.

Do you want to make another PR that officially removes Python 3.5 support? This would involve updating the setup.py, the readme, installation instructions, and travis/circle/appveyor configs.

@nquesada
Collaborator

I think we should stop supporting Python 3.5, as @josh146 suggests.

@josh146 josh146 left a comment

Looks good from my end, will have to wait on a PR that removes Python 3.5 support though.


codecov bot commented Apr 24, 2020

Codecov Report

Merging #161 into master will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master      #161   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           14        14           
  Lines         1002      1080   +78     
=========================================
+ Hits          1002      1080   +78     
Impacted Files Coverage Δ
thewalrus/quantum.py 100.00% <100.00%> (ø)
thewalrus/samples.py 100.00% <0.00%> (ø)
thewalrus/csamples.py 100.00% <0.00%> (ø)
thewalrus/symplectic.py 100.00% <0.00%> (ø)
thewalrus/fock_gradients.py 100.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cf7a19e...8a519ad.

@nquesada
Collaborator

Hey @thisac: Could you add some screenshots of the improvements you found when tweaking OMP_NUM_THREADS here? Also, maybe it is worth mentioning that in the docstring?
For example, explain to the user when it is worth setting parallel=True and how to do it by exporting environment variables. Other than that, I think this is ready to be merged.


thisac commented Apr 30, 2020

Output from some simple benchmarks

Below are the results from some simple benchmarks comparing how the use of parallelisation can speed things up (or slow things down).

OpenMP uses parallelisation, which can be turned off by setting the environment variable OMP_NUM_THREADS=1, meaning that only a single thread will be used. By default it is set to use all threads (8 in this case).

Dask can use either the "threads" (uses multiple threads in the same process) or the "processes" (sends data to separate processes) scheduler. "threads" is bound by the GIL and is thus best to use with non-python objects while "processes" works best with pure python code (with a slight overhead). See https://docs.dask.org/en/latest/setup/single-machine.html for more information.
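
As a toy illustration of the difference between the two schedulers (not the PR code, just a standalone Dask sketch):

import dask

@dask.delayed
def work(x):
    # stand-in for an expensive per-element computation
    return x * x

tasks = [work(i) for i in range(8)]

# "threads": one process, shared memory, but limited by the GIL for pure-Python work
threaded = dask.compute(*tasks, scheduler="threads")

# "processes": separate worker processes, no GIL contention, but pickling/IPC overhead
multiproc = dask.compute(*tasks, scheduler="processes")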

As seen below, the fastest run is obtained by using only the Dask parallelisation in probabilities with the "processes" scheduler while turning off OpenMP parallelisation.

from thewalrus.quantum import probabilities as p
import numpy as np

n = 4
mu = np.random.random(2*n)
cov = np.random.random((2*n, 2*n))
cov += cov.conj().T

With OpenMP parallelisation and with the "threads" scheduler in Dask

print("\nNo parallel execution with Dask")
%timeit p(mu, cov, cutoff=4, parallel=False)

print("\nWith parallel execution with Dask")
%timeit p(mu, cov, cutoff=4, parallel=True)
No parallel execution with Dask
419 ms ± 74.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

With parallel execution with Dask
2.72 s ± 58.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Without OpenMP parallelisation and with the "threads" scheduler in Dask

%set_env OMP_NUM_THREADS=1

print("\nNo parallel execution with Dask")
%timeit p(mu, cov, cutoff=4, parallel=False)

print("\nWith parallel execution with Dask")
%timeit p(mu, cov, cutoff=4, parallel=True)
env: OMP_NUM_THREADS=1

No parallel execution with Dask
632 ms ± 28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

With parallel execution with Dask
748 ms ± 2.89 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Without OpenMP parallelisation and with the "processes" scheduler in Dask

%set_env OMP_NUM_THREADS=1

print("\nNo parallel execution with Dask")
%timeit p(mu, cov, cutoff=4, parallel=False)

print("\nWith parallel execution with Dask")
%timeit p(mu, cov, cutoff=4, parallel=True)
env: OMP_NUM_THREADS=1

No parallel execution with Dask
605 ms ± 11.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

With parallel execution with Dask
302 ms ± 6.04 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


josh146 commented May 1, 2020

That's great @thisac! It might also be interesting to see how this scales as the number of modes/cutoff increases.

@thisac thisac requested a review from josh146 May 20, 2020 22:54
-0.25 * deltar @ si12 @ deltar
)
return f
# Copyright 2019 Xanadu Quantum Technologies Inc.
Member

@thisac, for some reason GitHub is showing this entire file as changed? It's difficult to tell what needs to be code reviewed here.

Has the file mode changed?

Contributor Author

It's due to trailing whitespace being deleted on every line. You can hide whitespace changes in the GitHub diff settings (or by appending ?w=1 to the GitHub URL).

Comment on lines 1116 to 1120
Parallization is already being done by OpenMP when calling ``density_matrix_element``.
To get a speed-up from using ``parallel=True`` it must be turned off by setting the
environment variable ``OMP_NUM_THREADS=1`` (forcing single threaded use). Remove the
environment variable or set it to ``OMP_NUM_THREADS=''`` to again use multithreading
with OpenMP.
Member

Attempted to make it slightly clearer, let me know what you think!

Suggested change
Parallization is already being done by OpenMP when calling ``density_matrix_element``.
To get a speed-up from using ``parallel=True`` it must be turned off by setting the
environment variable ``OMP_NUM_THREADS=1`` (forcing single threaded use). Remove the
environment variable or set it to ``OMP_NUM_THREADS=''`` to again use multithreading
with OpenMP.
Individual density matrix elements are computed using multithreading by OpenMP.
Setting ``parallel=True`` will further result in *multiple* density matrix elements
being computed in parallel.
When setting ``parallel=True``, OpenMP will be turned off by setting the
environment variable ``OMP_NUM_THREADS=1`` (forcing single threaded use for individual
matrix elements). Remove the environment variable or set it to ``OMP_NUM_THREADS=''``
to again use multithreading with OpenMP.

Contributor Author

Thanks Josh. It looks good. 👍 Just made some small changes to the last line.

Comment on lines 1153 to 1157
# restore env variable to value before (or remove if it wasn't set)
if OMP_NUM_THREADS:
os.environ["OMP_NUM_THREADS"] = OMP_NUM_THREADS
else:
del os.environ["OMP_NUM_THREADS"]
Member

Nice! I only have one (minor) concern --- this is modifying a global environment variable.

If the user is running multiple Python scripts at the same time, or maybe even any other program, then this code-block will cause side effects in the other running processes

Contributor Author

I agree that this could be an issue. I would say that this is OK if conveyed clearly to the user (e.g. by mentioning it in the docstring). Another option would be to simply avoid changing the environment variable in the function itself and only do it during testing. Then it would be up to the user to switch off OpenMP parallelisation before running with parallel=True, although this could still pose a similar issue, since programs might freeze or crash if this is not taken into account.

Member

The more I think about it, it definitely feels a bit strange for a Python library to be changing a users environment variables. If the calculation crashes midway, for instance, the environment variables will never be reset to the original values.
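
(For illustration only: a hypothetical guard along these lines would at least restore the variable if the calculation raises, though it would not remove the side effects on other running processes.)

import os

def run_with_single_omp_thread(fn, *args, **kwargs):
    # hypothetical helper: force OMP_NUM_THREADS=1 for the duration of the call
    # and always restore the previous value, even if fn raises
    previous = os.environ.get("OMP_NUM_THREADS")
    os.environ["OMP_NUM_THREADS"] = "1"
    try:
        return fn(*args, **kwargs)
    finally:
        if previous is None:
            del os.environ["OMP_NUM_THREADS"]
        else:
            os.environ["OMP_NUM_THREADS"] = previous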

Collaborator

Yeah, I think it makes sense to do it in the tests, but otherwise you tell the user how to do it in the docstring.

Member

I'm leaning towards this being something we communicate clearly in the documentation, but don't actually modify ourselves.

Unless this is a common occurrence? Are there examples of other Python libraries modifying OMP_NUM_THREADS?

Contributor Author
@thisac thisac May 22, 2020

I agree that it's a bit strange, and perhaps quite a bad thing to do. I'll remove it from the function then and simply change the environment variable in the tests (would this still be OK?), and also update the docstring.

I don't know of any other places modifying OMP_NUM_THREADS. 🤔


thisac commented May 22, 2020

I've moved the environment variable changes to the tests (so no changes are being made in the probabilities function itself), utilising monkeypatch.setenv (thanks @josh146 🥇).

I've also checked locally that OMP_NUM_THREADS=1 whenever parallel=True during the testing. The docstring is also slightly altered.
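
A minimal sketch of how such a test can be structured (the test name and state below are illustrative, not the exact lines added in the PR):

import numpy as np
import pytest

from thewalrus.quantum import probabilities


@pytest.mark.parametrize("parallel", [True, False])
def test_probabilities_parallel(parallel, monkeypatch):
    """Both code paths should return a normalised probability distribution."""
    if parallel:
        # force single-threaded OpenMP so that Dask does the parallelisation;
        # monkeypatch restores the environment variable after the test
        monkeypatch.setenv("OMP_NUM_THREADS", "1")

    n = 2
    mu = np.zeros(2 * n)        # vacuum state for a quick sanity check
    cov = np.identity(2 * n)

    probs = probabilities(mu, cov, cutoff=4, parallel=parallel)
    assert np.allclose(probs.sum(), 1.0)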

@@ -1023,6 +1038,36 @@ def test_update_with_noise_coherent_value_error():
update_probabilities_with_noise(noise_dists, probs)


# @pytest.mark.parametrize("test_env_var", [None, "1", "2"])
Collaborator

are you planning on leaving this here?

Contributor Author

Oh, no, good catch. Should've removed it. 🤦

@nquesada
Collaborator

One can set the number of OMP threads by doing os.environ["OMP_NUM_THREADS"] = "1"

@josh146 josh146 left a comment

I think this is a better approach @thisac 💯

Is this ready to be merged? I might have to override CodeCov, looks like it never reported the coverage to GitHub

Comment on lines +951 to +952
if parallel: # set single-thread use in OpenMP
monkeypatch.setenv("OMP_NUM_THREADS", "1")
Member

Nice!

@josh146 josh146 merged commit 830f74c into master May 23, 2020
@josh146 josh146 deleted the parallelise_probabilities branch May 23, 2020 23:58