Parallelise probabilities #161
Conversation
thewalrus/quantum.py
Outdated
```python
probs = np.maximum(
    0.0, np.real_if_close(dask.compute(*compute_list, scheduler="threads"))
).reshape([cutoff] * num_modes)
```
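The fan-out/gather pattern in this snippet can be sketched with the standard library's `ThreadPoolExecutor` instead of Dask (a minimal, dask-free sketch; `matrix_element`, `cutoff`, and `num_modes` here are placeholder assumptions, not thewalrus internals):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def matrix_element(idx):
    # placeholder for the expensive per-element computation
    # (a hafnian evaluation in the real code)
    i, j = idx
    return float(i + j)

cutoff = 3
num_modes = 2
indices = [(i, j) for i in range(cutoff) for j in range(cutoff)]

# fan the element computations out over a thread pool, then gather
with ThreadPoolExecutor() as pool:
    results = list(pool.map(matrix_element, indices))

# clamp small negatives to zero and reshape into the probability tensor
probs = np.maximum(0.0, np.real_if_close(results)).reshape([cutoff] * num_modes)
```

As in the Dask version, each element is an independent task, so the gather step is just collecting the mapped results in order.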
Would be curious if this is faster in practice! Partly because `density_matrix_element` is already parallelized, right? IIRC it uses OpenMP to compute the hafnian corresponding to a single matrix element, so there is almost a double-parallelization happening.
This might be correct. I noticed that it didn't seem to provide any speedups (but only after I pushed it to this PR), but I haven't really benchmarked it yet.
@thisac, @nquesada: It looks like the build is failing. You could try pinning the dependency version. Do you want to make another PR that officially removes Python 3.5 support? This would involve updating the setup.py, the readme, installation instructions, and travis/circle/appveyor configs.
I think we should stop supporting Python 3.5, as @josh146 suggests.
Looks good from my end, will have to wait on a PR that removes Python 3.5 support though.
Codecov Report
```diff
@@            Coverage Diff            @@
##            master      #161   +/-  ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files           14        14
  Lines         1002      1080    +78
=========================================
+ Hits          1002      1080    +78
```

Continue to review full report at Codecov.
Hey @thisac: Could you add some screenshots of the improvements you found when tweaking `OMP_NUM_THREADS` here? Also, maybe it is worth mentioning that in the docstring?
**Output from some simple benchmarks**

Below are the results from some simple benchmarks comparing how the use of parallelisation can speed things up (or slow things down). OpenMP uses parallelisation, which can be turned off by setting the environment variable `OMP_NUM_THREADS=1`. Dask can use either the `"threads"` or the `"processes"` scheduler.

As seen below, the fastest run seems to be when just using the Dask parallelisation in `probabilities`.

```python
from thewalrus.quantum import probabilities as p
import numpy as np

n = 4
mu = np.random.random(2*n)
cov = np.random.random((2*n, 2*n))
cov += cov.conj().T
```

**With OpenMP parallelisation and with the "threads" scheduler in Dask**

```python
print("\nNo parallel execution with Dask")
%timeit p(mu, cov, cutoff=4, parallel=False)

print("\nWith parallel execution with Dask")
%timeit p(mu, cov, cutoff=4, parallel=True)
```

**Without OpenMP parallelisation and with the "threads" scheduler in Dask**

```python
%set_env OMP_NUM_THREADS=1

print("\nNo parallel execution with Dask")
%timeit p(mu, cov, cutoff=4, parallel=False)

print("\nWith parallel execution with Dask")
%timeit p(mu, cov, cutoff=4, parallel=True)
```

**Without OpenMP parallelisation and with the "processes" scheduler in Dask**

```python
%set_env OMP_NUM_THREADS=1

print("\nNo parallel execution with Dask")
%timeit p(mu, cov, cutoff=4, parallel=False)

print("\nWith parallel execution with Dask")
%timeit p(mu, cov, cutoff=4, parallel=True)
```
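Outside IPython, the same comparison can be scripted with the standard `timeit` module (a sketch; `run` is a stand-in for the real `p(mu, cov, cutoff=4, parallel=...)` call, and `OMP_NUM_THREADS` must be set before the OpenMP-backed library is imported for it to take effect):

```python
import os
import timeit

# OMP_NUM_THREADS must be set before importing the threaded library,
# since OpenMP typically reads it once at load time
os.environ["OMP_NUM_THREADS"] = "1"

def run(parallel):
    # stand-in for p(mu, cov, cutoff=4, parallel=parallel)
    return sum(i * i for i in range(1000))

# time each variant over repeated calls, as %timeit does interactively
t_serial = timeit.timeit(lambda: run(parallel=False), number=100)
t_parallel = timeit.timeit(lambda: run(parallel=True), number=100)
print(f"serial: {t_serial:.4f}s  parallel: {t_parallel:.4f}s")
```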
That's great @thisac! It might also be interesting to see how this scales as the number of modes/cutoff increases.
```python
        -0.25 * deltar @ si12 @ deltar
    )
    return f
```
```python
# Copyright 2019 Xanadu Quantum Technologies Inc.
```
@thisac, for some reason GitHub is showing this entire file as changed? It's difficult to tell what needs to be code reviewed here.
Has the file mode changed?
It's due to deleting whitespace on every line. You can remove it from the GitHub diff in the settings (or by appending `?w=1` to the GitHub URL).
thewalrus/quantum.py
Outdated
```text
Parallization is already being done by OpenMP when calling ``density_matrix_element``.
To get a speed-up from using ``parallel=True`` it must be turned off by setting the
environment variable ``OMP_NUM_THREADS=1`` (forcing single threaded use). Remove the
environment variable or set it to ``OMP_NUM_THREADS=''`` to again use multithreading
with OpenMP.
```
Attempted to make it slightly clearer, let me know what you think!
```diff
- Parallization is already being done by OpenMP when calling ``density_matrix_element``.
- To get a speed-up from using ``parallel=True`` it must be turned off by setting the
- environment variable ``OMP_NUM_THREADS=1`` (forcing single threaded use). Remove the
- environment variable or set it to ``OMP_NUM_THREADS=''`` to again use multithreading
- with OpenMP.
+ Individual density matrix elements are computed using multithreading by OpenMP.
+ Setting ``parallel=True`` will further result in *multiple* density matrix elements
+ being computed in parallel.
+ When setting ``parallel=True``, OpenMP will be turned off by setting the
+ environment variable ``OMP_NUM_THREADS=1`` (forcing single threaded use for individual
+ matrix elements). Remove the environment variable or set it to ``OMP_NUM_THREADS=''``
+ to again use multithreading with OpenMP.
```
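For users, the recipe in the docstring amounts to setting the variable before the OpenMP-backed library is loaded; a minimal sketch (the commented-out import is shown only to illustrate the ordering):

```python
import os

# force single-threaded OpenMP for individual matrix elements, leaving
# the element-level parallelism to Dask (i.e. parallel=True)
os.environ["OMP_NUM_THREADS"] = "1"

# any OpenMP-backed import must come after the variable is set, since
# OpenMP typically reads it once at load time, e.g.:
# from thewalrus.quantum import probabilities
```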
Thanks Josh. It looks good. 👍 Just made some small changes to the last line.
thewalrus/quantum.py
Outdated
```python
# restore env variable to value before (or remove if it wasn't set)
if OMP_NUM_THREADS:
    os.environ["OMP_NUM_THREADS"] = OMP_NUM_THREADS
else:
    del os.environ["OMP_NUM_THREADS"]
```
Nice! I only have one (minor) concern: this is modifying a global environment variable. If the user is running multiple Python scripts at the same time, or maybe even any other program, then this code block will cause side effects in the other running processes.
I agree that this could be an issue. I would say that this is OK if conveyed clearly to the user (e.g. by mentioning it in the docstring). Another option would be to simply avoid changing the environment variable in the function itself and only do it during testing. Then it would be up to the user to switch off OpenMP parallelisation before running with `parallel=True`, although this could still pose a similar issue since programs might freeze/crash if not taking this into account.
The more I think about it, it definitely feels a bit strange for a Python library to be changing a users environment variables. If the calculation crashes midway, for instance, the environment variables will never be reset to the original values.
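One way to address the crash concern would be a context manager that restores the variable in a `finally` block, so it is reset even if the computation raises midway (a sketch of the idea, not code from this PR; note it still only affects the current process, and OpenMP may have read the variable at load time already):

```python
import os
from contextlib import contextmanager

@contextmanager
def omp_num_threads(value):
    """Temporarily set OMP_NUM_THREADS, restoring it even on error."""
    old = os.environ.get("OMP_NUM_THREADS")
    os.environ["OMP_NUM_THREADS"] = value
    try:
        yield
    finally:
        # runs even if the computation raises, so the variable
        # is always reset to its original state
        if old is None:
            os.environ.pop("OMP_NUM_THREADS", None)
        else:
            os.environ["OMP_NUM_THREADS"] = old
```

Usage would be `with omp_num_threads("1"): ...` around the parallel computation, replacing the manual set/restore pair.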
Yeah, I think it makes sense to do it in the tests, but otherwise you tell the user how to do it in the docstring.
I'm leaning towards this being something we communicate clearly in the documentation, but don't actually modify ourselves.
Unless this is a common occurrence? Are there examples of other Python libraries modifying `OMP_NUM_THREADS`?
I agree that it's a bit strange, and perhaps quite a bad thing to do. I'll remove it from the function then and simply change the environment variable in the tests (would this still be OK?), and also update the docstring.

I don't know of any other places modifying `OMP_NUM_THREADS`. 🤔
I've moved the environment variable changes to the tests (so no changes are being made in the `probabilities` function itself), utilising pytest's `monkeypatch`. I've also checked that it works locally.
thewalrus/tests/test_quantum.py
Outdated
```python
@@ -1023,6 +1038,36 @@ def test_update_with_noise_coherent_value_error():
    update_probabilities_with_noise(noise_dists, probs)


# @pytest.mark.parametrize("test_env_var", [None, "1", "2"])
```
are you planning on leaving this here?
Oh, no, good catch. Should've removed it. 🤦
One can set the number of OMP threads via the `OMP_NUM_THREADS` environment variable.
I think this is a better approach @thisac 💯
Is this ready to be merged? I might have to override Codecov; it looks like it never reported the coverage to GitHub.
```python
if parallel:  # set single-thread use in OpenMP
    monkeypatch.setenv("OMP_NUM_THREADS", "1")
```
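Outside pytest, the same scoped-environment trick is available in the standard library via `unittest.mock.patch.dict`, which restores the previous value (or absence) of the variable when the block exits (a sketch of the idea, not the PR's test code):

```python
import os
from unittest.mock import patch

before = os.environ.get("OMP_NUM_THREADS")

# patch.dict scopes the change to the with-block, like monkeypatch
# scopes it to a single test
with patch.dict(os.environ, {"OMP_NUM_THREADS": "1"}):
    inside = os.environ["OMP_NUM_THREADS"]

# on exit the environment is restored automatically
after = os.environ.get("OMP_NUM_THREADS")
```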
Nice!
**Context:**

The `probabilities` function could be easily parallelised, potentially providing nice speed-ups.

**Description of the Change:**

Adds a parallelisation option to the `probabilities` function so that the probabilities can be calculated simultaneously. Tests are also adapted to include running the `probabilities` function with `parallel=True` and `parallel=False`.

**Benefits:**

The probability calculations can be parallelised and thus possibly done much quicker.

**Possible Drawbacks:**

Since OpenMP is already utilising parallelisation in the background, the implementation in this PR does not necessarily provide any speedups, but rather gives the option to parallelise differently.

**Related GitHub Issues:**

N/A