[WIP] - Feature/numba parallel #43
base: dev
Pull Request Test Coverage Report for Build 142
💛 - Coveralls |
On IBM Power8: (venv-pynomaly) vconstan@SNA-MINSKY-N03:~/projects/PyNomaly$ python examples/numba_speed_diff.py
/home/vconstan/projects/PyNomaly/PyNomaly/loop.py:518: NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function _compute_distance_and_neighbor_matrix failed at nopython mode lowering due to: scipy 0.16+ is required for linear algebra
File "PyNomaly/loop.py", line 537:
def _compute_distance_and_neighbor_matrix(
<source elided>
diff = clust_points_vector[p[0]] - clust_points_vector[p[1]]
d = np.dot(diff, diff) ** 0.5
^
During: lowering "$88call_method.23 = call $82load_method.20(diff, diff, func=$82load_method.20, args=[Var(diff, loop.py:536), Var(diff, loop.py:536)], kws=(), vararg=None)" at /home/vconstan/projects/PyNomaly/PyNomaly/loop.py (537)
@staticmethod
/home/vconstan/.conda/envs/venv-pynomaly/lib/python3.8/site-packages/numba/core/object_mode_passes.py:177: NumbaWarning: Function "_compute_distance_and_neighbor_matrix" was compiled in object mode without forceobj=True.
File "PyNomaly/loop.py", line 519:
@staticmethod
def _compute_distance_and_neighbor_matrix(
^
warnings.warn(errors.NumbaWarning(warn_msg,
/home/vconstan/.conda/envs/venv-pynomaly/lib/python3.8/site-packages/numba/core/object_mode_passes.py:187: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.
For more information visit http://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit
File "PyNomaly/loop.py", line 519:
@staticmethod
def _compute_distance_and_neighbor_matrix(
^
warnings.warn(errors.NumbaDeprecationWarning(msg, |
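The warning above points at the `np.dot` call as the spot where nopython lowering fails, since Numba's linear algebra support depends on SciPy. One possible workaround, sketched here and not tested on Power8, is to accumulate the squared differences in an explicit loop so that no linear algebra routine is needed. The function name `euclidean_distance` and the no-op `njit` fallback are hypothetical and not part of `loop.py`.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Numba is optional: fall back to a decorator that does nothing.
    def njit(func):
        return func


@njit
def euclidean_distance(a, b):
    """Euclidean distance between two 1-D float arrays, without np.dot."""
    total = 0.0
    for i in range(a.shape[0]):
        diff = a[i] - b[i]
        total += diff * diff
    return total ** 0.5


if __name__ == "__main__":
    print(euclidean_distance(np.array([0.0, 0.0]), np.array([3.0, 4.0])))
```

Installing SciPy in the environment may also let `np.dot` compile as written; which route is preferable here is still open.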
Given the trade-off between the number of cores used for parallel computation and the communication overhead between parallel threads, it would be useful to let users set the number of threads that run concurrently. Numba controls this through an environment variable, so it may be worth exposing it as an additional, optional parameter when executing distance calculations in parallel: https://numba.pydata.org/numba-doc/latest/user/threading-layer.html#setting-the-number-of-threads |
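A minimal sketch of how such an optional parameter could be applied, assuming Numba 0.49 or newer. `NUMBA_NUM_THREADS` has to be set before Numba is imported and fixes the upper bound, while `numba.set_num_threads` can lower the count at runtime. The helper name `apply_thread_limit` is hypothetical and not part of PyNomaly.

```python
try:
    import numba
    HAVE_NUMBA = True
except ImportError:
    # Numba is optional: without it everything runs single-threaded.
    numba = None
    HAVE_NUMBA = False


def apply_thread_limit(requested):
    """Hypothetical helper: cap Numba's thread pool, return the count in effect."""
    if requested < 1:
        raise ValueError("requested must be >= 1")
    if not HAVE_NUMBA:
        return 1
    # set_num_threads rejects values above NUMBA_NUM_THREADS, so clamp first.
    upper = numba.config.NUMBA_NUM_THREADS
    numba.set_num_threads(min(requested, upper))
    return numba.get_num_threads()


if __name__ == "__main__":
    print("threads in effect:", apply_thread_limit(2))
```

Clamping instead of raising means a request for more threads than the machine offers quietly uses the maximum available; raising an error would be an equally reasonable choice.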
Added a
More investigation is needed to determine whether the above behavior is machine-specific or code-related, but we can now parallelize distinct portions of the code and set the number of threads when using Numba. |
Results from another machine:
|
Results from another run. |
Results from another machine (4 core CPU, running from WSL):
|
Refactored how the processing is handled so that we see a speed improvement when using Numba and increasing the number of cores. Once the issue below is handled, I'll report back with numbers on computation speed. Multi-core processing required changes to the progress bar, which is still a work in progress. One of the key challenges currently is flushing stdout in a way that is compatible with Numba: print statements are supported inside Numba-compiled functions, but sys.stdout.flush() does not seem to be. |
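One avenue that may be worth trying for the flush problem is Numba's `objmode` context manager, which hands a block back to the Python interpreter from inside a compiled function. The sketch below is an assumption about how that could look, not what the PR does: `flush_stdout` and `run_with_progress` are hypothetical names, the loop is serial on purpose, and the behavior inside a `prange` loop has not been checked.

```python
import sys
from contextlib import nullcontext

try:
    from numba import njit, objmode
except ImportError:
    # Numba is optional: use stand-ins that leave the code as plain Python.
    def njit(func):
        return func

    def objmode(**kwargs):
        return nullcontext()


@njit
def flush_stdout():
    # Run the flush in object mode, where sys.stdout is available.
    with objmode():
        sys.stdout.flush()


@njit
def run_with_progress(n_steps):
    """Print one progress line per step and flush after each line."""
    done = 0
    for _ in range(n_steps):
        done += 1
        print("step", done, "of", n_steps)
        flush_stdout()
    return done


if __name__ == "__main__":
    run_with_progress(3)
```

Each `objmode` block re-enters the interpreter, so flushing on every iteration adds overhead; flushing every N steps would likely be cheaper.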
Placing this issue on hold while other repository issues are resolved - this is low priority and can be resolved at a later time. |
This feature addresses #36 and adds parallelization to the distance calculation between observations through the optional Numba library (which JIT-compiles the code for faster run times). While parallelization is confirmed through testing using `htop` (see below screenshot), some further testing is needed before merging into the `dev` branch and later to `main` for public use.
No Parallelization, only Numba JIT
Numba JIT with Parallelization
It should be noted that any speed increases brought by parallelization will not be realized if a pre-existing distance matrix is provided for calculating local outlier probability scores (which is possible with PyNomaly). This has been noted in `readme.md` shortly after introducing the option of parallelization.
Note that in order to function on both an Intel Core Atom (circa 2015, 2 cores) and an Intel Core i9 (circa 2019, 8 cores), a newer version of numba was required, moving from version `0.45.x` to `0.51.2`. Speed improvements, as a percentage of the original speed, were greater on the Atom processor than on the Core i9. Testing on x86 CPU architectures has so far been successful, but Numba seems to be unable to JIT compile the code on IBM Power8 CPUs (>= 16 cores).
The code will now be tested in several different environments prior to merging, with any issues and successes reported here.
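For readers unfamiliar with the approach, the parallel distance calculation described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in `loop.py`: the function name `pairwise_distances` is hypothetical, the full matrix is computed without exploiting symmetry, and the fallback simply runs the same loops in plain Python when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Numba is optional: fall back to serial, uncompiled loops.
    prange = range

    def njit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@njit(parallel=True)
def pairwise_distances(points):
    """Euclidean distance matrix for an (n, dims) float array."""
    n, dims = points.shape
    out = np.zeros((n, n))
    # Each outer iteration writes only to its own row, so rows can run in parallel.
    for i in prange(n):
        for j in range(n):
            total = 0.0
            for k in range(dims):
                diff = points[i, k] - points[j, k]
                total += diff * diff
            out[i, j] = total ** 0.5
    return out


if __name__ == "__main__":
    pts = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
    print(pairwise_distances(pts))
```

When a precomputed distance matrix is passed in, a step like this is skipped entirely, which is why the parallel option brings no benefit in that case.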