Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fallback to threads if no semaphore support #277

Merged
merged 1 commit into from
Nov 29, 2019
Merged

fallback to threads if no semaphore support #277

merged 1 commit into from
Nov 29, 2019

Conversation

justsh
Copy link
Contributor

@justsh justsh commented Nov 15, 2019

Platforms like Android Termux do not have support for semaphores and
thus cannot use multiprocessing.Pool

Use the multiprocessing.dummy package as a fallback, which provides the
Pool API backed by a ThreadPool.

Fixes #229

@@ -16,6 +15,15 @@
from shutil import which
from typing import List

try:
import multiprocessing.synchronize
Copy link
Contributor

@itsayellow itsayellow Nov 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it not possible just to do:

try:
    from multiprocessing import Pool
except ImportError:
    from multiprocessing.dummy import Pool

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. Per the stack trace in #229, the error occurs when a Pool is constructed, not imported, at which point multiprocessing.synchronize fails to import the needed semaphore classes from _multiprocessing.

File "/data/data/com.termux/files/home/.local/pipx/venvs/pipx/lib/python3.7/site-packages/pipx/commands.py", line 638, in list_packages
    with multiprocessing.Pool() as p:
  File "/data/data/com.termux/files/usr/lib/python3.7/multiprocessing/context.py", line 119, in Pool
  # snipped...
    File "/data/data/com.termux/files/usr/lib/python3.7/multiprocessing/context.py", line 66, in Lock
    from .synchronize import Lock

I see that flake8 complains about the unused import, that's my bad. I ran nox locally but didn't see any warnings during linting, will investigate on my end.

I can import both a Pool and a ThreadPool and move the try down to when the Pool is instantiated if that's preferable to a #noqa.

Or just switch to threads, your call.

Copy link
Contributor

@itsayellow itsayellow Nov 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd lean toward # noqa with an explanatory comment for why we are trying to import something we are not using (a brief version of your above comment.)

Probably @cs01 should chime in on his preferences.

@clbarnes
Copy link
Contributor

In general, is there a benefit to using a multiprocessing.Pool over a multiprocessing.dummy.Pool in general? Could we use the "fallback" in all cases to prevent this platform-specific behaviour?

@itsayellow
Copy link
Contributor

itsayellow commented Nov 19, 2019

In general, is there a benefit to using a multiprocessing.Pool over a multiprocessing.dummy.Pool in general? Could we use the "fallback" in all cases to prevent this platform-specific behaviour?

I believe that multiprocessing.dummy uses threads but has the same API as multiprocessing. The advantage to using multiprocessing.Pool would be to really use all the user's processors to execute the tasks in parallel, which should be faster than using threads.

@clbarnes
Copy link
Contributor

Ah, yes, my fault, got confused about the many implementations of concurrency in python.

We possibly should benchmark the use of threads just in case (even if it's just some "anecdotal" benchmarks) - I suspect the bottlenecks in pip installation are downloads, decompressing zip archives, and compilation if necessary. All that's left is really copying files around. All of these things should release the GIL, so threads may not be much slower to run than processes. Multiple CPUs will be used, as I understand it, because python threads are backed by OS threads (again, as I understand it) and it's python which imposes the lock.

However, if processes is the way to go, I think trying to import the synchronization is a good way to decide which concurrency implementation to use. I'd definitely throw in a comment. I'd be happy with a noqa in that circumstance.

@uranusjr
Copy link
Member

uranusjr commented Nov 19, 2019

I suspect the bottlenecks in pip installation are downloads, decompressing zip archives, and compilation if necessary.

Those pip calls are all in subprocesses, which are not GIL-ed anyway, so it doesn’t matter whether you use threads or forks. A benchmark would be useful, and I suspect the difference would be insignificant.

But this really is a whole other topic on the other hand. I’d suggest merging this first (since it does fix a problem without introducing regressions) with the suggested comment, and open a new issue for whether to switch completely to dummy.

Platforms like Android Termux do not have support for semaphores and
thus cannot use `multiprocessing.Pool`

Use the multiprocessing.dummy package as a fallback, which provides the
Pool API backed by a ThreadPool.

Fixes #229
@justsh
Copy link
Contributor Author

justsh commented Nov 22, 2019

I'd lean toward # noqa with an explanatory comment for why we are trying to import something we are not using (a brief version of your above comment.)

Added the #noqa and updated the comment, feel free to nit.

A benchmark would be useful, and I suspect the difference would be insignificant.

FWIW in the limited, localized testing I did before submitting this PR,
using Processes was only faster than Threads by tenths of a second.
As a naive profile I checked out one worktree with multiprocessing and one with multiprocessing.dummy and used /usr/bin/time pipx run --spec=<worktree_path> pipx list against the dozen venvs I already manage with pipx.

@itsayellow
Copy link
Contributor

And after #222 lands, we won't even run a subprocess to get metadata on a given package, so it may matter even less.

Copy link
Contributor

@gaborbernat gaborbernat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm personally ok with this.

@gaborbernat gaborbernat merged commit 53718d9 into pypa:master Nov 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to run 'pipx list' on Termux (Android)
5 participants