Why use ProcessPoolExecutor at all? #44

Open
iiLaurens opened this issue Jul 4, 2022 · 1 comment
Labels
implemented At least you try

Comments

iiLaurens commented Jul 4, 2022

The code comments mention that ProcessPoolExecutor is used in favour of ThreadPoolExecutor, citing the Python GIL as one of the reasons. I would like to argue that ThreadPoolExecutor is perfectly fine in this use case.

First of all, the GIL is only a problem for threads when those threads execute Python code. The GIL allows only one thread at a time to run in the Python interpreter, so for CPU-bound work the other threads are left waiting. For blocking I/O, however, Python releases the GIL while the thread waits for the OS to handle the request, so other threads can continue in the meantime.
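To illustrate the point about blocking waits, here is a minimal sketch. It uses `time.sleep` as a stand-in for a blocking network call (both release the GIL while waiting); `fake_io` and `timed_run` are hypothetical names made up for this example and are not part of the client.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay: float) -> float:
    """Hypothetical stand-in for a blocking I/O call.

    time.sleep releases the GIL while it waits, much like a blocking
    socket read does, so other threads can run in the meantime.
    """
    time.sleep(delay)
    return delay


def timed_run(n_tasks: int, delay: float, max_workers: int) -> float:
    """Run n_tasks fake I/O calls on a thread pool and return the wall time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Consume the iterator so every task has finished before timing stops.
        list(pool.map(fake_io, [delay] * n_tasks))
    return time.perf_counter() - start


if __name__ == "__main__":
    one_worker = timed_run(4, 0.2, max_workers=1)
    four_workers = timed_run(4, 0.2, max_workers=4)
    print(f"1 worker: {one_worker:.2f}s, 4 workers: {four_workers:.2f}s")
```

With four workers the waits overlap, so the total time should be close to a single delay rather than the sum of all four, even though only one thread can hold the GIL at any moment.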

The grobid client is simply a wrapper that sends a batch of POST requests. No heavy calculations are done on the Python side, so using ThreadPoolExecutor is perfectly fine, has much less overhead, and is much less troublesome across different OSes. Would it be possible to make ThreadPoolExecutor the default?
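As a rough sketch of what a thread-based batch could look like (this is not the client's actual code): `post_pdf` below is a hypothetical placeholder that only simulates one POST request with a short sleep, and `process_batch` is likewise an assumed name for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Dict, List, Tuple


def post_pdf(path: str) -> Tuple[str, int]:
    """Hypothetical stand-in for one HTTP POST to the service.

    A real implementation would send the file and return the response
    status; here the network wait is only simulated.
    """
    time.sleep(0.05)  # simulated network latency; the GIL is released here
    return path, 200


def process_batch(paths: List[str], max_workers: int = 4) -> Dict[str, int]:
    """Send one request per path on a thread pool and collect the statuses."""
    results: Dict[str, int] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(post_pdf, p) for p in paths]
        # as_completed yields each future as soon as its request finishes.
        for future in as_completed(futures):
            path, status = future.result()
            results[path] = status
    return results


if __name__ == "__main__":
    print(process_batch(["a.pdf", "b.pdf", "c.pdf"]))
```

Because threads share the parent's memory, nothing has to be pickled and no worker processes need to be spawned, which is where most of the overhead and cross-OS differences of ProcessPoolExecutor come from.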

Owner

kermitt2 commented Jul 4, 2022

Hi @iiLaurens !

Thank you for the issue, you're absolutely right. I am actually using ThreadPoolExecutor in my more recent Python clients for I/O-intensive tasks. I think at the time I wrote this client (4 years ago), I was a bit confused about this aspect and I didn't come back to it afterwards.

I pushed an update replacing ProcessPoolExecutor - see e7710c2

kermitt2 added the implemented At least you try label Jul 15, 2022