-
Notifications
You must be signed in to change notification settings - Fork 358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add check before setting multiprocessing context to prevent the RuntimeError #4620
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xuyumeng Thanks for this. I think the change is pretty clear. Once the CI has a chance to run (and assuming no issues seem to arise), I think we can merge this.
@ahnitz Thanks! I'm currently using pycbc as a toolkit in the pycwb+prefect development. This PR enables the pipeline running in script mode but it throws error again in the service mode. I guess the prefect server will run the children process in Does this multiprocessing context setting affect the functionality of pycbc as a toolkit? If not, is it possible to change raising error to warning so that we can use pycbc within other task management system like prefect? |
@xuyumeng For toolkit functionality it is not usually needed. It is needed in some inference-related cases though as fork is the only method which reliable preserves copy on write (faster and reduces memory bloat because common data never needs to be copied or duplicated). Yes, for your use, it should be fine to change the default start method. I think your intent here is right, e.g. if a choice has already made don't try to set it again. I guess somehow that didn't quite pick up the case here, even if the context was already chosen? Do you have the error? It might provide some clues why your patch didn't work. Naively it looks like it would have handled this case. If that turns out not be possible for some reason, I think the solution is for us to set this in more limited cases, e.g. remove the global configuration for the package and set specifically in the cases we know it is needed. Secondarily, I think prefect uses dask. Have you tried using the fork method with dask? That may not be the better fix though. |
@ahnitz The error occurs because Dask is using import pycbc
from prefect import flow, task
from prefect_dask import DaskTaskRunner
@task
def test_pycbc():
print("pycbc version: ", pycbc.__version__)
return pycbc.__version__
@flow(task_runner=DaskTaskRunner(cluster_kwargs={"n_workers": 4, "processes": True, "threads_per_worker": 1}),
log_prints=True, retries=1)
def pycbc_flow():
for i in range(4):
test_pycbc.submit()
if __name__ == "__main__":
pycbc_flow.serve(name="pycbc-flow") will trigger the line 193 in my patch, since the context is set to I don't know why there is no such issue if I just change the last line to It would be nice if the context is limited to the place where it is needed. I will also have a look at the prefect-dask to see if I can change the default multiprocessing context. |
@xuyumeng I think in this case, you can turn the error into a warning. That way it will make a good faith attempt to set the context, but provide a warning otherwise. It would be fine I think to be a logging warning message so that it can also be silence if someone knows what they are doing at a higher level. |
@xuyumeng I think you just need to rebase this from master so we can check the tests complete. Also can you confirm this solves the issue on your side? If so, once the tests pass, please merge this. |
@ahnitz Hi, yes, this solves my issue. I have a technical question. I used merge to update this branch from master last time and switching to rebase will require a force push. Is it safe to do this for PR? |
@xuyumeng You update the branch to master, so we can double check that the tests pass before merging? That's the only thing preventing this from being added. You can use merge or rebase. It's just better practice to do rebase rather than a merge as it creates problems comparing between branches. |
…me error: context has already been set When `pycbc` is used with some multiprocessing package such as `dask`, importing `pycbc` will cause setting the multiprocessing context repeatedly. The `RuntimeError: context has already been set` will happen. This fix will check the context before setting it.
Change raising error to warning
a29c883
to
bcd0850
Compare
@ahnitz Hi, I updated the branch to master with rebase. |
…meError (gwastro#4620) * Add check before setting multiprocessing context to prevent the runtime error: context has already been set When `pycbc` is used with some multiprocessing package such as `dask`, importing `pycbc` will cause setting the multiprocessing context repeatedly. The `RuntimeError: context has already been set` will happen. This fix will check the context before setting it. * Update __init__.py Change raising error to warning * wrap the lines --------- Co-authored-by: Yumeng Xu <yumeng.xu@physik.uzh.ch>
When
pycbc
is used with some multiprocessing package such asdask
, importingpycbc
will set the multiprocessing context repeatedly. TheRuntimeError: context has already been set
will occur. This fix will check the context before setting it, to make sure it will only be set once.Error message
Example to reproduce the error
OS: MacOS 14.2.1
Create a temporary env
Simple script
test.py
to reproduce the error