Rewrite parallel sampling using multiprocessing #3011
Conversation
I was under the impression that it's simpler to use …
For computations that don't need to communicate with the main process and that are never aborted, I think that is true. But we can profit quite a bit from keeping a channel to the main process in our use case: it makes status updates (the progress bar), writing results to files, and interrupting the sampler much easier. And that is not supported well by …
Another nice bonus: we can return a partial trace now if parallel sampling is interrupted with a KeyboardInterrupt.
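A minimal sketch of that pattern, assuming one worker process per chain and a simple string/tuple message protocol (all names here are illustrative, not the PR's actual API): because the main process sees every draw, it can update a progress bar, write results incrementally, and stop the workers on KeyboardInterrupt while keeping what was already drawn.

```python
import multiprocessing
from multiprocessing.connection import wait


def _worker(remote, n_draws):
    # Hypothetical worker: produce one "draw" at a time, then wait for the
    # main process to acknowledge it before producing the next one.
    for i in range(n_draws):
        draw = i  # stand-in for a real sampler step
        remote.send(("draw", draw))
        if remote.recv() == "abort":
            break
    remote.send(("done", None))
    remote.close()


def sample_parallel(n_chains=2, n_draws=100):
    workers = {}
    for _ in range(n_chains):
        ours, theirs = multiprocessing.Pipe()
        proc = multiprocessing.Process(target=_worker, args=(theirs, n_draws))
        proc.start()
        workers[ours] = proc

    traces = {conn: [] for conn in workers}
    active = set(workers)
    try:
        while active:
            # Wait until any chain has something to report.
            for conn in wait(list(active)):
                kind, value = conn.recv()
                if kind == "done":
                    active.discard(conn)
                    continue
                # A progress-bar update or an incremental write to a file
                # would go here, since the main process sees every draw.
                traces[conn].append(value)
                conn.send("continue")
    except KeyboardInterrupt:
        # Tell the remaining workers to stop; everything collected so far
        # is still available as a partial trace.
        for conn in active:
            try:
                conn.send("abort")
            except BrokenPipeError:
                pass  # the worker may already be gone
    finally:
        for proc in workers.values():
            proc.join()
    return [traces[conn] for conn in workers]


if __name__ == "__main__":
    print([len(trace) for trace in sample_parallel()])
```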
One other option to entertain: since we plan to drop Python 2 support anyhow, we could already start by deprecating parallel sampling for Python 2. I suppose it's not really worth it, because we have something that sort of works and we would just take it away for seemingly no good reason. Yet the aggregate cost of dealing with it is quite high.
What is our timeline on py27?
There are still some tests on py2.7/float32 that are failing (due to issues with the sqlite and text backends), but I think that is just the unrelated #3018. So this is ready for review/merge.
This is the one-progress-bar version, right (i.e., no snail race)?
Yes.
        shape_dtypes[var.name] = (shape, dtype)
    return shape_dtypes


def stop_tuning(self):
Oh this is much better.
LGTM, needs a release note.
self._progress = None
if progressbar:
    self._progress = tqdm_(
You have to add the position argument here so that multiple tqdm bars don't interfere with each other.
There is only one progress bar now, but that progress bar counts samples from all chains.
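A small sketch of what that can look like, assuming the main process receives (chain_index, draw) pairs as the workers deliver them: the tqdm total is draws * chains and the bar ticks once per draw, no matter which chain produced it. (If one bar per chain were wanted instead, tqdm's position argument would be the way to keep the bars from clobbering each other, as suggested above.)

```python
from tqdm import tqdm


def track_progress(draws_iterator, draws, chains):
    # draws_iterator is assumed to yield (chain_index, draw) pairs in the
    # order the worker processes deliver them.
    with tqdm(total=draws * chains) as progress:
        for chain_index, draw in draws_iterator:
            yield chain_index, draw
            progress.update(1)  # one shared bar counting samples from all chains
```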
This PR gets rid of joblib altogether, and replaces it with a custom implementation.
This should solve some long-standing problems:

- Very large traces can't be sent through pipe.send, so they can't use multiprocessing right now.

We start one process per chain, which communicates with the main thread by sending messages through a pipe. Once a draw is ready, it is stored in shared memory, and the main process can access it. Then the sampler process is asked to write the next sample to that shared memory.
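A rough sketch of that handshake for a single chain, with illustrative names rather than the PR's actual API: the worker writes each draw into a shared buffer, signals over the pipe that it is ready, and only overwrites the buffer after the main process has asked for the next draw.

```python
import multiprocessing
from multiprocessing.sharedctypes import RawArray

import numpy as np


def _chain_worker(remote, shared_buffer, size, n_draws):
    # View the shared buffer as a numpy array and write one draw at a time
    # into it, waiting for the main process before overwriting it.
    draw = np.frombuffer(shared_buffer, dtype="float64")
    for _ in range(n_draws):
        draw[:] = np.random.randn(size)  # stand-in for one real sampler step
        remote.send("draw_ready")
        if remote.recv() != "next":
            break
    remote.send("done")


def run_chain(size=10, n_draws=5):
    shared_buffer = RawArray("d", size)
    ours, theirs = multiprocessing.Pipe()
    proc = multiprocessing.Process(
        target=_chain_worker, args=(theirs, shared_buffer, size, n_draws)
    )
    proc.start()

    view = np.frombuffer(shared_buffer, dtype="float64")
    trace = []
    while ours.recv() == "draw_ready":
        # Copy the draw out of shared memory before letting the worker
        # overwrite it with the next one.
        trace.append(view.copy())
        ours.send("next")
    proc.join()
    return trace


if __name__ == "__main__":
    print(len(run_chain()))
```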
Performance-wise it shouldn't be much different; the pipes seem to be fast enough to keep up with our sampler. It is possible that we gain a bit of speed for large models with cheap computations, as there is now a dedicated process for sampling that doesn't also have to deal with storing data in the trace. For very small models it might be a bit slower, as there is some small constant overhead for sending messages.
Since I use a couple of py3-only features, the old code is still used on Python < 3.
TODO

- set_custom_exc
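The set_custom_exc item presumably refers to IPython's InteractiveShell.set_custom_exc hook, which lets us register a custom handler for selected exception types, for example so that an error already reported by a worker process does not get a second full traceback. A hedged sketch (the exception class here is illustrative, not necessarily what the PR ends up using):

```python
from IPython import get_ipython


class ParallelSamplingError(Exception):
    """Illustrative stand-in for an exception re-raised from a worker process."""


def _handler(shell, etype, value, tb, tb_offset=None):
    # Fall back to IPython's normal traceback rendering; a real handler could
    # trim the frames that only belong to the message-passing machinery.
    shell.showtraceback((etype, value, tb), tb_offset=tb_offset)


ipython = get_ipython()
if ipython is not None:  # only has an effect inside IPython/Jupyter
    ipython.set_custom_exc((ParallelSamplingError,), _handler)
```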