
Allow prior on gpu #519

Merged: 5 commits merged into main, Jul 8, 2021
Conversation

@janfb (Contributor) commented Jul 5, 2021

Relates to #515

Up to now we assumed the prior lived on the CPU, and we moved samples to .cpu() whenever combining log_probs from the prior and the net.

This PR allows the prior to live on the GPU, and it asserts that the prior lives on the same device as the device passed for training.

Pro: we don't need to move to .cpu() all the time.
Con:

  • All the numpy-based MCMC methods naturally run on the CPU. When evaluating theta under the prior, we now have to move it to the prior's device (instead of doing net.log_prob(theta).cpu()).
  • The user now gets an AssertionError when the prior was not initialised on the device. This was not the case before. Alternatively, we could introduce or deduce a prior_device and take care of moving things around internally. Any opinions on that?

I haven't profiled it, but I think this approach is faster than the old one because we move things around less. And if we one day implement vectorized MCMC in torch, we might get speed-ups when running it on the GPU.

closes #515
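The device check described above could look roughly like the following sketch. The helper name `check_prior_device` is illustrative, not the actual sbi API; the idea is just to compare the device of a prior sample against the training device and fail loudly on a mismatch.

```python
# Hypothetical sketch: assert that samples drawn from the prior live on
# the device chosen for training. Not the merged sbi implementation.
import torch
from torch.distributions import MultivariateNormal


def check_prior_device(prior, device: str) -> None:
    """Raise if the prior's samples do not live on the training device."""
    sample = prior.sample((1,))  # device of the prior's parameters
    prior_device = sample.device
    training_device = torch.device(device)
    assert prior_device.type == training_device.type, (
        f"Prior is on {prior_device}, but training runs on {training_device}. "
        "Initialise the prior on the training device."
    )


prior = MultivariateNormal(torch.zeros(2), torch.eye(2))
check_prior_device(prior, "cpu")  # passes: the prior's tensors are on the CPU
```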

@janfb janfb self-assigned this Jul 5, 2021
@michaeldeistler (Contributor):
This is great, thanks! It'll also be useful e.g. when the prior is a previous posterior.

I do not have a strong opinion on the assertion -- I have a slight preference for automatically moving the prior and giving a warning though. This is also what we do if the simulations lie on the GPU, so it feels natural to do the same thing for the prior. As I said, no strong opinions though. Feel free to merge if you think assert is better.

@janfb (Contributor, Author) commented Jul 6, 2021

Thanks @michaeldeistler!

Regarding moving the prior to the GPU: I haven't actually found a way to do this. One would have to create a new prior instance with the parameters living on the GPU, and for that one would basically need a long if-else construct to check all possible prior types...
So what I meant was to allow the case where the prior lives on the CPU but training happens on the GPU, and to then fall back to the old solution of moving everything to the CPU whenever the prior is involved (and maybe issue a warning that it might be better to pass a prior on the GPU). The other case, prior on GPU and training on CPU, doesn't make much sense and would throw an error.
Would you agree?
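The fallback logic proposed here could be sketched as follows. This is illustrative, not the sbi implementation; `resolve_prior_device` is a hypothetical name:

```python
# Sketch of the proposed fallback: prior on CPU with training on GPU
# falls back to CPU evaluation with a warning; prior on GPU with
# training on CPU is treated as an error.
import warnings


def resolve_prior_device(prior_device: str, training_device: str) -> str:
    """Return the device on which prior evaluations should happen."""
    if prior_device == training_device:
        return training_device
    if prior_device == "cpu":  # training on GPU, prior left on CPU
        warnings.warn(
            "Prior is on the CPU but training runs on the GPU; prior "
            "evaluations will move data to the CPU. Consider "
            "initialising the prior on the GPU."
        )
        return "cpu"
    # Prior on GPU while training on CPU makes no sense.
    raise ValueError(
        f"Prior on {prior_device} but training on {training_device}."
    )
```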

@michaeldeistler (Contributor):
That makes sense.

I think my preferred way would then be to stick to the current implementation using assert. However, we should add the device argument to BoxUniform, otherwise the error message will be confusing for users.
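A BoxUniform with a device argument could be sketched like this. sbi's BoxUniform is an Independent Uniform over a box; the snippet below mirrors that design but is illustrative rather than the merged code:

```python
# Sketch of a BoxUniform accepting a `device` argument, as suggested
# above. Placing low/high on `device` makes sample() and log_prob()
# return tensors on that device.
import torch
from torch.distributions import Independent, Uniform


class BoxUniform(Independent):
    def __init__(self, low, high, device: str = "cpu"):
        low = torch.as_tensor(low, dtype=torch.float32, device=device)
        high = torch.as_tensor(high, dtype=torch.float32, device=device)
        # reinterpret the event dims so log_prob sums over dimensions
        super().__init__(Uniform(low, high), 1)


prior = BoxUniform(low=[-1.0, -1.0], high=[1.0, 1.0], device="cpu")
theta = prior.sample((10,))
assert theta.device.type == "cpu" and theta.shape == (10, 2)
```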

codecov-commenter commented Jul 6, 2021

Codecov Report

Merging #519 (737e4cd) into main (df147b1) will increase coverage by 0.08%.
The diff coverage is 82.92%.

❗ Current head 737e4cd differs from pull request most recent head 6e73531. Consider uploading reports for the commit 6e73531 to get more accurate results.

@@            Coverage Diff             @@
##             main     #519      +/-   ##
==========================================
+ Coverage   67.70%   67.79%   +0.08%     
==========================================
  Files          55       55              
  Lines        3970     3965       -5     
==========================================
  Hits         2688     2688              
+ Misses       1282     1277       -5     
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 67.79% <82.92%> (+0.08%) | ⬆️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| sbi/analysis/conditional_density.py | 75.00% <ø> (ø) | |
| sbi/utils/sbiutils.py | 71.09% <33.33%> (ø) | |
| ...inference/posteriors/likelihood_based_posterior.py | 72.22% <60.00%> (-0.76%) | ⬇️ |
| sbi/inference/posteriors/base_posterior.py | 66.00% <75.00%> (ø) | |
| sbi/inference/posteriors/ratio_based_posterior.py | 78.82% <75.00%> (+1.55%) | ⬆️ |
| sbi/inference/posteriors/direct_posterior.py | 79.61% <87.50%> (+1.83%) | ⬆️ |
| sbi/inference/base.py | 78.41% <100.00%> (ø) | |
| sbi/inference/snpe/snpe_a.py | 66.50% <100.00%> (ø) | |
| sbi/mcmc/slice.py | 98.57% <100.00%> (ø) | |
| sbi/utils/metrics.py | 36.90% <100.00%> (+0.76%) | ⬆️ |
| ... and 2 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update df147b1...6e73531.

@janfb (Contributor, Author) commented Jul 6, 2021

@michaeldeistler I refactored the potential function to handle the devices efficiently. Do you still approve?
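The device-aware potential evaluation discussed here might look roughly like the sketch below (e.g. for a likelihood-based posterior, where the potential combines the net's and the prior's log-probs). This is an illustrative reconstruction, not the merged sbi code; `posterior_potential` and the stand-in net are hypothetical:

```python
# Illustrative sketch: evaluate the net on the training device and the
# prior on its own device, then combine the log-probs on the training
# device. Not the actual sbi implementation.
import torch
from torch.distributions import MultivariateNormal


def posterior_potential(net_log_prob, prior, theta, training_device="cpu"):
    """Unnormalized log posterior: net log_prob + prior log_prob."""
    # The net is evaluated on the training device ...
    net_lp = net_log_prob(theta.to(training_device))
    # ... while theta is moved to the prior's device for the prior term.
    prior_device = prior.sample((1,)).device
    prior_lp = prior.log_prob(theta.to(prior_device))
    # Combine both terms on the training device.
    return net_lp + prior_lp.to(training_device)


prior = MultivariateNormal(torch.zeros(2), torch.eye(2))
fake_net = lambda t: torch.zeros(t.shape[0])  # stand-in for a trained net
pot = posterior_potential(fake_net, prior, torch.randn(5, 2))
assert pot.shape == (5,)
```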

@michaeldeistler (Contributor) left a comment:

It looks good, thanks. I made two comments. I think replacing get_potential with posterior_potential would make sense, no?

Review comment on sbi/inference/posteriors/direct_posterior.py (outdated, resolved)
@janfb janfb force-pushed the prior-on-gpu branch 3 times, most recently from 1989fbd to 5fa0770 Compare July 8, 2021 06:25
@janfb
Copy link
Contributor Author

janfb commented Jul 8, 2021

Slow tests are passing now. I had to fix a few small things; importantly, I had to remove the HMC tests with a uniform prior because they were taking forever. I hope we can fix that by moving to unconstrained space via #510.

@janfb janfb merged commit fb587e9 into main Jul 8, 2021
@janfb janfb deleted the prior-on-gpu branch July 8, 2021 08:37
@janfb janfb mentioned this pull request Jul 8, 2021
3 participants