
Multi-round inference using cuda, device error #520

Closed
NeuroLife77 opened this issue Jul 5, 2021 · 3 comments


@NeuroLife77

Hi,

I am not sure if the issue is caused by my misuse or by an actual bug, but I thought I'd report it anyway.

I tried running multi-round inference with the same syntax as in the tutorial, but I defined the SNPE object with a specific number of hidden features and transforms and specified cuda as the device. Everything works on the first round: I simulate, append the simulations, train, build the posterior, and then set the proposal. But on the second round it crashes at train(), saying that it expected all tensors to be on the same device but found tensors on two different devices (CPU and cuda). Here's a screenshot of the traceback:

[screenshot of the traceback]
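
For reference, a minimal sketch of the workflow described above, following the multi-round SNPE pattern from the sbi tutorials (the simulator, prior bounds, observation, and hyperparameter values here are illustrative placeholders, not the original code):

```python
import torch
from sbi.inference import SNPE
from sbi.utils import BoxUniform, posterior_nn

# Illustrative placeholders: 3-dimensional uniform prior, trivial simulator.
prior = BoxUniform(low=-torch.ones(3), high=torch.ones(3))
simulator = lambda theta: theta + 0.1 * torch.randn_like(theta)
x_o = torch.zeros(3)  # hypothetical observation

# Density estimator with a custom number of hidden features and transforms,
# trained on the GPU.
density_estimator = posterior_nn(model="maf", hidden_features=100, num_transforms=10)
inference = SNPE(prior=prior, density_estimator=density_estimator, device="cuda")

proposal = prior
for _ in range(2):
    theta = proposal.sample((500,))
    x = simulator(theta)
    estimator = inference.append_simulations(theta, x, proposal=proposal).train()
    posterior = inference.build_posterior(estimator)
    proposal = posterior.set_default_x(x_o)
# In the affected sbi version, the second round's train() raises:
# RuntimeError: Expected all tensors to be on the same device,
# but found at least two devices, cuda:0 and cpu
```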

@janfb
Contributor

janfb commented Jul 6, 2021

Hi @NeuroLife77, and thanks for reporting this. Indeed, this seems to be a bug. A quick fix was already suggested in #515, and we are working on a permanent fix in #519.

I think the problem is that in this case self._prior is the previous posterior, which lives on the GPU, whereas we generally assumed that the prior lives on the CPU.
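
As a plain-PyTorch illustration of the underlying problem (not sbi code): any operation that mixes a CPU tensor with a CUDA tensor raises exactly this error.

```python
import torch

if torch.cuda.is_available():
    on_gpu = torch.zeros(3, device="cuda")  # e.g., samples from the previous posterior
    on_cpu = torch.zeros(3)                 # e.g., samples from a CPU-based prior
    on_gpu + on_cpu  # RuntimeError: Expected all tensors to be on the same device ...
```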

Another quick fix is to not train on the GPU ;) Are you expecting a big speed-up from training on the GPU? E.g., are you using large or convolutional embedding nets?

Best,
Jan

@NeuroLife77
Author

Ok, thank you.

There is a relatively significant speed-up from training on the GPU, but I am not using any embedding nets. I think it's just because I'm using a large number of hidden features and transforms, since that seems to be what works best for me right now.

However, since the main bottleneck for multi-round inference is the simulation time, I think training on the CPU will not slow things down significantly.

@janfb
Contributor

janfb commented Jul 8, 2021

@NeuroLife77, I see, that makes sense with the GPU speed-up.

The fix is in main now and should be released soon. Note that from now on, when training on the GPU, the prior you pass should live on the GPU as well; see the note in the FAQ for more details.
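
A minimal sketch of what this means in practice, assuming a BoxUniform prior (dimensions are illustrative): construct the prior from tensors that already live on the GPU and pass the same device to the inference object.

```python
import torch
from sbi.inference import SNPE
from sbi.utils import BoxUniform

device = "cuda"

# Prior tensors live on the GPU, matching the training device.
prior = BoxUniform(
    low=-torch.ones(3, device=device),
    high=torch.ones(3, device=device),
)
inference = SNPE(prior=prior, device=device)
```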
