
Multi-round inference using cuda, device error #520

Closed
NeuroLife77 opened this issue Jul 5, 2021 · 3 comments


@NeuroLife77

Hi,

I am not sure if the issue is caused by my misuse or by an actual bug, but I thought I'd report it anyway.

I tried running multi-round inference with the same syntax as in the tutorial, but I defined the SNPE object with a specific number of hidden features and transforms and specified cuda as the device. Everything works on the first round: I simulate, append the simulations, train, build the posterior, and then set the proposal. But on the second round it crashes at train(), saying that it expected all tensors to be on the same device but found tensors on two different devices (CPU and cuda). Here's a screenshot of the traceback:

[screenshot of the traceback]
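
For reference, a minimal sketch of the workflow described above, following the multi-round SNPE pattern from the sbi tutorials (the simulator, prior bounds, observation, and hyperparameter values here are illustrative placeholders, not the original code):

```python
import torch
from sbi.inference import SNPE
from sbi.utils import BoxUniform, posterior_nn

# Illustrative placeholders: 3-dimensional uniform prior, trivial simulator.
prior = BoxUniform(low=-torch.ones(3), high=torch.ones(3))
simulator = lambda theta: theta + 0.1 * torch.randn_like(theta)
x_o = torch.zeros(3)  # hypothetical observation

# Density estimator with a custom number of hidden features and transforms,
# trained on the GPU.
density_estimator = posterior_nn(model="maf", hidden_features=100, num_transforms=10)
inference = SNPE(prior=prior, density_estimator=density_estimator, device="cuda")

proposal = prior
for _ in range(2):
    theta = proposal.sample((500,))
    x = simulator(theta)
    estimator = inference.append_simulations(theta, x, proposal=proposal).train()
    posterior = inference.build_posterior(estimator)
    proposal = posterior.set_default_x(x_o)
# In the affected sbi version, the second round's train() raises:
# RuntimeError: Expected all tensors to be on the same device,
# but found at least two devices, cuda:0 and cpu
```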

@janfb
Contributor

janfb commented Jul 6, 2021

Hi @NeuroLife77, and thanks for reporting this. Indeed, this seems to be a bug. A quick fix was already suggested in #515, and we are working on a permanent fix in #519.

I think the problem is that in this case self._prior is the previous posterior, which lives on the GPU, whereas we generally assumed that the prior lives on the CPU.
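
As a plain-PyTorch illustration of the underlying problem (not sbi code): any operation that mixes a CPU tensor with a CUDA tensor raises exactly this error.

```python
import torch

if torch.cuda.is_available():
    on_gpu = torch.zeros(3, device="cuda")  # e.g., samples from the previous posterior
    on_cpu = torch.zeros(3)                 # e.g., samples from a CPU-based prior
    on_gpu + on_cpu  # RuntimeError: Expected all tensors to be on the same device ...
```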

Another quick fix is to not train on the GPU ;) Are you expecting a big speed-up from training on the GPU? E.g., are you using large or convolutional embedding nets?

Best,
Jan

@NeuroLife77
Author

Ok, thank you.

There is a relatively significant speed-up from training on the GPU, but I am not using any embedding nets. I think it's just because I'm using a large number of hidden features and transforms, since that seems to be what works best for me right now.

However, since the main bottleneck for multi-round inference is the simulation time, I think training on the CPU will not slow things down significantly.

@janfb
Contributor

janfb commented Jul 8, 2021

@NeuroLife77, I see, that makes sense with the GPU speed-up.

The fix is in main now and should be released soon. Note that from now on, when training on the GPU, the prior you pass should live on the GPU as well; see the note in the FAQ for more details.
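
A minimal sketch of what this means in practice, assuming a BoxUniform prior (dimensions are illustrative): construct the prior from tensors that already live on the GPU and pass the same device to the inference object.

```python
import torch
from sbi.inference import SNPE
from sbi.utils import BoxUniform

device = "cuda"

# Prior tensors live on the GPU, matching the training device.
prior = BoxUniform(
    low=-torch.ones(3, device=device),
    high=torch.ones(3, device=device),
)
inference = SNPE(prior=prior, device=device)
```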
