n_epochs > 1 : multinomial sampling error probs<0 #13

wasd12345 · 2018-12-21T08:37:57Z

Hi, for both the sorting task and the TSP task, when I run with any n_epochs > 1, I get the error below near the start of the 2nd epoch. It comes from the multinomial sampling in the stochastic decoder in neural_combinatorial_rl.py. When I print out "probs", I see that each row very quickly converges to a one-hot distribution (all 0s and a single 1), and on the following iteration every entry is nan. Any idea what's going on here, or did you just avoid this by running 1 epoch with many iterations?

Thanks!

(using the pytorch-0.4 branch)

(below is shown for the sort_10 task)

probs
tensor([[0.0000, 0.0000, 0.0000, 0.5416, 0.4584, 0.0000],
[0.0000, 0.0000, 0.5079, 0.4921, 0.0000, 0.0000],
[0.0000, 0.5320, 0.0000, 0.0000, 0.0000, 0.4680],
[0.0000, 0.0000, 0.0000, 0.5146, 0.4854, 0.0000],
[0.0000, 0.0000, 0.0000, 0.0000, 0.5275, 0.4725],
[0.0000, 0.0000, 0.4794, 0.0000, 0.0000, 0.5206],
[0.4945, 0.5055, 0.0000, 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.5195, 0.0000, 0.4805, 0.0000],
[0.5135, 0.0000, 0.0000, 0.0000, 0.4865, 0.0000],
[0.4877, 0.0000, 0.5123, 0.0000, 0.0000, 0.0000],
[0.5207, 0.0000, 0.0000, 0.0000, 0.0000, 0.4793],
[0.0000, 0.5231, 0.0000, 0.0000, 0.0000, 0.4769]],
grad_fn=)
probs
tensor([[0., 0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0., 1.],
[0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0., 1.],
[0., 0., 1., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1., 0.],
[0., 0., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 1.]], grad_fn=)
probs
tensor([[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan]], grad_fn=)

... leads to...

Traceback (most recent call last):
  File "trainer.py", line 295, in <module>
    R, probs, actions, actions_idxs = model(bat)
  File "C:\Users\me\Anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\me\Desktop\neural-combinatorial-rl-pytorch\neural_combinatorial_rl.py", line 514, in forward
    probs_, action_idxs = self.actor_net(embedded_inputs)
  File "C:\Users\me\Anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\me\Desktop\neural-combinatorial-rl-pytorch\neural_combinatorial_rl.py", line 374, in forward
    enc_h)
  File "C:\Users\me\Anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\me\Desktop\neural-combinatorial-rl-pytorch\neural_combinatorial_rl.py", line 189, in forward
    selections)
  File "C:\Users\me\Desktop\neural-combinatorial-rl-pytorch\neural_combinatorial_rl.py", line 255, in decode_stochastic
    idxs = c.sample()
  File "C:\Users\me\Anaconda3\envs\myenv\lib\site-packages\torch\distributions\categorical.py", line 90, in sample
    sample_2d = torch.multinomial(probs_2d, 1, True)
RuntimeError: invalid argument 2: invalid multinomial distribution (encountering probability entry < 0) at c:\programdata\miniconda3\conda-bld\pytorch_1533090623466\work\aten\src\th\generic/THTensorRandom.cpp:407

pemami4911 · 2018-12-21T15:13:09Z

I don't remember ever coming across this, sorry. I'm not able to investigate this myself but if you find out why this is happening, feel free to reply here and let me know.
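
For anyone debugging the same symptom, here is a minimal sketch of a check that could be run on "probs" just before sampling, to find the first row that stops being a valid distribution. It is not code from this repository: the helper name `first_invalid_row` and the tolerance are made up for illustration, and it works on plain Python lists (for example the result of `probs.tolist()`, assuming that call is available in your PyTorch version). It also shows that NaN fails a ">= 0" test without being negative, which may be why the sampler reports "probability entry < 0" for an all-nan tensor.

```python
import math


def first_invalid_row(probs, tol=1e-4):
    """Return (row_index, reason) for the first row that is not a valid
    probability distribution, or None if every row is valid.

    `probs` is a list of rows of floats (hypothetical helper, not part of
    the repository).
    """
    for i, row in enumerate(probs):
        if any(math.isnan(p) for p in row):
            return i, "nan"
        if any(p < 0 for p in row):
            return i, "negative"
        if abs(sum(row) - 1.0) > tol:
            return i, "does not sum to 1"
    return None


# NaN is neither >= 0 nor < 0, so a check written as "p >= 0" rejects it.
nan = float("nan")
assert not (nan >= 0) and not (nan < 0)

one_hot = [[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]]  # like the second printout: still valid
all_nan = [[nan] * 6]                        # like the third printout

print(first_invalid_row(one_hot))  # None
print(first_invalid_row(all_nan))  # (0, 'nan')
```

Since the one-hot rows still pass this check, the nan values seem to appear during the update that follows them rather than in the sampling call itself. Running the same kind of check on the logits and on the model parameters after each optimizer step might narrow down where the first non-finite value comes from.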
