n_epochs > 1 : multinomial sampling error probs<0 #13

Open
wasd12345 opened this issue Dec 21, 2018 · 1 comment

Comments

@wasd12345

Hi, for both the sorting task and the TSP task, when I run with any n_epochs > 1, I get the following error toward the beginning of the 2nd epoch. It comes from multinomial sampling in the stochastic decoder in neural_combinatorial_rl.py. When I print out "probs", I see that they very quickly collapse to a one-hot distribution (all 0s and a single 1), and on the following iteration they are all NaN. Any idea what's going on here, or did you just end up avoiding this by running 1 epoch with many iterations?

Thanks!

(using the pytorch-0.4 branch)

(below is shown for the sort_10 task)

probs
tensor([[0.0000, 0.0000, 0.0000, 0.5416, 0.4584, 0.0000],
[0.0000, 0.0000, 0.5079, 0.4921, 0.0000, 0.0000],
[0.0000, 0.5320, 0.0000, 0.0000, 0.0000, 0.4680],
[0.0000, 0.0000, 0.0000, 0.5146, 0.4854, 0.0000],
[0.0000, 0.0000, 0.0000, 0.0000, 0.5275, 0.4725],
[0.0000, 0.0000, 0.4794, 0.0000, 0.0000, 0.5206],
[0.4945, 0.5055, 0.0000, 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.5195, 0.0000, 0.4805, 0.0000],
[0.5135, 0.0000, 0.0000, 0.0000, 0.4865, 0.0000],
[0.4877, 0.0000, 0.5123, 0.0000, 0.0000, 0.0000],
[0.5207, 0.0000, 0.0000, 0.0000, 0.0000, 0.4793],
[0.0000, 0.5231, 0.0000, 0.0000, 0.0000, 0.4769]],
grad_fn=)
probs
tensor([[0., 0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0., 1.],
[0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0., 1.],
[0., 0., 1., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1., 0.],
[0., 0., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 1.]], grad_fn=)
probs
tensor([[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan]], grad_fn=)

... leads to...

Traceback (most recent call last):
  File "trainer.py", line 295, in <module>
    R, probs, actions, actions_idxs = model(bat)
  File "C:\Users\me\Anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\me\Desktop\neural-combinatorial-rl-pytorch\neural_combinatorial_rl.py", line 514, in forward
    probs_, action_idxs = self.actor_net(embedded_inputs)
  File "C:\Users\me\Anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\me\Desktop\neural-combinatorial-rl-pytorch\neural_combinatorial_rl.py", line 374, in forward
    enc_h)
  File "C:\Users\me\Anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\me\Desktop\neural-combinatorial-rl-pytorch\neural_combinatorial_rl.py", line 189, in forward
    selections)
  File "C:\Users\me\Desktop\neural-combinatorial-rl-pytorch\neural_combinatorial_rl.py", line 255, in decode_stochastic
    idxs = c.sample()
  File "C:\Users\me\Anaconda3\envs\myenv\lib\site-packages\torch\distributions\categorical.py", line 90, in sample
    sample_2d = torch.multinomial(probs_2d, 1, True)
RuntimeError: invalid argument 2: invalid multinomial distribution (encountering probability entry < 0) at c:\programdata\miniconda3\conda-bld\pytorch_1533090623466\work\aten\src\th\generic/THTensorRandom.cpp:407

@pemami4911
Owner

I don't remember ever coming across this, sorry. I'm not able to investigate this myself but if you find out why this is happening, feel free to reply here and let me know.
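
For anyone who lands here with the same traceback, here is a small framework-free sketch of one possible reading of the output above. It is an assumption, not something confirmed in this thread: the "probability entry < 0" message may really be reporting the NaN rows, since a check of the form `p >= 0` fails for NaN just as it does for a negative number. A second guess is that the one-hot rows printed just before the NaNs involve a log of an exact zero somewhere in the loss, which would be infinite. The helper names `validate_probs` and `safe_log` and the `eps` value are made up for illustration and are not part of the repository.

```python
import math


def validate_probs(row, tol=1e-6):
    """Return a list of problems found in one row of probabilities.

    Hypothetical helper: meant to be called right before sampling,
    so a bad row is reported with its real cause.
    """
    problems = []
    if any(math.isnan(p) for p in row):
        problems.append("nan")
    if any(p < 0 for p in row):
        problems.append("negative")
    # Only check the sum when the entries themselves are sane.
    if not problems and abs(sum(row) - 1.0) > tol:
        problems.append("does not sum to 1")
    return problems


def safe_log(p, eps=1e-8):
    """log(p) with p clamped away from zero (eps is an arbitrary choice)."""
    return math.log(max(p, eps))


nan = float("nan")

# Any comparison with NaN is False, so a ">= 0" sanity check rejects NaN
# even though NaN is not actually negative.
print(nan >= 0)  # False

# The all-NaN row is flagged as "nan", not as "negative".
print(validate_probs([nan] * 6))  # ['nan']

# A one-hot row like the ones printed before the NaNs is still valid.
print(validate_probs([0.0, 0.0, 0.0, 1.0, 0.0, 0.0]))  # []

# The log of an exact zero is not finite; the clamped version is.
print(math.isfinite(safe_log(0.0)))  # True
```

If the NaNs do come from this kind of collapse, the same two checks could be applied to the `probs` tensor just before `c.sample()` in `decode_stochastic`, to find the first iteration where a non-finite value appears. Whether clamping is the right fix for this model is a separate question that this sketch does not answer.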
