
Fails on zero grad #21

Open
lemmersj opened this issue Aug 20, 2019 · 4 comments

@lemmersj (Collaborator)

In instances where a neuron doesn't factor into the loss (e.g., a component of the loss is disabled for a specific experiment, resulting in a neuron or set of neurons being unused), autograd returns None for the unused connections. This results in a crash at the line:

param.grad *= 1./float(args['psuedo_batch_loop']*args['batch_size'])

With the error:

TypeError: unsupported operand type(s) for *=: 'NoneType' and 'float'

This can be remedied by inserting:
if param.grad is not None:
prior to the line in question, but I'm unsure of any upstream consequences.
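For context, a minimal sketch of where the guard would sit (assuming the averaging step in train.py iterates over model.parameters(); args and model come from the surrounding training script):

```python
# Hypothetical sketch of the gradient-averaging step with the proposed guard.
# Parameters that never entered the loss graph have param.grad == None,
# so we skip them rather than crash on NoneType *= float.
scale = 1. / float(args['psuedo_batch_loop'] * args['batch_size'])
for param in model.parameters():
    if param.grad is not None:
        param.grad *= scale
```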

@lemmersj added the bug label Aug 20, 2019
@natlouis (Collaborator)

That should've been fixed in issue #7 with the following line: https://github.com/MichiganCOG/ViP/blob/dev/train.py#L182.

Have you pulled this version from dev?

@lemmersj (Collaborator, Author)

I'm using an older version (after pulling from master, I immediately made train.py unmergeable). My mistake for missing that issue.

@lemmersj (Collaborator, Author)

I came back to this --- it appears the modification in the dev branch resolves a different problem. That is, the weights that are causing an issue for me are not frozen, but have no gradient because they do not contribute to the loss.

Consider three regression nodes --- yaw, pitch, and roll. I modify training to only regress yaw by performing backpropagation on that node directly. The weights leading into the pitch and roll nodes are left as None by autograd after loss.backward(), and thus fail at the cited line.
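A minimal sketch reproducing this (hypothetical model; the names are illustrative, not from ViP):

```python
import torch
import torch.nn as nn

# Shared trunk feeding three regression heads.
trunk = nn.Linear(8, 16)
yaw_head = nn.Linear(16, 1)
pitch_head = nn.Linear(16, 1)
roll_head = nn.Linear(16, 1)

x = torch.randn(4, 8)
features = torch.relu(trunk(x))

# Backpropagate through the yaw head only.
loss = yaw_head(features).pow(2).mean()
loss.backward()

print(trunk.weight.grad is None)       # False: the trunk contributes to the loss
print(yaw_head.weight.grad is None)    # False
print(pitch_head.weight.grad is None)  # True: never entered the graph
print(roll_head.weight.grad is None)   # True

# Note: pitch_head and roll_head are NOT frozen --
# pitch_head.weight.requires_grad is still True, yet .grad stays None.
```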

@lemmersj reopened this Sep 17, 2019
@natlouis self-assigned this Sep 18, 2019
@ehofesmann (Member)

Can you post your code? The training script plus the relevant loss and model files; a GitHub link would work.

@Byronnar mentioned this issue Dec 5, 2019