DataParallel DeepLift Fixes #335
Conversation
Updated to use 1 hook instead of 2.
Awesome! Thank you! I'll take a look!
LGTM! Thank you for making it work with one additional hook!
@vivekmig has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: This PR adds only the bug fixes identified from refactoring and adding dynamic tests.

1. Additional forward arguments were not working correctly with DeepLift on a DataParallel model, because the device split of the expanded additional forward args did not necessarily match that of the inputs. The behavior has been changed to expand the additional args in the hook function (after the device split), which ensures the additional args and inputs remain matched (first sketch below).
2. Different targets per example were not working correctly with DeepLift on a DataParallel model, because the model output concatenated the per-device outputs, which mixed input and baseline outputs and prevented matching each input example to its target. Additional forward hooks have been added so the output returns all inputs followed by all baselines (second sketch below).
3. GradCAM is primarily intended for layers with >= 3 dimensions, since it computes the average gradient per example and channel. For layers with 2 dimensions, the mean gradient over all dimensions was being taken; this has been updated to use the layer gradients directly in that case, which better aligns with the behavior for >= 3 dimensions (third sketch below).
4. DeepLiftShap (and its Neuron / Layer variants) were incorrectly repeating additional forward args; this has been fixed to use repeat_interleave instead (fourth sketch below).

Pull Request resolved: pytorch#335
Reviewed By: edward-io
Differential Revision: D20844511
Pulled By: vivekmig
fbshipit-source-id: c895b348c3d5c56355c39d429947f2f36dda37a7
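A minimal sketch of the idea behind fix 1, with hypothetical module and hook names (this is not the Captum implementation): the input/baseline expansion and the matching expansion of additional forward args both happen inside a forward pre-hook, i.e. per replica, after DataParallel has scattered the batch, so args and inputs stay aligned on every device.

```python
import torch
import torch.nn as nn

class WithExtraArgs(nn.Module):
    def forward(self, x, extra):
        # `extra` must have the same batch size as `x`.
        return (x * extra).sum(dim=1)

inner = WithExtraArgs()
baseline = torch.zeros(1, 3)  # broadcastable baseline, for illustration

def expand_pre_hook(module, inputs):
    # Runs on each replica with that device's shard of the batch.
    x, extra = inputs
    # Pair every local example with its baseline, and expand the
    # additional args identically, so row i of `extra` still describes
    # row i of the expanded local batch.
    expanded_x = torch.cat([x, baseline.expand_as(x)])
    expanded_extra = torch.cat([extra, extra])
    return expanded_x, expanded_extra

inner.register_forward_pre_hook(expand_pre_hook)
model = nn.DataParallel(inner)

out = model(torch.rand(8, 3), torch.rand(8, 3))  # 16 output rows
```

Expanding before the scatter instead would split the already-doubled args independently of the doubled inputs, so a device's args shard could describe a different set of examples than its input shard.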
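For fix 2, a hedged illustration of the reordering (not the actual Captum hook): if each device's output chunk holds its local inputs' outputs followed by its local baselines' outputs, the gathered tensor interleaves the two groups, and a reorder recovers the layout DeepLift expects, all input outputs first, then all baseline outputs.

```python
import torch

def reorder_gathered_output(gathered, num_devices):
    # Each device's chunk is [its inputs' outputs; its baselines' outputs].
    chunks = gathered.chunk(num_devices)
    inputs_out = torch.cat([c[: c.shape[0] // 2] for c in chunks])
    baselines_out = torch.cat([c[c.shape[0] // 2:] for c in chunks])
    return torch.cat([inputs_out, baselines_out])

# Example: 2 devices, 2 examples each; input outputs are coded 0-3,
# baseline outputs 100-103.
gathered = torch.tensor([0, 1, 100, 101, 2, 3, 102, 103])
print(reorder_gathered_output(gathered, num_devices=2))
# tensor([  0,   1,   2,   3, 100, 101, 102, 103])
```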
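A sketch of the dimension handling described in fix 3 (illustrative only, assuming a (batch, channel, *spatial) layout for >= 3-dimensional layer gradients):

```python
import torch

def gradcam_layer_grads(layer_grads):
    if layer_grads.dim() >= 3:
        # (batch, channel, *spatial): average the gradient over the
        # spatial dimensions, one weight per example and channel.
        return layer_grads.mean(dim=tuple(range(2, layer_grads.dim())), keepdim=True)
    # (batch, feature): no spatial dims, so use the gradients directly
    # rather than collapsing them to a single per-example mean.
    return layer_grads

print(gradcam_layer_grads(torch.rand(2, 4, 8, 8)).shape)  # torch.Size([2, 4, 1, 1])
print(gradcam_layer_grads(torch.rand(2, 4)).shape)        # torch.Size([2, 4])
```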
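Finally, a small self-contained example of why fix 4 matters when pairing each input example with multiple baselines, as DeepLiftShap does:

```python
import torch

args = torch.tensor([[1, 2], [3, 4]])  # one row of extra args per input example
n_baselines = 3

# .repeat tiles the whole batch: rows come out as 1,2,1,2,1,2, so row i
# of the result is no longer the args for the i-th expanded example.
print(args.repeat(n_baselines, 1))
# tensor([[1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4]])

# .repeat_interleave repeats each row consecutively: 1,1,1,2,2,2,
# matching how each example is paired with every baseline.
print(args.repeat_interleave(n_baselines, dim=0))
# tensor([[1, 2], [1, 2], [1, 2], [3, 4], [3, 4], [3, 4]])
```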