Temporarily remove the attention model and fix pytorch_struct model. #558
Conversation
Disable JIT because it currently raises an exception.
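The exception itself is not reproduced here. As a rough illustration only, below is a minimal sketch of how a benchmark wrapper might gate JIT off while scripting is broken; the class name, constructor arguments, and the stand-in module are assumptions, not the PR's actual code.

```python
import torch

class Model:
    """Hypothetical benchmark wrapper; gating JIT behind a flag is the only point here."""

    def __init__(self, device="cpu", jit=False):
        self.device = device
        # Stand-in for the real pytorch_struct model.
        self.module = torch.nn.Linear(8, 8).to(device)
        if jit:
            # Scripting the real model currently raises an exception, so refuse the
            # JIT configuration explicitly instead of silently running in eager mode.
            raise NotImplementedError("JIT is temporarily disabled for this model")
```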
Got to make it work.
pytorch_struct's GPU utilization isn't great with all those tiny kernel dispatches; it might be a good place to try CUDA graphs.
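As a rough illustration of that suggestion (not code from this PR), the sketch below captures one forward pass into a CUDA graph with torch.cuda.graph and replays it; the linear layer and tensor shapes are placeholders for the real pytorch_struct step.

```python
import torch

assert torch.cuda.is_available()

# Placeholder model and static buffers; a real capture would wrap one pytorch_struct step.
model = torch.nn.Linear(64, 64).cuda()
static_input = torch.randn(128, 64, device="cuda")

# Warm up on a side stream before capture, as the CUDA graphs docs recommend.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    for _ in range(3):
        model(static_input)
torch.cuda.current_stream().wait_stream(s)

# Capture: the small kernel launches of one iteration are recorded into a single graph.
g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
    static_output = model(static_input)

# Replay: copy new data into the static buffer and relaunch the whole graph at once,
# which is where the launch-overhead savings for many tiny dispatches come from.
static_input.copy_(torch.randn(128, 64, device="cuda"))
g.replay()
```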
@xuzhao9 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Thanks for fixing this
torchtext is removing the legacy dataset utilities (pytorch/text#1437), so we either need to migrate to the new dataset API or keep the old API and copy the related code here. This PR still uses the old API because migrating to the new API seems non-trivial.
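For context only, here is a small, self-contained sketch of the kind of code the new-style API requires: vocabulary construction and padding become the caller's job instead of being handled by the legacy Field/BucketIterator classes. The toy corpus and tokenization are assumptions, not the benchmark's actual data pipeline.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader
from torchtext.vocab import build_vocab_from_iterator

# Toy corpus standing in for the real dataset.
raw_train = ["the dog barks", "a cat sleeps on the mat"]

def yield_tokens(lines):
    # Tokenization is now explicit; the legacy Field used to do this internally.
    for line in lines:
        yield line.split()

vocab = build_vocab_from_iterator(yield_tokens(raw_train), specials=["<unk>", "<pad>"])
vocab.set_default_index(vocab["<unk>"])

def collate(batch):
    # Numericalize and pad each batch by hand, replacing BucketIterator.
    ids = [torch.tensor(vocab(line.split()), dtype=torch.long) for line in batch]
    return pad_sequence(ids, batch_first=True, padding_value=vocab["<pad>"])

loader = DataLoader(raw_train, batch_size=2, collate_fn=collate)
batch = next(iter(loader))  # LongTensor of shape (batch, max_len)
```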
I will re-add the attention model in a follow-up PR (and do the quality analysis there).
Also, the pytorch_struct model runs an unsupervised learning task, so it does not support the eval test.
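A minimal, hypothetical sketch of what "does not support the eval test" might look like at the wrapper level; the class shape and method names are assumptions about the benchmark harness, not its actual code.

```python
class Model:
    def train(self, niter=1):
        for _ in range(niter):
            pass  # one unsupervised training step would go here

    def eval(self, niter=1):
        # There is no separate supervised inference path for this task, so the
        # eval benchmark is rejected explicitly rather than silently skipped.
        raise NotImplementedError("pytorch_struct is unsupervised; eval is not supported")
```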
Batch size analysis
Non-idleness analysis (train, bs=128)
The GPU is mostly idle at bs=32, so I am testing with bs=128 instead.
Data is already prefetched to the device.
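A minimal sketch of the prefetching idea, with random tensors standing in for the real torchtext batches; the names, shapes, and the dummy training step are assumptions.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
batch_size = 128  # bs=128 keeps the GPU busier than bs=32, per the analysis above

# Prefetch: build every batch and move it to the device before timing starts, so the
# measured train loop sees no host-to-device copies.
batches = [torch.randint(0, 1000, (batch_size, 20), device=device) for _ in range(10)]

def train(niter=1):
    for _ in range(niter):
        for words in batches:
            _ = words.float().mean()  # placeholder for one real training step
```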