Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Temporarily remove the attention model and fix pytorch_struct model. #558

Closed
wants to merge 15 commits into from

Conversation

xuzhao9
Copy link
Contributor

@xuzhao9 xuzhao9 commented Nov 11, 2021

torchtext removes the legacy dataset utilities(pytorch/text#1437), therefore we need to migrate to the new dataset API or keep the old API but copy the related code here. This PR we still use the old API because it seems non-trivial to migrate to the new API.

I will re-add the attention model in a follow-up PR (and do the quality analysis there).

Also, pytorch_struct model runs an unsupervised learning task, therefore, it does not support eval test.

batch size analysis

Batch Size GPU Time CPU Dispatch Time Walltime GPU Delta
16 53.06 52.969 53.072 -
32 85.123 75.824 85.126 0.6042781757
64 157.218 121.155 157.23 0.8469508828
128 315.678 242.213 315.681 1.007899859
256 568.102 428.017 568.098 0.7996249343

Non-idleness analysis (train, bs=128)

image

GPU is mostly idle when bs=32, so I am testing with bs=128 instead.
Data is already prefetched to the device.

@xuzhao9 xuzhao9 changed the title Remove the attention model and fix pytorch_struct model. [WIP] Remove the attention model and fix pytorch_struct model. Nov 11, 2021
@xuzhao9 xuzhao9 changed the title [WIP] Remove the attention model and fix pytorch_struct model. [WIP] Temporarily remove the attention model and fix pytorch_struct model. Nov 11, 2021
@xuzhao9 xuzhao9 changed the title [WIP] Temporarily remove the attention model and fix pytorch_struct model. Temporarily remove the attention model and fix pytorch_struct model. Nov 12, 2021
@xuzhao9 xuzhao9 requested a review from eircfb November 12, 2021 03:08
@xuzhao9
Copy link
Contributor Author

xuzhao9 commented Nov 12, 2021

Disable JIT because of the following exception:

torch.jit.frontend.UnsupportedNodeError: function definitions aren't supported:
  File "/home/circleci/project/torchbenchmark/models/pytorch_struct/networks/NeuralCFG.py", line 46
        T, NT = self.T, self.NT
    
        def terms(words):
        ~~~ <--- HERE
            b, n = input.shape[:2]
            term_prob = (

Copy link
Contributor

@eircfb eircfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got to make it work.

pytorch_struct utilization isn't great, all those tiny dispatches - might be a good place to try cudagraphs.

@facebook-github-bot
Copy link
Contributor

@xuzhao9 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Member

@aaronenyeshi aaronenyeshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this

@facebook-github-bot
Copy link
Contributor

@xuzhao9 merged this pull request in e71b8d0.

@xuzhao9 xuzhao9 deleted the xz9/fix-broken-main branch November 12, 2021 23:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants