This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Fix starspace #5003

Merged 2 commits into main from fix_starspace on Apr 11, 2023

Conversation

klshuster
Contributor

Patch description
There is an in-place Variable error when training the current starspace models. This stems from a known issue with using max_norm in nn.Embedding: pytorch/pytorch#26596

I have been unable to track down the root cause (it appears to be related to accessing the embedding weights directly), but simply removing max_norm allows us to pass tests.
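As a minimal sketch of the underlying PyTorch issue (not ParlAI's actual code; module sizes are illustrative): with max_norm set, Embedding.forward renormalizes the weight tensor in place, so if the weight also appears elsewhere in the autograd graph, backward fails. The clone workaround below is the one recommended in the PyTorch nn.Embedding documentation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# With max_norm set, Embedding.forward renormalizes the weight tensor
# IN PLACE (via torch.embedding_renorm_), bumping its version counter.
emb = nn.Embedding(10, 4, max_norm=1.0)
idx = torch.tensor([1, 2])

penalty = (emb.weight ** 2).sum()  # pow saves emb.weight for backward
out = emb(idx)                     # in-place renorm modifies the saved weight
try:
    (penalty + out.sum()).backward()
    failed = False
except RuntimeError:
    # "one of the variables needed for gradient computation has been
    #  modified by an inplace operation"
    failed = True

# Workaround from the PyTorch docs: clone the weight before reusing it
# in the graph, so the in-place renorm cannot invalidate the saved tensor.
emb2 = nn.Embedding(10, 4, max_norm=1.0)
penalty2 = (emb2.weight.clone() ** 2).sum()
out2 = emb2(idx)
(penalty2 + out2.sum()).backward()  # succeeds
```

This is why the error surfaces only when the embedding weights are touched outside the plain lookup, which matches the observation below about candidate vectors.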

Testing steps
To identify that it was an issue with the embedding, I wrapped the forward call with

with torch.autograd.set_detect_anomaly(True):

which produces this traceback:

  File "/private/home/kshuster/ParlAI/parlai/agents/starspace/starspace.py", line 415, in predict
    xe, ye = self.model(xs, ys, negs)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/private/home/kshuster/ParlAI/parlai/agents/starspace/modules.py", line 54, in forward
    c_emb = self.encoder2(c)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/private/home/kshuster/ParlAI/parlai/agents/starspace/modules.py", line 75, in forward
    xs_emb = self.lt(xs)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/modules/sparse.py", line 160, in forward
    return F.embedding(
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/fx/traceback.py", line 57, in format_stack
    return traceback.format_stack()
 (Triggered internally at /opt/conda/conda-bld/pytorch_1666643016022/work/torch/csrc/autograd/python_anomaly_mode.cpp:114.)

You can see that the error originates from xs_emb = self.lt(xs)

Interestingly, the error seems to arise only when embedding the candidate vectors, NOT the input, so there might be something else going on there too.

Anyway, tests pass after this, I believe

cc @jaseweston for whether max_norm is required
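If the norm constraint does turn out to be required, a sketch of an alternative (assumed, not part of this PR; dimensions are illustrative) is to drop max_norm from the module and apply the renorm manually outside the autograd graph, e.g. after each optimizer step:

```python
import torch
import torch.nn as nn

# Build the lookup table without max_norm, as this PR does, so that
# forward no longer renormalizes the weight in place.
lt = nn.Embedding(num_embeddings=100, embedding_dim=8)  # no max_norm

# Hypothetical replacement for max_norm=1.0: clamp each row's L2 norm
# to 1.0 with torch.renorm, under no_grad so autograd never sees it.
with torch.no_grad():
    lt.weight.copy_(torch.renorm(lt.weight, p=2, dim=0, maxnorm=1.0))
```

Since the renorm happens under torch.no_grad(), it cannot trigger the in-place Variable error during backward.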

@mojtaba-komeili (Contributor) left a comment:


Let's merge it for now to pass the tests.

@mojtaba-komeili merged commit 49ecfa9 into main on Apr 11, 2023
@mojtaba-komeili deleted the fix_starspace branch on Apr 11, 2023 at 15:16