This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Fix starspace #5003

Merged 2 commits into main from fix_starspace on Apr 11, 2023

Conversation

klshuster
Contributor

Patch description
There is an in-place Variable error when training the current starspace models. This stems from a known issue with using max_norm in nn.Embedding: pytorch/pytorch#26596

I have been unable to track down the root cause (it appears to be related to accessing the embedding weights directly), but simply removing max_norm allows us to pass tests.
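As a minimal sketch of the underlying PyTorch issue (not ParlAI's actual code; module sizes are illustrative): with max_norm set, Embedding.forward renormalizes the weight tensor in place, so if the weight also appears elsewhere in the autograd graph, backward fails. The clone workaround below is the one recommended in the PyTorch nn.Embedding documentation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# With max_norm set, Embedding.forward renormalizes the weight tensor
# IN PLACE (via torch.embedding_renorm_), bumping its version counter.
emb = nn.Embedding(10, 4, max_norm=1.0)
idx = torch.tensor([1, 2])

penalty = (emb.weight ** 2).sum()  # pow saves emb.weight for backward
out = emb(idx)                     # in-place renorm modifies the saved weight
try:
    (penalty + out.sum()).backward()
    failed = False
except RuntimeError:
    # "one of the variables needed for gradient computation has been
    #  modified by an inplace operation"
    failed = True

# Workaround from the PyTorch docs: clone the weight before reusing it
# in the graph, so the in-place renorm cannot invalidate the saved tensor.
emb2 = nn.Embedding(10, 4, max_norm=1.0)
penalty2 = (emb2.weight.clone() ** 2).sum()
out2 = emb2(idx)
(penalty2 + out2.sum()).backward()  # succeeds
```

This is why the error surfaces only when the embedding weights are touched outside the plain lookup, which matches the observation below about candidate vectors.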

Testing steps
To identify that it was an issue with the embedding, I wrapped the forward call with

with torch.autograd.set_detect_anomaly(True):

which produces this traceback:

  File "/private/home/kshuster/ParlAI/parlai/agents/starspace/starspace.py", line 415, in predict
    xe, ye = self.model(xs, ys, negs)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/private/home/kshuster/ParlAI/parlai/agents/starspace/modules.py", line 54, in forward
    c_emb = self.encoder2(c)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/private/home/kshuster/ParlAI/parlai/agents/starspace/modules.py", line 75, in forward
    xs_emb = self.lt(xs)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/modules/sparse.py", line 160, in forward
    return F.embedding(
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/fx/traceback.py", line 57, in format_stack
    return traceback.format_stack()
 (Triggered internally at /opt/conda/conda-bld/pytorch_1666643016022/work/torch/csrc/autograd/python_anomaly_mode.cpp:114.)

You can see that the error originates from xs_emb = self.lt(xs)

Interestingly, the error seems to arise only when embedding the candidate vectors, NOT the input, so there might be something else going on there too.

Anyway, tests pass after this, I believe

cc @jaseweston for whether max_norm is required
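If the norm constraint does turn out to be required, a sketch of an alternative (assumed, not part of this PR; dimensions are illustrative) is to drop max_norm from the module and apply the renorm manually outside the autograd graph, e.g. after each optimizer step:

```python
import torch
import torch.nn as nn

# Build the lookup table without max_norm, as this PR does, so that
# forward no longer renormalizes the weight in place.
lt = nn.Embedding(num_embeddings=100, embedding_dim=8)  # no max_norm

# Hypothetical replacement for max_norm=1.0: clamp each row's L2 norm
# to 1.0 with torch.renorm, under no_grad so autograd never sees it.
with torch.no_grad():
    lt.weight.copy_(torch.renorm(lt.weight, p=2, dim=0, maxnorm=1.0))
```

Since the renorm happens under torch.no_grad(), it cannot trigger the in-place Variable error during backward.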

@mojtaba-komeili (Contributor) left a comment:


Let's merge it for now to pass the tests.

@mojtaba-komeili merged commit 49ecfa9 into main on Apr 11, 2023
@mojtaba-komeili deleted the fix_starspace branch on Apr 11, 2023 at 15:16