
Model crashes under pytorch 0.4 #39

Closed
zou3519 opened this issue Apr 20, 2018 · 5 comments



zou3519 commented Apr 20, 2018

Hi,
The folks over at pytorch are working on cutting a new 0.4 release. We'd like to make the transition as smooth as possible (if you were planning on upgrading), so we've been testing a number of community repos.

I ran a model and it errored out due to a change in PyTorch. Minimal repro:

# Install pytorch-nightly (Currently our pre-release branch)
conda install pytorch-nightly -c pytorch

# Get data
./getdata.sh

# Run model
python main.py --batch_size 20 --data data/penn --dropouti 0.4 --dropouth 0.25 --seed 141 --epoch 1 && \
python -u main.py --model QRNN --batch_size 20 --clip 0.2 --wdrop 0.1 --nhid 1550 --nlayers 4 --emsize 400 --dropouth 0.3 --seed 9001 --dropouti 0.4 --epochs 1

Stack trace: https://gist.github.com/zou3519/142d48df1c03db9fe9c11717ad9a59f2

PyTorch 0.4 adds zero-dimensional tensors, which cannot be iterated over; that seems to be what the error is complaining about. Changing repackage_hidden, in particular the line

return tuple(repackage_hidden(v) for v in h)

so that it handles this case should fix it.
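For reference, a minimal illustration of the 0.4 behaviour in question (not code from this repo): a zero-dimensional tensor reports dim() == 0 and raises a TypeError if you try to iterate over it, which is what the tuple(...) recursion ends up doing once it reaches the innermost elements.

import torch

h = torch.tensor(1.0)   # zero-dimensional tensor, new in PyTorch 0.4
print(h.dim())          # prints 0
for v in h:             # raises TypeError: iteration over a 0-d tensor
    pass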

cc @soumith


guillaume-chevalier commented May 4, 2018

@zou3519 @Smerity I get this error after attempting to correct the bug:

ubuntu@ip-172-31-72-29:~/awd-lstm-lm$ python3 -u main.py --epochs 500 --data data/wikitext-2 --clip 0.25 --dropouti 0.4 --dropouth 0.2 --nhid 1550 --nlayers 4 --seed 4002 --model QRNN --wdrop 0.1 --batch_size 40 --save WT2.pt
Loading cached dataset...
Applying weight drop of 0.1 to weight
Applying weight drop of 0.1 to weight
Applying weight drop of 0.1 to weight
Applying weight drop of 0.1 to weight
[QRNNLayer(
  (linear): WeightDrop(
    (module): Linear(in_features=800, out_features=4650, bias=True)
  )
), QRNNLayer(
  (linear): WeightDrop(
    (module): Linear(in_features=1550, out_features=4650, bias=True)
  )
), QRNNLayer(
  (linear): WeightDrop(
    (module): Linear(in_features=1550, out_features=4650, bias=True)
  )
), QRNNLayer(
  (linear): WeightDrop(
    (module): Linear(in_features=1550, out_features=1200, bias=True)
  )
)]
Using []
Args: Namespace(alpha=2, batch_size=40, beta=1, bptt=70, clip=0.25, cuda=True, data='data/wikitext-2', dropout=0.4, dropoute=0.1, dropouth=0.2, dropouti=0.4, emsize=400, epochs=500, log_interval=200, lr=30, model='QRNN', nhid=1550, nlayers=4, nonmono=5, optimizer='sgd', resume='', save='WT2.pt', seed=4002, tied=True, wdecay=1.2e-06, wdrop=0.1, when=[-1])
Model total parameters: 33354628
Traceback (most recent call last):
  File "main.py", line 241, in <module>
    train()
  File "main.py", line 197, in train
    output, hidden, rnn_hs, dropped_rnn_hs = model(data, hidden, return_h=True)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/awd-lstm-lm/model.py", line 70, in forward
    emb = embedded_dropout(self.encoder, input, dropout=self.dropoute if self.training else 0)
  File "/home/ubuntu/awd-lstm-lm/embed_regularize.py", line 19, in embedded_dropout
    X = embed._backend.Embedding.apply(words, masked_embed_weight,
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/backends/backend.py", line 10, in __getattr__
    raise NotImplementedError
NotImplementedError

Here is how I attempted to fix the broken method:

def repackage_hidden(h):
    """Wraps hidden states in new Variables, to detach them from their history."""
    if type(h) == Variable or (type(h) == Tensor and len(h.size()) == 0):
        return Variable(h.data)
    else:
        return tuple(repackage_hidden(v) for v in h)

Note that I tried every dimension check: len(h.size()) == 0, len(h.size()) == 1, and len(h.size()) == 2. Any idea what's going on?

P.S. you can reuse/modify/license my pasted code above without any restrictions whatsoever.


shawntan commented May 5, 2018

Some fixes are in #43.

This is how I fixed the issue with repackage_hidden:

def repackage_hidden(h):
    """Wraps hidden states in new Tensors,
    to detach them from their history."""
    if isinstance(h, torch.Tensor):
        return h.detach()
    else:
        return tuple(repackage_hidden(v) for v in h)
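
A quick sanity check of the detach-based version (hypothetical shapes, just to show it handles both a bare tensor and an (h, c) tuple like an LSTM returns):

import torch

h = torch.zeros(2, 20, 400, requires_grad=True)
state = (h, h.clone())                        # (hidden, cell) pair
print(repackage_hidden(h).requires_grad)      # False: detached from the graph
print([t.requires_grad for t in repackage_hidden(state)])  # [False, False]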

For the issue in embed_regularize.py:
Replace:

    X = embed._backend.Embedding.apply(words, masked_embed_weight,
                                       padding_idx, embed.max_norm, embed.norm_type,
                                       embed.scale_grad_by_freq, embed.sparse
                                       )

with:

    X = F.embedding(
        words, masked_embed_weight,
        padding_idx,
        embed.max_norm, embed.norm_type,
        embed.scale_grad_by_freq, embed.sparse
    )
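
One assumption worth stating: F above is torch.nn.functional, so embed_regularize.py needs that import if it is not already there. The positional arguments of F.embedding line up with the old _backend call:

# F.embedding(input, weight, padding_idx=None, max_norm=None, norm_type=2,
#             scale_grad_by_freq=False, sparse=False)
import torch.nn.functional as F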

@puttkraidej

It works, thanks @shawntan

@keskarnitish
Contributor

Thanks for this! Let me look at this carefully and merge it once I run some tests.

@keskarnitish
Contributor

Thanks everyone for bringing this to our attention, and to @shawntan for proposing fixes. This should be fixed in 441e122. Closing this issue now; please feel free to reopen as necessary.
