Multi-GPU autograd error with PyTorch 0.4 #7092

Closed
@erogol

After updating to PyTorch 0.4, I get the following error when I try to train my model (https://github.com/mozilla/TTS) on multiple GPUs. Unfortunately I have no idea what it means. Is it a bug, or a problem on my side that I need some feedback on? Thanks.

Traceback (most recent call last):
  File "train.py", line 403, in <module>
    main(args)
  File "train.py", line 393, in main
    model, criterion, train_loader, optimizer, epoch)
  File "train.py", line 111, in train
    model.forward(text_input, mel_spec)
  File "/home/erogol/miniconda3/envs/pytorch4/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 114, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/erogol/miniconda3/envs/pytorch4/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 124, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/erogol/miniconda3/envs/pytorch4/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 65, in parallel_apply
    raise output
  File "/home/erogol/miniconda3/envs/pytorch4/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 41, in _worker
    output = module(*input, **kwargs)
  File "/home/erogol/miniconda3/envs/pytorch4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/erogol/projects/TTS/models/tacotron.py", line 28, in forward
    encoder_outputs = self.encoder(inputs)
  File "/home/erogol/miniconda3/envs/pytorch4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/erogol/projects/TTS/layers/tacotron.py", line 205, in forward
    return self.cbhg(inputs)
  File "/home/erogol/miniconda3/envs/pytorch4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/erogol/projects/TTS/layers/tacotron.py", line 183, in forward
    outputs, _ = self.gru(x)
  File "/home/erogol/miniconda3/envs/pytorch4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/erogol/miniconda3/envs/pytorch4/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 192, in forward
    output, hidden = func(input, self.all_weights, hx, batch_sizes)
  File "/home/erogol/miniconda3/envs/pytorch4/lib/python3.6/site-packages/torch/nn/_functions/rnn.py", line 323, in forward
    return func(input, *fargs, **fkwargs)
  File "/home/erogol/miniconda3/envs/pytorch4/lib/python3.6/site-packages/torch/nn/_functions/rnn.py", line 287, in forward
    dropout_ts)
RuntimeError: torch/csrc/autograd/variable.cpp:115: get_grad_fn: Assertion `output_nr == 0` failed.
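
To help narrow this down, here is a minimal sketch of the call path shown in the traceback: a module whose `forward` calls an `nn.GRU`, wrapped in `nn.DataParallel`. It is not the TTS code. `TinyEncoder`, `run_once`, the layer sizes, and the bidirectional setting are all assumptions made for illustration, and it is not confirmed that this snippet triggers the same assertion. It only reproduces the shape of the setup so it can be run under PyTorch 0.4 with more than one GPU visible.

```python
import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for the encoder in the traceback (not the TTS model)."""

    def __init__(self, in_dim=16, hidden=8):
        super().__init__()
        # Assumption: a bidirectional, batch-first GRU; the real layer config may differ.
        self.gru = nn.GRU(in_dim, hidden, batch_first=True, bidirectional=True)

    def forward(self, x):
        # Same call shape as `outputs, _ = self.gru(x)` in the traceback.
        outputs, _ = self.gru(x)
        return outputs


def run_once(batch=4, steps=5, in_dim=16, hidden=8):
    """Run one forward/backward pass through a DataParallel-wrapped GRU module."""
    model = TinyEncoder(in_dim, hidden)
    x = torch.randn(batch, steps, in_dim)
    if torch.cuda.is_available():
        model = model.cuda()
        x = x.cuda()
    # With no GPU visible, DataParallel simply calls the wrapped module.
    model = nn.DataParallel(model)
    out = model(x)
    out.sum().backward()
    return tuple(out.shape)


if __name__ == "__main__":
    print(torch.__version__, run_once())
```

If this runs cleanly on several GPUs while the full model fails, the difference between the two (for example dropout inside the GRU, or how the model is invoked) would be the next thing to look at.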
