[TPU Spawn] RuntimeError: Cannot re-initialize CUDA in forked subprocess. #7088

Closed
kaushikb11 opened this issue Apr 18, 2021 · 0 comments · Fixed by #7074
Assignees: kaushikb11
Labels: accelerator: tpu (Tensor Processing Unit), bug (Something isn't working), help wanted (Open to be worked on), priority: 0 (High priority task)


🐛 Bug

  File "/home/kaushikbokka/pytorch-lightning/pytorch_lightning/core/hooks.py", line 685, in transfer_batch_to_device
    return move_data_to_device(batch, device)
  File "/home/kaushikbokka/pytorch-lightning/pytorch_lightning/utilities/apply_func.py", line 161, in move_data_to_device
    return apply_to_collection(batch, dtype=dtype, function=batch_to)
  File "/home/kaushikbokka/pytorch-lightning/pytorch_lightning/utilities/apply_func.py", line 84, in apply_to_collection
    return function(data, *args, **kwargs)
  File "/home/kaushikbokka/pytorch-lightning/pytorch_lightning/utilities/apply_func.py", line 158, in batch_to
    return data.to(device, **kwargs)
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
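
For context, the error itself is PyTorch's fork-safety check: once the parent process has initialized CUDA, any CUDA call in a child created with the fork start method fails with exactly this message. A minimal, self-contained sketch (not the Lightning code path, and assuming a CUDA-capable Linux machine) that reproduces the same RuntimeError:

import torch
import torch.multiprocessing as mp

def child(batch):
    # The forked child inherits the parent's CUDA state and cannot
    # re-initialize it, so this .to() raises the RuntimeError above.
    batch.to(torch.device("cuda"))

if __name__ == "__main__":
    torch.zeros(1, device="cuda")  # parent initializes CUDA before forking
    ctx = mp.get_context("fork")
    p = ctx.Process(target=child, args=(torch.zeros(1),))
    p.start()
    p.join()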

Environment

Note: Bugs with code are solved faster! If you share a Colab notebook, please make it public.

You can get the script and run it with:

wget https://raw.githubusercontent.com/PyTorchLightning/pytorch-lightning/master/tests/collect_env_details.py
# For security purposes, please check the contents of collect_env_details.py before running it.
python collect_env_details.py
  • PyTorch Version (e.g., 1.0):
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, source):
  • Build command you used (if compiling from source):
  • Python version:
  • CUDA/cuDNN version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context
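
For context (not part of the original report): the generic remedy pointed at by the error message is to create worker processes with the "spawn" start method instead of "fork", so that each child starts with an uninitialized CUDA context. A rough sketch of that remedy, separate from the actual TPU spawn fix that landed in #7074:

import torch
import torch.multiprocessing as mp

def worker(rank, batch):
    # In a spawned (not forked) child, CUDA can initialize cleanly.
    batch.to(torch.device("cuda"))

if __name__ == "__main__":
    torch.zeros(1, device="cuda")  # CUDA already initialized in the parent
    # torch.multiprocessing.spawn uses the "spawn" start method by default,
    # so the child does not inherit the parent's CUDA state.
    mp.spawn(worker, args=(torch.zeros(1),), nprocs=1)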
