Multi-gpu on a single node #61

Open
Arrebol2020 opened this issue Jul 20, 2021 · 4 comments

Comments

@Arrebol2020

Hello, I can successfully run 'Single gpu on a single node', but when I try 'Multi-gpu on a single node', I get the following error:

Traceback (most recent call last):
  File "/home/sevati/anaconda3/envs/cosypose/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/sevati/anaconda3/envs/cosypose/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/sevati/anaconda3/envs/cosypose/lib/python3.7/site-packages/cosypose-1.0.0-py3.7-linux-x86_64.egg/cosypose/scripts/run_cosypose_eval.py", line 16, in <module>
    from cosypose.config import EXP_DIR, MEMORY, RESULTS_DIR, LOCAL_DATA_DIR
  File "/home/sevati/anaconda3/envs/cosypose/lib/python3.7/site-packages/cosypose-1.0.0-py3.7-linux-x86_64.egg/cosypose/config.py", line 33, in <module>
    assert LOCAL_DATA_DIR.exists()
AssertionError
Setting OMP and MKL num threads to 1.

Why is LOCAL_DATA_DIR being resolved from python3.7/site-packages/cosypose-1.0.0-py3.7-linux-x86_64.egg/cosypose/config.py and not from projects/cosypose/cosypose?
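
The traceback shows that `cosypose` is being imported from an installed egg under site-packages, so `config.py` runs from that copy rather than from the source checkout. If `config.py` derives LOCAL_DATA_DIR from its own file location (an assumption here, the thread does not show that file), the path would then point inside site-packages, where no data directory exists. A minimal sketch for checking which copy of a package Python would import; `module_origin` is a hypothetical helper name, not part of cosypose:

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be imported from, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised for dotted names when the parent package is missing.
        return None
    return None if spec is None else spec.origin


# Works for any importable name; "json" is used only because it always exists.
print(module_origin("json"))
# In the environment from the traceback one would check "cosypose" instead:
# print(module_origin("cosypose"))
```

If the printed path is under site-packages rather than the project checkout, the installed copy is shadowing the source tree. Reinstalling the package in development (editable) mode is one common way to make imports resolve to the checkout.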

@Arrebol2020
Author

And now the error has changed to:

RuntimeError: NCCL error in: /pytorch/torch/lib/c10d/ProcessGroupNCCL.cpp:784, unhandled cuda error, NCCL version 2.7.8
Setting OMP and MKL num threads to 1.

@anxiaomi

@Arrebol2020 Hello, have you solved it?

@Arrebol2020
Author

> @Arrebol2020 Hello, have you solved it?

I didn't solve it, so I tried implementing DDP myself, and that seems to work.

@kochsebastian

@Arrebol2020 could you share your work?
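
The author's code was not posted in this thread. For reference, a minimal sketch of what a hand-written single-node, multi-GPU setup can look like using PyTorch's public `torch.distributed` and `DistributedDataParallel` API. This is not the author's implementation and not cosypose's training code: the `Linear` model, the port number, and the names `ddp_env`, `worker`, and `launch` are placeholders chosen for illustration. It falls back to the `gloo` backend on CPU when no GPU is visible.

```python
import os


def ddp_env(rank, world_size, master_addr="127.0.0.1", master_port=29500):
    """Variables read by torch.distributed's default env:// rendezvous."""
    return {
        "MASTER_ADDR": master_addr,
        "MASTER_PORT": str(master_port),  # placeholder port, any free port works
        "RANK": str(rank),
        "WORLD_SIZE": str(world_size),
    }


def worker(rank, world_size):
    """Runs in one process per GPU (or a single CPU process as a fallback)."""
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    os.environ.update(ddp_env(rank, world_size))
    use_cuda = torch.cuda.is_available()
    dist.init_process_group(
        backend="nccl" if use_cuda else "gloo",
        rank=rank,
        world_size=world_size,
    )
    if use_cuda:
        # Pin this process to its own GPU before creating any CUDA tensors.
        torch.cuda.set_device(rank)
        device = torch.device("cuda", rank)
    else:
        device = torch.device("cpu")

    model = torch.nn.Linear(8, 2).to(device)  # placeholder model
    ddp_model = DDP(model, device_ids=[rank] if use_cuda else None)

    # One forward/backward pass; DDP averages gradients across processes.
    out = ddp_model(torch.randn(4, 8, device=device))
    out.sum().backward()

    dist.destroy_process_group()


def launch():
    """Spawn one worker per visible GPU, or one CPU worker if there are none."""
    import torch
    import torch.multiprocessing as mp

    world_size = max(torch.cuda.device_count(), 1)
    mp.spawn(worker, args=(world_size,), nprocs=world_size)
```

Calling `launch()` from a script guarded by `if __name__ == "__main__":` starts the workers. When an NCCL error like the one above appears, setting the environment variable `NCCL_DEBUG=INFO` before launching makes NCCL print more detail about the failing call.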
