No CUDA GPUs are available #29

Open
gwyllo opened this issue Oct 17, 2024 · 2 comments

gwyllo commented Oct 17, 2024

I can successfully install all dependencies following the instructions in the repo. I have also tried the Docker image.

I get the following errors when running the run_train_infer.sh script, with both the from-scratch install and the Docker image:

(instantsplat) root@C.13193616:/InstantSplat$ bash scripts/run_train_infer.sh
========= santorini: Dust3r_coarse_geometric_initialization =========
... loading model from submodules/dust3r/checkpoints/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth
instantiating : AsymmetricCroCo3DStereo(enc_depth=24, dec_depth=12, enc_embed_dim=1024, dec_embed_dim=768, enc_num_heads=16, dec_num_heads=12, pos_embed='RoPE100', patch_embed_cls='PatchEmbedDust3R', img_size=(512, 512), head_type='dpt', output_mode='pts3d', depth_mode=('exp', -inf, inf), conf_mode=('exp', 1, inf), landscape_only=False)
<All keys matched successfully>
Traceback (most recent call last):
  File "/InstantSplat/./coarse_init_infer.py", line 53, in <module>
    model = AsymmetricCroCo3DStereo.from_pretrained(model_path).to(device)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1340, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 927, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1326, in convert
    return t.to(
           ^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/cuda/__init__.py", line 319, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available
========= santorini: Train: jointly optimize pose =========
Optimizing ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/
Output folder: ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/
Traceback (most recent call last):
  File "/InstantSplat/./train_joint.py", line 279, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args.test_iterations, args.save_iterations, args.checkpoint_iterations, args.start_checkpoint, args.debug_from, args)
  File "/InstantSplat/./train_joint.py", line 60, in training
    scene = Scene(dataset, gaussians, opt=args, shuffle=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/InstantSplat/scene/__init__.py", line 49, in __init__
    assert False, "Could not recognize scene type!"
           ^^^^^
AssertionError: Could not recognize scene type!
========= santorini: Render interpolated pose & output video =========
Looking for config file in ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/cfg_args
Config file found: ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/cfg_args
Rendering ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/
Traceback (most recent call last):
  File "/InstantSplat/./render_by_interp.py", line 143, in <module>
    render_sets(
  File "/InstantSplat/./render_by_interp.py", line 98, in render_sets
    save_interpolate_pose(dataset.model_path, iteration, args.n_views)
  File "/InstantSplat/./render_by_interp.py", line 33, in save_interpolate_pose
    org_pose = np.load(model_path + f"pose/pose_{iter}.npy")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/numpy/lib/_npyio_impl.py", line 455, in load
    fid = stack.enter_context(open(os.fspath(file), "rb"))
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: './output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/pose/pose_1000.npy'

nvcc --version suggests CUDA is installed correctly:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

and a simple script to check whether CUDA is accessible to PyTorch also works as expected within this conda environment:

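import torch

print(torch.cuda.is_available())  # whether CUDA is usable from PyTorch
print(torch.version.cuda)         # CUDA version PyTorch was built against
print(torch.cuda.device_count())  # number of GPUs visible to this process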
(instantsplat) root@C.13193616:/$ python torchCheck.py
True
12.1
1

Any idea on the underlying cause of this issue?
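
One thing a standalone check like this may miss is the environment the failing commands actually run in. As a rough sketch, the check can also print CUDA_VISIBLE_DEVICES, so the same file can be run with whatever value the launch script exports. The function name cuda_report is hypothetical, and the idea that the script sets this variable is an assumption, since the script itself is not shown here.

```python
import os


def cuda_report():
    """Collect the settings that decide which GPUs PyTorch can see."""
    # None means the variable is unset, so all GPUs are candidates.
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}
    try:
        import torch
    except ImportError:
        # Keep the report usable even where PyTorch is not installed.
        report["torch"] = None
        return report
    report["torch"] = torch.__version__
    report["cuda_build"] = torch.version.cuda
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Running it once plainly and once with the variable set on the command line (for example CUDA_VISIBLE_DEVICES=1 python torchCheck.py) should show whether the device count changes between the two environments.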

@dafnianagno

I think you have to change the GPU_ID (3rd line of the bash script) to 0, since you have only one GPU.
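
To illustrate why the index matters, here is a simplified sketch of how CUDA interprets CUDA_VISIBLE_DEVICES. It assumes the bash script passes GPU_ID to its Python commands through that variable, which is not shown in this thread. The helper visible_devices is hypothetical and handles only integer indices, not GPU UUIDs.

```python
def visible_devices(env_value, physical_count):
    """Approximate which physical GPU indices a CUDA process can see.

    env_value is the CUDA_VISIBLE_DEVICES string, or None when unset.
    Only integer indices are handled; GPU UUIDs are out of scope here.
    """
    if env_value is None:
        return list(range(physical_count))
    visible = []
    for token in env_value.split(","):
        token = token.strip()
        if not token.isdigit() or int(token) >= physical_count:
            break  # an invalid entry hides itself and everything after it
        visible.append(int(token))
    return visible


print(visible_devices("1", 1))  # GPU_ID=1 on a single-GPU machine
print(visible_devices("0", 1))  # GPU_ID=0 on the same machine
```

On a machine with one GPU, index 1 does not exist, so the first call yields an empty list, which would match "No CUDA GPUs are available". Index 0 yields [0].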

@smart4654154

Hi, have you solved this issue? I have run into the same problem.
