No CUDA GPUs are available #29

gwyllo · 2024-10-17T00:04:25Z

I can install all dependencies successfully by following the instructions in the repo. I have also tried the Docker image.

I get the following error when running the run_train_infer.sh script, with either the from-scratch install or the Docker image:

(instantsplat) root@C.13193616:/InstantSplat$ bash scripts/run_train_infer.sh
========= santorini: Dust3r_coarse_geometric_initialization =========
... loading model from submodules/dust3r/checkpoints/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth
instantiating : AsymmetricCroCo3DStereo(enc_depth=24, dec_depth=12, enc_embed_dim=1024, dec_embed_dim=768, enc_num_heads=16, dec_num_heads=12, pos_embed='RoPE100', patch_embed_cls='PatchEmbedDust3R', img_size=(512, 512), head_type='dpt', output_mode='pts3d', depth_mode=('exp', -inf, inf), conf_mode=('exp', 1, inf), landscape_only=False)
<All keys matched successfully>
Traceback (most recent call last):
  File "/InstantSplat/./coarse_init_infer.py", line 53, in <module>
    model = AsymmetricCroCo3DStereo.from_pretrained(model_path).to(device)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1340, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 927, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1326, in convert
    return t.to(
           ^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/cuda/__init__.py", line 319, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available
========= santorini: Train: jointly optimize pose =========
Optimizing ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/
Output folder: ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/
Traceback (most recent call last):
  File "/InstantSplat/./train_joint.py", line 279, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args.test_iterations, args.save_iterations, args.checkpoint_iterations, args.start_checkpoint, args.debug_from, args)
  File "/InstantSplat/./train_joint.py", line 60, in training
    scene = Scene(dataset, gaussians, opt=args, shuffle=True)                                                                      
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/InstantSplat/scene/__init__.py", line 49, in __init__
    assert False, "Could not recognize scene type!"
           ^^^^^
AssertionError: Could not recognize scene type!
========= santorini: Render interpolated pose & output video =========
Looking for config file in ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/cfg_args
Config file found: ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/cfg_args
Rendering ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/
Traceback (most recent call last):
  File "/InstantSplat/./render_by_interp.py", line 143, in <module>
    render_sets(
  File "/InstantSplat/./render_by_interp.py", line 98, in render_sets
    save_interpolate_pose(dataset.model_path, iteration, args.n_views)
  File "/InstantSplat/./render_by_interp.py", line 33, in save_interpolate_pose
    org_pose = np.load(model_path + f"pose/pose_{iter}.npy")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/numpy/lib/_npyio_impl.py", line 455, in load
    fid = stack.enter_context(open(os.fspath(file), "rb"))
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: './output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/pose/pose_1000.npy'

nvcc --version suggests CUDA is installed correctly:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

A simple script to check whether CUDA is accessible to PyTorch also works as expected within this conda environment:

import torch
print(torch.cuda.is_available())   # True if PyTorch can initialize CUDA
print(torch.version.cuda)          # CUDA version PyTorch was built against
print(torch.cuda.device_count())   # number of GPUs visible to this process

(instantsplat) root@C.13193616:/$ python torchCheck.py
True
12.1
1
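
One thing this check does not capture is the environment the launch script runs in. A minimal sketch of an extended check follows, assuming (not confirmed here) that run_train_infer.sh restricts GPUs through the CUDA_VISIBLE_DEVICES environment variable. The function name cuda_report is hypothetical; the torch calls are the same ones used in the script above.

```python
import os


def cuda_report():
    """Collect the GPU-related state seen by the current process."""
    # CUDA_VISIBLE_DEVICES is None when unset, meaning no restriction applies.
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}
    try:
        import torch
    except ImportError:
        # Still report the environment variable if PyTorch is missing.
        report["torch"] = None
        return report
    report["torch"] = torch.__version__
    report["is_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Running this with the same environment variables the launch script sets, rather than from a clean shell, would show whether the two environments see the same devices.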

Any idea on the underlying cause of this issue?


dafnianagno · 2024-10-23T13:24:46Z

I think you have to change the GPU_ID (3rd line of the bash script) to 0, since you have only one GPU.
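
To illustrate why a wrong GPU_ID would produce this error, here is a rough sketch, assuming the script passes GPU_ID to the process as CUDA_VISIBLE_DEVICES (I have not verified that in the script). The helper visible_gpus is hypothetical and models only integer indices; real CUDA also accepts GPU UUIDs. It follows the documented rule that an invalid index hides itself and every entry listed after it.

```python
def visible_gpus(mask, physical_count):
    """Return the physical GPU indices a process would see.

    Simplified model of CUDA_VISIBLE_DEVICES (integer indices only).
    """
    if mask is None:
        # Variable unset: every physical GPU is visible.
        return list(range(physical_count))
    visible = []
    for token in mask.split(","):
        token = token.strip()
        # An invalid entry stops the scan; later entries are ignored too.
        if not token.isdigit() or int(token) >= physical_count:
            break
        visible.append(int(token))
    return visible


print(visible_gpus("1", 1))   # GPU_ID=1 on a one-GPU machine -> []
print(visible_gpus("0", 1))   # GPU_ID=0 on the same machine -> [0]
print(visible_gpus(None, 1))  # variable unset, as in a standalone check -> [0]
```

Under that assumption, a standalone check sees one GPU, while the same environment launched with index 1 sees none, which matches "No CUDA GPUs are available".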

smart4654154 · 2024-11-08T09:30:45Z

Hi, have you solved this issue? I have run into the same problem.
