CUDA version #5

Open
kdheejb7 opened this issue Sep 23, 2021 · 2 comments

Comments

@kdheejb7

Hello,

Can you please let me know which CUDA version you used?

In the first case, I used torch==0.4.1 and CUDA 10.0.
When I ran the command python3 scripts/train_rpn_3d.py --config=kitti_3d_base --exp_name base,
I got the following error:

  File "scripts/train_rpn_3d.py", line 324, in <module>
    main(args)
  File "scripts/train_rpn_3d.py", line 140, in main
    rpn_net, optimizer = init_training_model(conf, paths.output)
  File "/workspace/lib/core.py", line 69, in init_training_model
    network = absolute_import(dst_path)
  File "/workspace/lib/util.py", line 98, in absolute_import
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/workspace/output/base/20210923_152833/M3d_inference_align.py", line 5, in <module>
    from model.pose_dla_dcn import DLASeg, DeformConv
  File "/workspace/model/pose_dla_dcn.py", line 17, in <module>
    from .DCNv2.dcn_v2 import DCN
  File "/workspace/model/DCNv2/dcn_v2.py", line 11, in <module>
    from .dcn_v2_func import DCNv2Function
  File "/workspace/model/DCNv2/dcn_v2_func.py", line 9, in <module>
    from ._ext import dcn_v2 as _backend
  File "/workspace/model/DCNv2/_ext/dcn_v2/__init__.py", line 3, in <module>
    from ._dcn_v2 import lib as _lib, ffi as _ffi
ImportError: /workspace/model/DCNv2/_ext/dcn_v2/_dcn_v2.so: undefined symbol: __cudaPopCallConfiguration

I know this error comes from a mismatch between the CUDA and torch versions, so I thought that changing the CUDA version from 10.0 to 9.2 would solve it.
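
For anyone debugging the same mismatch, one quick check is to print the CUDA version the installed torch binary was built against and compare it with the toolkit used to compile the extensions. This is only a minimal sketch: `report_torch_cuda` is a hypothetical helper name, not part of this repository, and it returns empty values when torch cannot be imported.

```python
import importlib


def report_torch_cuda():
    """Return the torch version and the CUDA version torch was built with.

    All values are None if torch cannot be imported in this environment.
    (Hypothetical helper for illustration; not part of the repository.)
    """
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError):
        return {"torch": None, "built_with_cuda": None, "cuda_available": None}
    return {
        "torch": torch.__version__,
        # CUDA toolkit version the torch binary was compiled against
        # (None for CPU-only builds).
        "built_with_cuda": torch.version.cuda,
        # Whether torch can actually reach a GPU at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(report_torch_cuda())
```

If `built_with_cuda` differs from the toolkit whose nvcc compiled DCNv2, that would be consistent with the undefined-symbol error above, although it does not prove it is the cause.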

However, I ran into another problem with CUDA 9.2.

In the second case, I used torch==0.4.1 and CUDA 9.2.
When I ran the command python3 scripts/train_rpn_3d.py --config=kitti_3d_base --exp_name base,
I got the following error:

Traceback (most recent call last):
  File "scripts/train_rpn_3d.py", line 23, in <module>
    from lib.imdb_util import *
  File "/workspace/lib/imdb_util.py", line 24, in <module>
    from lib.rpn_util import *
  File "/workspace/lib/rpn_util.py", line 16, in <module>
    from lib.nms.gpu_nms import gpu_nms
ImportError: libcudart.so.10.0: cannot open shared object file: No such file or directory

This error occurs earlier than the line that caused the error in the first case.
So I still cannot get the training command to run.
I tried both CUDA 9.2 and CUDA 10.0, and each one fails with a different error.
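
The second error names the exact runtime library the compiled gpu_nms extension expects, which suggests it was built against CUDA 10.0 and would presumably need rebuilding after switching toolkits. As a rough sketch, the expected version can be pulled out of such a message like this; `missing_cudart_version` is a hypothetical helper for illustration, not something from this repository.

```python
import re


def missing_cudart_version(error_message):
    """Extract the CUDA runtime version from a 'libcudart.so.X.Y: ...' message.

    Returns the version string, or None if the message does not mention
    a versioned libcudart. (Hypothetical helper for illustration.)
    """
    match = re.search(r"libcudart\.so\.(\d+(?:\.\d+)*)", error_message)
    return match.group(1) if match else None


# Message copied from the second traceback in this issue.
msg = "libcudart.so.10.0: cannot open shared object file: No such file or directory"
print(missing_cudart_version(msg))  # prints 10.0
```

Comparing that version with the CUDA toolkit currently installed shows which compiled extensions are stale.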

Can you please let me know which CUDA version you used?

Thank you!

@Wasiiiii

Wasiiiii commented Oct 4, 2021

Hi kdheejb7,
Did you solve the version problem?

@123456789live

Hello, have you solved the version problem? I have a similar problem. @kdheejb7

3 participants