
export_onnx_gpu error #1429

Closed
bboyxu5928 opened this issue Sep 3, 2022 · 3 comments
@bboyxu5928

When I run the following command:

python export_onnx_gpu.py --config=$model_dir/train.yaml --checkpoint=$model_dir/final.pt --cmvn_file=$model_dir/global_cmvn --ctc_weight=0.5 --output_onnx_dir=$onnx_model_dir --fp16

it crashes with:

terminate called after throwing an instance of 'c10::Error'
what(): Tried to register multiple backend fallbacks for the same dispatch key Batched; previous registration registered at /opt/conda/conda-bld/pytorch_1634272172048/work/aten/src/ATen/BatchingRegistrations.cpp:1016, new registration registered at ../aten/src/ATen/BatchingRegistrations.cpp:1016
Exception raised from registerFallback at ../aten/src/ATen/core/dispatch/Dispatcher.cpp:267 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f5ec09014b2 in /home/yjx/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet.libs/libc10-e6e91872.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5b (0x7f5ec08fddbb in /home/yjx/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet.libs/libc10-e6e91872.so)
frame #2: c10::Dispatcher::registerFallback(c10::DispatchKey, c10::KernelFunction, std::string) + 0x958 (0x7f5ec1bc1e98 in /home/yjx/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet.libs/libtorch_cpu-a07078e3.so)
frame #3: torch::Library::_fallback(torch::CppFunction&&) & + 0x195 (0x7f5ec1bf6f65 in /home/yjx/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet.libs/libtorch_cpu-a07078e3.so)
frame #4: + 0x115eb98 (0x7f5ec1acfb98 in /home/yjx/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet.libs/libtorch_cpu-a07078e3.so)
frame #5: + 0x1164753 (0x7f5ec1ad5753 in /home/yjx/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet.libs/libtorch_cpu-a07078e3.so)
frame #6: + 0xfce93f (0x7f5ec193f93f in /home/yjx/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet.libs/libtorch_cpu-a07078e3.so)
frame #7: + 0xf9c3 (0x7f60275ab9c3 in /lib64/ld-linux-x86-64.so.2)
frame #8: + 0x1459e (0x7f60275b059e in /lib64/ld-linux-x86-64.so.2)
frame #9: + 0xf7d4 (0x7f60275ab7d4 in /lib64/ld-linux-x86-64.so.2)
frame #10: + 0x13b8b (0x7f60275afb8b in /lib64/ld-linux-x86-64.so.2)
frame #11: + 0xfab (0x7f602717cfab in /lib64/libdl.so.2)
frame #12: + 0xf7d4 (0x7f60275ab7d4 in /lib64/ld-linux-x86-64.so.2)
frame #13: + 0x15ad (0x7f602717d5ad in /lib64/libdl.so.2)
frame #14: dlopen + 0x31 (0x7f602717d041 in /lib64/libdl.so.2)

Aborted (core dumped)

How can I fix this? Thanks.

@yuekaizhang
Collaborator

I'm not sure what causes this. You may try:

  1. export PYTHONPATH=$PYTHONPATH:/your-git-clone/wenet/ and use wenet from a source checkout, rather than the wenet installed via pip
  2. Make sure CUDA is available; you could also try it in a docker image, e.g. https://github.com/wenet-e2e/wenet/blob/main/runtime/gpu/Dockerfile/Dockerfile.server
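The first suggestion can be sketched as a couple of shell commands. The checkout path is the placeholder from the comment above; adjust it to wherever you cloned the repo:

```shell
# Sketch of suggestion 1: put a source checkout of wenet on PYTHONPATH
# so Python prefers it over the pip-installed package.
# WENET_DIR is a placeholder path, not a real location.
WENET_DIR=/your-git-clone/wenet
export PYTHONPATH=$PYTHONPATH:$WENET_DIR
# Sanity check: confirm the checkout now appears on PYTHONPATH.
echo "$PYTHONPATH" | tr ':' '\n' | grep -x "$WENET_DIR"
```

If the pip-installed wenet is still present, uninstalling it as well (see the next comment) avoids any ambiguity about which copy gets imported.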

@wx5223
Copy link

wx5223 commented Nov 23, 2022

I met the same error and solved it by uninstalling wenet with pip.
Since your error message shows:
/home/yjx/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet.libs/libtorch_cpu-a07078e3.so
I guess this bundled libtorch_cpu conflicts with the GPU version and causes the error.
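To confirm this diagnosis before uninstalling, you can check where Python actually resolves the package from. This is a small hypothetical helper (the function names and the site-packages heuristic are my own, not part of wenet): a path under site-packages points at the pip install and its bundled libraries, while a path under your clone means the source checkout is being used.

```python
# Hedged sketch: locate where a package would be imported from, to spot
# a lingering pip-installed copy that may bundle a CPU-only libtorch.
import importlib.util


def install_location(module_name):
    """Return the resolved file path of a module, or None if not importable."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return None
    return spec.origin


def looks_pip_installed(module_name):
    """Heuristic: a path containing 'site-packages' suggests a pip install."""
    origin = install_location(module_name)
    return origin is not None and "site-packages" in origin


if __name__ == "__main__":
    # Both names may print None if the packages are not installed.
    for name in ("wenet", "torch"):
        print(name, "->", install_location(name))
```

If looks_pip_installed("wenet") is True while you intend to run from a git clone, pip uninstall wenet removes the conflicting copy.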

@xingchensong
Member

Fixed; closing this issue.
