
CI: Shared libraries in the Linux CUDA binary build artifact are symbolic links, causing an error when running on GPU #699

Closed · 1 of 3 tasks
aoirint opened this issue Jun 19, 2023 · 1 comment · Fixed by #700
Labels: OS 依存:linux (behavior specific to Linux), バグ (bug)

Comments

@aoirint (Member) commented on Jun 19, 2023

Description of the bug

I noticed this while working on #696, so I am filing it as an issue together with the error log (a likely cause and a plan for addressing it have already been identified).

Symptoms / log

Error log
Warning: cpu_num_threads is set to 0. ( The library leaves the decision to the synthesis runtime )
INFO:     Started server process [713]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:50021 (Press CTRL+C to quit)
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "uvicorn/protocols/http/h11_impl.py", line 373, in run_asgi
  File "uvicorn/middleware/proxy_headers.py", line 75, in __call__
  File "fastapi/applications.py", line 208, in __call__
  File "starlette/applications.py", line 112, in __call__
  File "starlette/middleware/errors.py", line 181, in __call__
  File "starlette/middleware/errors.py", line 159, in __call__
  File "starlette/middleware/base.py", line 53, in __call__
  File "anyio/_backends/_asyncio.py", line 662, in __aexit__
  File "starlette/middleware/base.py", line 30, in coro
  File "starlette/middleware/cors.py", line 84, in __call__
  File "starlette/exceptions.py", line 82, in __call__
  File "starlette/exceptions.py", line 71, in __call__
  File "starlette/routing.py", line 656, in __call__
  File "starlette/routing.py", line 259, in handle
  File "starlette/routing.py", line 61, in app
  File "fastapi/routing.py", line 226, in app
  File "fastapi/routing.py", line 161, in run_endpoint_function
  File "starlette/concurrency.py", line 39, in run_in_threadpool
  File "anyio/to_thread.py", line 31, in run_sync
  File "anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
  File "anyio/_backends/_asyncio.py", line 867, in run
  File "run.py", line 223, in audio_query
  File "voicevox_engine/synthesis_engine/synthesis_engine_base.py", line 183, in create_accent_phrases
  File "voicevox_engine/synthesis_engine/synthesis_engine_base.py", line 168, in replace_mora_data
  File "voicevox_engine/synthesis_engine/synthesis_engine.py", line 219, in replace_phoneme_length
  File "voicevox_engine/synthesis_engine/synthesis_engine.py", line 192, in initialize_speaker_synthesis
  File "voicevox_engine/synthesis_engine/core_wrapper.py", line 526, in load_model
  File "voicevox_engine/synthesis_engine/core_wrapper.py", line 536, in assert_core_success
voicevox_engine.synthesis_engine.core_wrapper.CoreError: modelデータ読み込みに失敗しました (/content/linux-nvidia/model/d0.bin): Failed to create session options: Error calling ONNX Runtime C function: /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1018 void onnxruntime::ProviderSharedLibrary::Ensure() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_shared.so with error: libonnxruntime_providers_shared.so: cannot open shared object file: No such file or directory

INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [713]

Looking at the file sizes, the Linux CUDA build is about 1.5 GB in 0.14.5, but only about 900 MB in 0.15.0-checkshellbash.0 and 0.15.0-aoirint.18.

The likely reason it does not work is that some of the bundled shared libraries are not actual files but symbolic links pointing into the CI environment (a diagnostic sketch follows the listing below):

libcublasLt.so.11 -> /home/runner/work/voicevox_engine/voicevox_engine/download/cuda/bin/libcublasLt.so.11
libcublas.so.11 -> /home/runner/work/voicevox_engine/voicevox_engine/download/cuda/bin/libcublas.so.11
libcudart.so.11.0 -> /home/runner/work/voicevox_engine/voicevox_engine/download/cuda/bin/libcudart.so.11.0
libcudnn_adv_infer.so.8 -> /home/runner/work/voicevox_engine/voicevox_engine/download/cudnn/bin/libcudnn_adv_infer.so.8
libcudnn_cnn_infer.so.8 -> /home/runner/work/voicevox_engine/voicevox_engine/download/cudnn/bin/libcudnn_cnn_infer.so.8
libcudnn_ops_infer.so.8 -> /home/runner/work/voicevox_engine/voicevox_engine/download/cudnn/bin/libcudnn_ops_infer.so.8
libcudnn.so.8 -> /home/runner/work/voicevox_engine/voicevox_engine/download/cudnn/bin/libcudnn.so.8
libcufft.so.10 -> /home/runner/work/voicevox_engine/voicevox_engine/download/cuda/bin/libcufft.so.10
libcurand.so.10 -> /home/runner/work/voicevox_engine/voicevox_engine/download/cuda/bin/libcurand.so.10
libonnxruntime_providers_cuda.so -> /home/runner/work/voicevox_engine/voicevox_engine/download/onnxruntime/lib/libonnxruntime_providers_cuda.so
libonnxruntime_providers_shared.so -> /home/runner/work/voicevox_engine/voicevox_engine/download/onnxruntime/lib/libonnxruntime_providers_shared.so
libonnxruntime_providers_tensorrt.so -> /home/runner/work/voicevox_engine/voicevox_engine/download/onnxruntime/lib/libonnxruntime_providers_tensorrt.so

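For reference, a minimal diagnostic sketch in Python (not part of the original report; the extraction directory ./linux-nvidia is only an example) that checks which of the bundled .so files are symlinks and whether their targets actually exist on the local machine:

import os
import sys
from pathlib import Path

# Hypothetical location of the extracted Linux CUDA artifact; pass the real
# path as the first argument (defaults to ./linux-nvidia as an example).
artifact_dir = Path(sys.argv[1]) if len(sys.argv) > 1 else Path("./linux-nvidia")

dangling = []
for so_path in sorted(artifact_dir.glob("*.so*")):
    if not so_path.is_symlink():
        print(f"FILE     {so_path.name} ({so_path.stat().st_size} bytes)")
        continue
    target = os.readlink(so_path)
    # A symlink whose target only exists on the CI runner
    # (/home/runner/work/...) cannot be resolved on the user's machine.
    if so_path.resolve().exists():
        print(f"SYMLINK  {so_path.name} -> {target}")
    else:
        print(f"DANGLING {so_path.name} -> {target}")
        dangling.append(so_path)

if dangling:
    print(f"{len(dangling)} shared libraries are dangling symlinks; GPU inference will fail.")

One likely direction for the fix (tracked in #700) is to dereference the symlinks when copying the libraries into the artifact directory (e.g. cp -L, or shutil.copy, which follows symlinks by default), so the archive contains the real files instead of links into the CI workspace.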
Steps to reproduce

Create a release, download the Linux CUDA binary (voicevox_engine-linux-nvidia-*.7z.*), run it with --use_gpu, and try to generate audio.

Alternatively, run the accompanying Jupyter notebook on Google Colaboratory.
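
As a concrete illustration of the last step (a hypothetical client snippet, not taken from the report): with the engine already started with --use_gpu on the default port shown in the log, a single request to the /audio_query endpoint that appears in the traceback is enough to trigger the error.

import urllib.error
import urllib.parse
import urllib.request

# Assumes the engine is running locally on the default port from the log above;
# the text and speaker values are only examples.
params = urllib.parse.urlencode({"text": "テスト", "speaker": 1})
request = urllib.request.Request(
    f"http://127.0.0.1:50021/audio_query?{params}", method="POST"
)
try:
    with urllib.request.urlopen(request) as response:
        print("audio_query succeeded:", response.status)
except urllib.error.HTTPError as error:
    # With the broken artifact this request returns HTTP 500 while the server
    # logs the CoreError about libonnxruntime_providers_shared.so shown above.
    print("audio_query failed:", error.code)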

Expected behavior

Audio is generated without errors when running on GPU.

VOICEVOX version

https://github.com/VOICEVOX/voicevox_engine/tree/1c10fe07a837d94b2886884e259b407683c0b992

OS / distribution / version

  • Windows
  • macOS
  • Linux (affected)

Other

@Hiroshiba (Member) commented

Thank you for writing such a clear issue!!
