
Error when running in Docker: multiproc_worker_utils.py:226] RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method #305

Closed
syusama opened this issue Aug 14, 2024 · 3 comments

Comments

@syusama

syusama commented Aug 14, 2024

The following items must be checked before submission

  • Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
  • I have read the FAQ section of the project documentation and searched the existing issues without finding a similar problem or solution.

Type of problem

Model inference and deployment

Operating system

Linux

Detailed description of the problem

Deploying Qwen2-7B-Instruct with docker-compose on Ubuntu fails with the following error:

(VllmWorkerProcess pid=294) ERROR 08-14 16:36:35 multiproc_worker_utils.py:226]   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 123, in init_device
(VllmWorkerProcess pid=294) ERROR 08-14 16:36:35 multiproc_worker_utils.py:226]     torch.cuda.set_device(self.device)
(VllmWorkerProcess pid=294) ERROR 08-14 16:36:35 multiproc_worker_utils.py:226]   File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 420, in set_device
(VllmWorkerProcess pid=294) ERROR 08-14 16:36:35 multiproc_worker_utils.py:226]     torch._C._cuda_setDevice(device)
(VllmWorkerProcess pid=294) ERROR 08-14 16:36:35 multiproc_worker_utils.py:226]   File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 300, in _lazy_init
(VllmWorkerProcess pid=294) ERROR 08-14 16:36:35 multiproc_worker_utils.py:226]     raise RuntimeError(
(VllmWorkerProcess pid=294) ERROR 08-14 16:36:35 multiproc_worker_utils.py:226] RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
(VllmWorkerProcess pid=294) ERROR 08-14 16:36:35 multiproc_worker_utils.py:226] 

The key information is:
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
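For reference, the standard workaround for this error is to make sure worker processes are started with the 'spawn' method before CUDA is ever touched. A minimal sketch, assuming the server is launched from a Python entry point; VLLM_WORKER_MULTIPROC_METHOD is vLLM's own switch for its worker processes, and whether api-for-open-llm exposes it directly is an assumption:

```python
# Minimal sketch: force the 'spawn' start method before any CUDA initialization.
# Assumption: this runs in the process that launches the vLLM engine / API server.
import os
import multiprocessing as mp

# vLLM reads this env var to decide how to start its worker processes.
os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"

if __name__ == "__main__":
    # Must be called once, before any Process/Pool is created and before CUDA is used.
    mp.set_start_method("spawn", force=True)
    # ... start the API server / vLLM engine here ...
```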

Dependencies

# Please paste the dependencies here

Runtime logs or screenshots

# Please paste the run log here

[Screenshot attached: 微信截图_20240815003843]

@xusenlinzy
Owner

Can it start on a single GPU?

@syusama
Author

syusama commented Aug 16, 2024

> Can it start on a single GPU?

On a single GPU that error no longer appears, but there is a new one: it reports insufficient GPU memory.
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 6.30 GiB. GPU 0 has a total capacity of 23.64 GiB of which 4.28 GiB is free. Process 9684 has 19.35 GiB memory in use. Of the allocated memory 18.84 GiB is allocated by PyTorch, and 56.99 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
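As a hedged sketch of the usual mitigations (not necessarily how api-for-open-llm wires its configuration): lower vLLM's gpu_memory_utilization and/or cap max_model_len so the KV cache fits, and set the allocator option the error message itself suggests. The parameter names below are vLLM's; the exact values are illustrative:

```python
# Sketch only: reduce vLLM's memory footprint on a 24 GiB GPU.
import os

# Suggested by the error message itself, to reduce fragmentation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

from vllm import LLM

llm = LLM(
    model="Qwen/Qwen2-7B-Instruct",
    gpu_memory_utilization=0.85,  # default is 0.9; leave headroom for other processes
    max_model_len=8192,           # smaller context window -> smaller KV cache
)
```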

But the problem is that these are RTX 4090s with 24 GB of VRAM, and the GPU memory is essentially empty: no services are running and nothing is occupying it.
```
ps@ps:~/AI/projects/api-for-open-llm$ nvidia-smi
Fri Aug 16 11:59:05 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.107.02             Driver Version: 550.107.02     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        On  |   00000000:18:00.0 Off |                  Off |
| 46%   41C    P2             54W / 450W  |    1656MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 4090        On  |   00000000:3B:00.0 Off |                  Off |
| 44%   33C    P8             26W / 450W  |      15MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA GeForce RTX 4090        On  |   00000000:86:00.0 Off |                  Off |
| 45%   33C    P8             12W / 450W  |      15MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA GeForce RTX 4090        On  |   00000000:AF:00.0 Off |                  Off |
| 45%   33C    P8             21W / 450W  |      15MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      5116      G   /usr/lib/xorg/Xorg                              4MiB |
|    0   N/A  N/A     21666      C   python                                       1638MiB |
|    1   N/A  N/A      5116      G   /usr/lib/xorg/Xorg                              4MiB |
|    2   N/A  N/A      5116      G   /usr/lib/xorg/Xorg                              4MiB |
|    3   N/A  N/A      5116      G   /usr/lib/xorg/Xorg                              4MiB |
+-----------------------------------------------------------------------------------------+
```
This is the GPU usage. I have tried lowering MAX_NUM_SEQS and running torch.cuda.empty_cache() to free memory, but the same error still occurs.
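A quick way to see what each GPU actually reports as free, run from inside the same environment the server uses (that environment is an assumption here). Note that torch.cuda.empty_cache() only releases memory cached by the calling process; it cannot reclaim the ~1.6 GiB held by the other python process (PID 21666) on GPU 0 in the output above:

```python
# Sketch: report free/total memory per GPU as PyTorch sees it.
import torch

for i in range(torch.cuda.device_count()):
    free, total = torch.cuda.mem_get_info(i)  # returns (free_bytes, total_bytes)
    print(f"GPU {i}: {free / 1024**3:.2f} GiB free of {total / 1024**3:.2f} GiB")
```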

@syusama
Author

syusama commented Aug 21, 2024

Deploying the LLM alone works fine, but deploying the LLM and the embedding model at the same time triggers this error. Opened a new issue: #308
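If the two services end up competing for the same card, one common workaround (an assumption about this deployment, not something confirmed in the issue) is to pin each process to its own GPU before torch or vllm is imported:

```python
# Sketch: pin this process (e.g. the embedding server) to GPU 1 so it does not
# compete with the LLM server on GPU 0. Must run before importing torch/vllm.
import os

os.environ.setdefault("CUDA_VISIBLE_DEVICES", "1")

import torch  # imported after the env var on purpose

print(torch.cuda.device_count())  # should report 1 visible device
```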

@syusama syusama closed this as completed Aug 21, 2024