[Usage]: RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method #8392

TianSongS · 2024-09-12T01:54:55Z

I am using the same service deployment command as before, but after upgrading from 0.5.5 to 0.6.1 today, the deployment fails with the error in the title.

youkaichao · 2024-09-12T01:59:44Z

This should be fixed by #8390.

Can you give it a try?

BrandonStudio · 2024-09-13T04:46:54Z

@youkaichao How can I install it? 0.6.1.post1 does not seem to be released

youkaichao · 2024-09-13T05:01:12Z

Follow https://docs.vllm.ai/en/latest/getting_started/installation.html to install the wheel for the latest commit, or wait for the release (#8440), which should be out soon.
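For context on the error message itself: PyTorch raises it when a process that has already initialized CUDA forks a child and the child then touches CUDA. The snippet below is a minimal sketch of the `spawn` start method the message refers to, using only the standard library. It is not vLLM's code and not the change made in #8390; `worker` and `make_spawn_context` are hypothetical names used for illustration.

```python
import multiprocessing as mp


def worker(rank: int) -> None:
    # Hypothetical worker. In a real job, CUDA would be initialized here,
    # inside the freshly spawned child, instead of being inherited from
    # a parent process that already touched the GPU.
    print(f"worker {rank} running")


def make_spawn_context():
    # get_context("spawn") returns a context bound to the spawn start
    # method without changing the process-wide default.
    return mp.get_context("spawn")


if __name__ == "__main__":
    # The main guard matters with spawn: the child re-imports this module,
    # and the guard keeps it from launching processes recursively.
    ctx = make_spawn_context()
    p = ctx.Process(target=worker, args=(0,))
    p.start()
    p.join()
```

vLLM also reads a `VLLM_WORKER_MULTIPROC_METHOD` environment variable that selects the start method for its worker processes. Whether setting it to `spawn` avoids this particular regression is not confirmed in this thread.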

TianSongS added the "usage" (How to use vllm) label Sep 12, 2024

DarkLight1337 mentioned this issue Sep 12, 2024 in "Fix the AMD weight loading tests" #8390 (Merged)

youkaichao closed this as completed in #8390 Sep 12, 2024
