Description
Motivation.
We have suffered a lot from the module pynvml recently, see #12847 for example.
libnvml.so is the library behind nvidia-smi, and pynvml is a Python wrapper around it. We use it to get GPU status without initializing a CUDA context in the current process.
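As a rough illustration of that usage, the sketch below lists GPU names through NVML without touching CUDA. The helper name list_gpu_names is hypothetical and not part of vLLM; the sketch assumes the pynvml module comes from nvidia-ml-py and returns an empty list when the module or the driver library is missing.

```python
from typing import List


def list_gpu_names() -> List[str]:
    """Return GPU names reported by NVML, or [] if NVML is unavailable.

    Hypothetical helper for illustration only; not vLLM's actual code.
    """
    try:
        import pynvml  # assumed to be the module shipped by nvidia-ml-py
    except ImportError:
        return []

    try:
        # Loads libnvml.so; no CUDA context is created in this process.
        pynvml.nvmlInit()
    except Exception:
        # Broad on purpose: a missing driver or a shadowed module both land here.
        return []

    names: List[str] = []
    try:
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            name = pynvml.nvmlDeviceGetName(handle)
            # Older wrapper versions return bytes rather than str.
            names.append(name.decode() if isinstance(name, bytes) else name)
    except Exception:
        return []
    finally:
        pynvml.nvmlShutdown()
    return names
```

On a machine without an NVIDIA driver this simply returns an empty list, which keeps the caller free of NVML-specific error handling.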
Historically, there are two packages that provide a module named pynvml:
- nvidia-ml-py (https://pypi.org/project/nvidia-ml-py/): The official wrapper. It is a dependency of vLLM, and is installed when users install vLLM. It provides a Python module named pynvml.
- pynvml (https://pypi.org/project/pynvml/): An unofficial wrapper. Prior to version 12.0, it also provides a Python module named pynvml, and therefore conflicts with the official one. What's worse, the module is a Python package, and has higher priority than the official one, which is a standalone Python file. This causes errors when both of them are installed. Starting from version 12.0, it migrates to a new module named pynvml_utils to avoid the conflict.
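The conflict between the two distributions can be checked from installed package metadata. The following is a minimal sketch using only the standard library; the function names installed_version and shadows_official_module are hypothetical, and the version check assumes the version string starts with a numeric major component.

```python
import re
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def shadows_official_module(pynvml_dist_version: Optional[str]) -> bool:
    """True if this version of the unofficial 'pynvml' distribution ships
    its own 'pynvml' module, i.e. it is installed and older than 12.0."""
    if pynvml_dist_version is None:
        return False  # not installed, so nothing can shadow the official module
    match = re.match(r"\d+", pynvml_dist_version)
    if match is None:
        return False  # unparseable version; assumption: treat as no conflict
    return int(match.group()) < 12


if __name__ == "__main__":
    unofficial = installed_version("pynvml")
    official = installed_version("nvidia-ml-py")
    print(f"nvidia-ml-py: {official}, pynvml: {unofficial}")
    if shadows_official_module(unofficial):
        print("The unofficial pynvml (<12.0) shadows the official module.")
```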
To make vLLM work, we have to make sure that either the pynvml package is not installed, or it is at version 12.0 or higher. However, neither is a workable solution:
- Since vLLM is itself just a Python package, we cannot ask people to uninstall pynvml just to make it work.
- If we pin pynvml==12.0 as vLLM's dependency, it works for vLLM but breaks other libraries. Notably, deepspeed depends on pynvml==11.5.0: https://github.com/ray-project/ray/blob/9e3ec5972cd952d2b50f3b20abc24ced5abb8b54/python/requirements_compiled.txt#L1611
The module naming is so confusing that many community libraries don't know nvidia-ml-py is the official one. Many of them depend on pynvml, e.g. https://github.com/Sygil-Dev/sygil-webui/blob/d88fa9e8c4d9cefbbfb0b445ad79d4ddb85c8e36/requirements.txt#L17 . What's worse, even the official NVIDIA container nvcr.io/nvidia/pytorch:25.01-py3 uses the unofficial pynvml<12.0.
To summarize, we are in dependency hell because of these historically confusing package names.
Proposed Change.
To solve the problem, I propose to copy the code from nvidia-ml-py into vLLM, and use vllm.third_party.pynvml to import it. See #12963 for the prototype.
This solution only serves to rescue us from dependency hell; we don't need to maintain the copied code ourselves. If there are bug fixes in nvidia-ml-py in the future, we can periodically sync the code.
This is the first time we copy a whole package into vllm, so I'm creating a separate directory vllm/third_party to hold the code.
This RFC is for future reference, when we need to copy code into vllm/third_party.
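For reference, a call site could resolve the vendored copy along these lines. This is only a sketch: the helper load_module_or_none is hypothetical, and the module path vllm.third_party.pynvml is taken from the proposal above rather than from the merged code.

```python
import importlib
from types import ModuleType
from typing import Optional

# Module path proposed in this RFC; assumed here, not verified against vLLM.
VENDORED_NVML = "vllm.third_party.pynvml"


def load_module_or_none(module_name: str = VENDORED_NVML) -> Optional[ModuleType]:
    """Import a module by dotted name, returning None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


if __name__ == "__main__":
    nvml = load_module_or_none()
    if nvml is None:
        print("Vendored NVML wrapper not available.")
    else:
        print(f"Using NVML wrapper from {nvml.__name__}")
```

Because the vendored module lives under the vllm namespace, whichever top-level pynvml happens to be installed in the environment no longer affects which wrapper vLLM uses.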