Call for Help: Proper Build System (CMake, Bazel, etc). #2654
Comments
@rgommers would meson-python support this? It checks most of the boxes, but I'm not sure about support for multiple accelerator hardware targets. If not, I'm hoping you might have some experience and opinions you'd be willing to share.
The question here isn't very clear to me; I guess I'm missing context. Reading all the requirements, it sounds like you need a regular build system (CMake or Meson are the most commonly used and best general-purpose options). However, if you're already using the PyTorch extension builder, it sounds like that is something you do on the fly (maybe exposed to end users?), which is a very different use case.
Ah good to clarify here. We are really just looking for a regular build system to replace current usage of Torch extension builders.
@simon-mo, the team from Neural Magic is going to work on this
Hi all, I wanted to give an update on this project. So far, I've got a CUDA build working (see PR #2830). The PR has a detailed description of the CMake system. I'm still working on the AMD/ROCm build, which is a little trickier because of the "hipify" preprocessor that PyTorch uses on the CUDA sources.
Currently vLLM's compilation tool uses PyTorch's extension builders, which call Ninja under the hood. This works okay but has the following issues:
We would like to ask for the community's help in recommending a technology, prototyping it, and implementing it. Ideally something like CMake or Bazel could work, but it requires some careful thinking.
The requirements:
MAX_JOBS and NVCC_THREADS to get around the compiler running out of memory. I think this is because nvcc spawns threads for each SM architecture we are compiling for.
Currently, the "build system" is all in here: https://github.com/vllm-project/vllm/blob/main/setup.py
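To make the MAX_JOBS / NVCC_THREADS trade-off above concrete, here is a rough sketch of how one might pick values that stay inside a memory budget. It assumes that MAX_JOBS caps the number of parallel compile jobs and NVCC_THREADS caps the threads each nvcc invocation uses, so the worst case is roughly MAX_JOBS * NVCC_THREADS compiler threads at once. The helper names and the gib_per_thread estimate are hypothetical, not part of vLLM or PyTorch, and the real memory cost per thread would need to be measured.

```python
import os


def pick_max_jobs(mem_budget_gib, nvcc_threads, gib_per_thread=2.0, cpu_count=None):
    """Return a MAX_JOBS value that keeps the estimated peak memory in budget.

    Assumption (hypothetical): each nvcc thread needs about `gib_per_thread`
    GiB, so peak usage is roughly max_jobs * nvcc_threads * gib_per_thread.
    """
    if nvcc_threads < 1:
        raise ValueError("nvcc_threads must be >= 1")
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # Jobs allowed by the memory budget.
    by_memory = int(mem_budget_gib // (nvcc_threads * gib_per_thread))
    # Jobs allowed without oversubscribing the CPU.
    by_cpu = cpu_count // nvcc_threads
    # Always allow at least one job so the build can make progress.
    return max(1, min(by_memory, by_cpu))


def build_env(mem_budget_gib, nvcc_threads, **kwargs):
    """Copy the current environment and set the two build knobs on the copy."""
    env = dict(os.environ)
    env["MAX_JOBS"] = str(pick_max_jobs(mem_budget_gib, nvcc_threads, **kwargs))
    env["NVCC_THREADS"] = str(nvcc_threads)
    return env


if __name__ == "__main__":
    env = build_env(mem_budget_gib=32, nvcc_threads=4, cpu_count=16)
    print("MAX_JOBS =", env["MAX_JOBS"], "NVCC_THREADS =", env["NVCC_THREADS"])
```

The resulting dict could be passed as env= to subprocess.run when launching the build. A proper build system would ideally make this kind of manual tuning unnecessary.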