Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add documentation on how to do incremental builds #2796

Merged
merged 7 commits into from
Feb 7, 2024

Conversation

pcmoritz
Copy link
Collaborator

@pcmoritz pcmoritz commented Feb 7, 2024

On my system, this reduces compilation time from 2:40 min to 0:30 min, therefore enabling faster development iterations.

@pcmoritz
Copy link
Collaborator Author

pcmoritz commented Feb 7, 2024

One thing I noticed is the old python setup.py develop is also doing fast incremental compilation, but unfortunately that's deprecated and probably most people don't use it any more (e.g. it prints a big warning banner that it shouldn't be used any more and links to pypa/setuptools#917). It also gives 30s for the build.

The reason why python setup.py develop is fast is that it uses a stable temporary directory for the build, and it also uses the system torch path by default and doesn't copy the files into an unstable temporary directory like pip install -e . does.

Instead of doing something like in the PR, we could encourage people to use python setup.py develop, e.g. by leaving a comment about incremental compilation in setup.py, or by including it in the docs.

Copy link
Member

@zhuohan123 zhuohan123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feature! Left a small comment

setup.py Outdated
@@ -22,6 +22,9 @@
ROCM_SUPPORTED_ARCHS = {"gfx90a", "gfx942"}
# SUPPORTED_ARCHS = NVIDIA_SUPPORTED_ARCHS.union(ROCM_SUPPORTED_ARCHS)

if "VLLM_INCREMENTAL_BUILD_TORCH_PATH" in os.environ:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some document on how to use this? Setting the env var to

VLLM_INCREMENTAL_BUILD_TORCH_PATH=`python -c "import torch; print(torch.__path__[0])"`

is confusing to other users.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I added the docs, let me know what you think :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also tried to do VLLM_INCREMENTAL_BUILD=1 to make it easier btw, but I couldn't figure out how to get the global torch path from within setup.py -- I tried both via python utilities and using subprocess (both with and without shell), but I think there is some environment variable set which points the package dir to the temporary directory. It is probably best not to try to hack around this further :)

@pcmoritz
Copy link
Collaborator Author

pcmoritz commented Feb 7, 2024

I was thinking about this a little more, maybe it is best to just document ’python setup.py develop' since it has even less overhead. The other approach uses an internal variable in torch so not clear it is better. I'll make that change.

@pcmoritz pcmoritz changed the title Optionally enable incremental build Add documentation on how to do incremental builds Feb 7, 2024
Copy link
Member

@zhuohan123 zhuohan123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks!

@zhuohan123 zhuohan123 merged commit 931746b into vllm-project:main Feb 7, 2024
17 checks passed
yhu422 added a commit to yhu422/vllm that referenced this pull request Feb 13, 2024
[ROCm] Fix build problem resulted from previous commit related to FP8 kv-cache support  (vllm-project#2790)

Add documentation on how to do incremental builds (vllm-project#2796)

[Ray] Integration compiled DAG off by default (vllm-project#2471)

Disable custom all reduce by default (vllm-project#2808)

add usage context

removed usage_context from Engine_args

Move IO to another process

added http request

[ROCm] support Radeon™ 7900 series (gfx1100) without using flash-attention (vllm-project#2768)

Add documentation section about LoRA (vllm-project#2834)

Refactor 2 awq gemm kernels into m16nXk32 (vllm-project#2723)

Co-authored-by: Chunan Zeng <chunanzeng@Chunans-Air.attlocal.net>

Added additional arg for from_engine_args

comments
alexm-neuralmagic pushed a commit to neuralmagic/nm-vllm that referenced this pull request Feb 13, 2024
jvmncs pushed a commit to jvmncs/vllm that referenced this pull request Feb 14, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Feb 20, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Feb 22, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Mar 4, 2024
Temirulan pushed a commit to Temirulan/vllm-whisper that referenced this pull request Sep 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants