[Doc] Add vllm-ascend usage doc & fix doc format #53
No need to update the installation and quick start docs. They will be updated in a new PR.
OK.
Signed-off-by: Shanshan Shen <87969357+shen-shanshan@users.noreply.github.com>
Yikun left a comment
Overall, it has been greatly improved compared to the previous version, thank you!
docs/source/tutorials.md
Outdated
    ```bash
    # Use Modelscope mirror to speed up model download
    export VLLM_USE_MODELSCOPE=True
    export MODELSCOPE_CACHE=/root/models/
Suggested change:
    - export MODELSCOPE_CACHE=/root/models/
You can use the default cache instead by mounting it: -v /root/.cache:/root/.cache
docs/source/tutorials.md
Outdated
    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
    -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    -v /root/models:/root/models \
Suggested change:
    - -v /root/models:/root/models \
    + -v /root/.cache:/root/.cache \
docs/source/tutorials.md
Outdated
    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
    -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    -v /root/models:/root/models \
Suggested change:
    - -v /root/models:/root/models \
    + -v /root/.cache:/root/.cache \
docs/source/tutorials.md
Outdated
    -v /root/models:/root/models \
    -p 8000:8000 \
    -e VLLM_USE_MODELSCOPE=True \
    -e MODELSCOPE_CACHE=/root/models/ \
Suggested change:
    - -e MODELSCOPE_CACHE=/root/models/ \
docs/source/tutorials.md
Outdated
    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
    -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    -v /root/models:/root/models \
Suggested change:
    - -v /root/models:/root/models \
    + -v /root/.cache:/root/.cache \
docs/source/tutorials.md
Outdated
    ```bash
    # Use Modelscope mirror to speed up model download
    export VLLM_USE_MODELSCOPE=True
    export MODELSCOPE_CACHE=/root/models/
Suggested change:
    - export MODELSCOPE_CACHE=/root/models/
docs/source/tutorials.md
Outdated
    def clean_up():
        destroy_model_parallel()
        destroy_distributed_environment()
        gc.collect()
        torch.npu.empty_cache()
This looks a little weird. Would you mind taking a look? @wangxiyuan
Since this is only a simple example, there is no need to do:
    del llm
    clean_up()
When using mp as the distributed_executor_backend, the clean-up must be done by hand; otherwise an error is raised when the process exits. This is a bug in vLLM.
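For reference, the manual clean-up discussed in this thread could look roughly like the sketch below. It reuses the `clean_up()` body from the diff and the model name from the PR description. The import path `vllm.distributed.parallel_state`, the `tensor_parallel_size=2` setting, the prompt, and the sampling values are assumptions for illustration, not taken from the tutorial, and `torch.npu` is assumed to be made available by the Ascend plugin.

```python
import gc


def clean_up() -> None:
    """Tear down vLLM's distributed state and free cached NPU memory."""
    # Imported lazily so this file can be loaded on hosts without vLLM.
    import torch
    from vllm.distributed.parallel_state import (
        destroy_distributed_environment,
        destroy_model_parallel,
    )

    destroy_model_parallel()
    destroy_distributed_environment()
    gc.collect()
    # Assumption: torch.npu is registered by torch_npu via vllm-ascend.
    torch.npu.empty_cache()


def main() -> None:
    from vllm import LLM, SamplingParams

    # Assumption: two NPUs are visible; adjust tensor_parallel_size as needed.
    llm = LLM(
        model="Qwen/Qwen2.5-7B-Instruct",
        tensor_parallel_size=2,
        distributed_executor_backend="mp",
    )

    # Hypothetical prompt and sampling values, for illustration only.
    outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=32))
    for output in outputs:
        print(output.outputs[0].text)

    # Drop the engine reference first, then release resources by hand.
    del llm
    clean_up()
```

On a host with vllm-ascend installed, `main()` would be called under an `if __name__ == "__main__":` guard, since the mp backend starts worker processes.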
Signed-off-by: Shanshan Shen <87969357+shen-shanshan@users.noreply.github.com>
After this PR is merged, please also backport it to the v0.7.1 branch.
### What this PR does / why we need it?

1. Add a vllm-ascend tutorial doc for serving the Qwen/Qwen2.5-7B-Instruct model.
2. Fix the format of files in the `docs` dir, e.g. format tables, add underline for links, add line feed...

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Doc CI passed.

---------

Signed-off-by: Shanshan Shen <87969357+shen-shanshan@users.noreply.github.com>
fix bugs caused by variable name old_placemet
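What this PR does / why we need it?
1. Add a vllm-ascend tutorial doc for serving the Qwen/Qwen2.5-7B-Instruct model.
2. Fix the format of files in the `docs` dir, e.g. format tables, add underline for links, add line feed...
Does this PR introduce any user-facing change?
no.
How was this patch tested?
no.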