[Model] Pipeline parallel support for Mixtral #6516
Conversation
👋 Hi! Thank you for contributing to the vLLM project. A full CI run is still required to merge this PR, so once the PR is ready to go, please make sure to run it. If you need all test signals in between PR commits, you can trigger full CI as well. To run full CI, you can do one of these:
Force-pushed from f83603e to d74f2e6.
Tested locally with PP=8 and it worked.
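For reference, a minimal sketch of such a local smoke test (assuming this branch exposes PP through the offline LLM entrypoint, 8 visible GPUs, and the Mixtral checkpoint named below — none of which are quoted from this PR):

```python
# Sketch only; assumes 8 GPUs and access to the Mixtral checkpoint below.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # assumed checkpoint name
    tensor_parallel_size=1,
    pipeline_parallel_size=8,  # PP=8: one pipeline stage per GPU
)
outputs = llm.generate(
    ["The capital of France is"],
    SamplingParams(temperature=0.0, max_tokens=16),
)
print(outputs[0].outputs[0].text)
```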
Can you test the correctness locally, using https://github.com/vllm-project/vllm/blob/main/tests/distributed/test_pipeline_parallel.py?
Passed with the following configurations. Note that I tested it on 8xL4, so I had to use 8 GPUs to host the model.
Also fixed some issues in the test file:
```python
# Use the same number or at most 8 GPUs to hold the model.
# In this test we assume the model can fit in 8 GPUs.
str(min(TP_SIZE * PP_SIZE, 8)),
```
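To make the cap concrete, a quick illustrative check (the (TP, PP) pairs below are examples, not the test's actual parametrization):

```python
# Illustrative only: how the cap above resolves for sample (TP, PP) pairs.
for tp, pp in [(1, 2), (2, 2), (1, 8), (2, 8)]:
    print(f"TP={tp} PP={pp} -> GPUs={min(tp * pp, 8)}")
# TP=1 PP=2 -> GPUs=2
# TP=2 PP=2 -> GPUs=4
# TP=1 PP=8 -> GPUs=8
# TP=2 PP=8 -> GPUs=8  (capped at 8)
```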
It's not going to work: this will run in multi-node tests with the mp backend, where we can use at most 2 GPUs.
You can revert this change and keep it only for your local testing.
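In other words, a sketch of what the reverted line could look like (the exact surrounding code in the test file is an assumption, not a quote from this PR):

```python
# CI path: request exactly TP_SIZE * PP_SIZE GPUs (multi-node CI has at most 2).
str(TP_SIZE * PP_SIZE),
# For local runs on an 8-GPU machine, the cap can be kept as a comment:
# str(min(TP_SIZE * PP_SIZE, 8)),
```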
Reverted with comments.
LGTM if tests pass
Taken from #6403. Co-authored by @binxuan.
cc @youkaichao