[doc] fix feature support #70
|---------|-----------|------|
| Chunked Prefill | ✗ | Plan in 2025 Q1 |
| Automatic Prefix Caching | ✅ | Improve performance in 2025 Q1 |
| Automatic Prefix Caching | ✅ | Improve performance in 2025 Q2 (Not supported in release version) |
Docs on the main branch should only describe the main branch.
Thanks! Done.
| Best of | ✅ | |
| Beam search | ✅ | |
| Guided Decoding | ✗ | Plan in 2025 Q1 |
| Tensor Parallel | ✅ | Only "mp" in main ("ray" and "mp" in release version) |
same
done
Signed-off-by: MengqingCao <cmq0113@163.com>
a9c93a6 to 25ec3a7
Check and update the feature support table.
- Both multi-step and speculative decoding require adaptation of the corresponding workers
- Prompt adapter (finetune method) requires adaptation in worker.py and model_runner.py

Signed-off-by: MengqingCao <cmq0113@163.com>
add warm up & batch add
see #60