[Spec Decode][CI] Add e2e test for examples/spec_decode.py and prevent breaking Acceptance Length
#24531
Conversation
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Code Review
This pull request introduces an end-to-end test for the spec_decode.py example script, integrating it into the CI pipeline to safeguard against regressions in speculative decoding acceptance length. The approach of refactoring the script for testability and adding assertions for a fixed test case is sound. My review focuses on improving the robustness and maintainability of these new tests. I've identified a missing assertion for a critical test parameter and a formatting issue in an assertion message that could hinder debugging. Addressing these points will make the new test more reliable.
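The kind of regression check discussed in this review can be sketched as follows. This is a minimal illustration, not the code added by the PR: the helper names `acceptance_length` and `check_acceptance_length`, the 5% tolerance, and the sample counts are all hypothetical, and it assumes acceptance length is defined as one target-model token per draft step plus the mean number of accepted draft tokens.

```python
import math


def acceptance_length(num_drafts: int, num_accepted_tokens: int) -> float:
    """Mean tokens emitted per draft step.

    Assumed definition: 1 token from the target model per step,
    plus the average number of accepted draft tokens per step.
    """
    if num_drafts <= 0:
        raise ValueError("num_drafts must be positive")
    return 1.0 + num_accepted_tokens / num_drafts


def check_acceptance_length(
    measured: float, expected: float, rel_tol: float = 0.05
) -> None:
    """Fail with a readable message if measured AL drifts from the expected value."""
    assert math.isclose(measured, expected, rel_tol=rel_tol), (
        f"acceptance length {measured:.3f} deviates from expected "
        f"{expected:.3f} by more than {rel_tol:.0%}"
    )


# Hypothetical counts, for illustration only.
al = acceptance_length(num_drafts=1000, num_accepted_tokens=1250)
check_acceptance_length(al, expected=2.25)
```

Formatting both the measured and expected values in the assertion message keeps a CI failure readable without rerunning the script, which is the debugging concern raised in the review.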
Great job maintaining this! I think it is worthwhile to keep the example scripts valid through CI, since they may be where people get started.
Link to #22992 for visibility.
It's not clear to me if this should be EAGLE1 or EAGLE3 or both, but in any case this is good to have.
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
This pull request has merge conflicts that must be resolved before it can be merged.
Head branch was pushed to by a user without write access
…ent breaking Acceptance Length (vllm-project#24531) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com> Co-authored-by: Roger Wang <hey@rogerw.io>
…ent breaking Acceptance Length (#24531) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com> Co-authored-by: Roger Wang <hey@rogerw.io> Signed-off-by: yewentao256 <zhyanwentao@126.com>
…ent breaking Acceptance Length (vllm-project#24531) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com> Co-authored-by: Roger Wang <hey@rogerw.io> Signed-off-by: gaojc <1055866782@qq.com>
…ent breaking Acceptance Length (vllm-project#24531) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com> Co-authored-by: Roger Wang <hey@rogerw.io> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
In the past, examples/offline_inference/spec_decode.py has broken a few times due to changes in datasets or in other places, like this. The script is important because it allows measuring acceptance length (AL) for speculative decoding (SD) methods. This PR adds the script to CI and ensures 2 things
Testing
cmd
time python3 examples/offline_inference/spec_decode.py --test --method eagle --num_spec_tokens 3 --dataset-name hf --dataset-path philschmid/mt-bench --num-prompts 80 --temp 0 --top-p 1.0 --top-k -1 --tp 1 --enable-chunked-prefill
Output