-
-
Notifications
You must be signed in to change notification settings - Fork 5.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bugfix] Fix speculative decode seeded test #6743
Conversation
Co-authored-by: Thomas Parnell <tpa@zurich.ibm.com>
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge). To run full CI, you can do one of these:
🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Have confirmed that either of the following changes causes the test to fail:
diff --git a/vllm/spec_decode/spec_decode_worker.py b/vllm/spec_decode/spec_decode_worker.py
index 8cf0aa5b..17e54a31 100644
--- a/vllm/spec_decode/spec_decode_worker.py
+++ b/vllm/spec_decode/spec_decode_worker.py
@@ -598,11 +598,14 @@ class SpecDecodeWorker(LoraNotSupportedWorkerBase):
# Get sequence group state
generators = []
for seq_group_metadata in seq_group_metadata_list:
+ '''
if (seq_group_metadata.state is not None
and seq_group_metadata.state.generator is not None):
generators.append(seq_group_metadata.state.generator)
else:
generators.append(None)
+ '''
+ generators.append(None)
or
diff --git a/vllm/spec_decode/batch_expansion.py b/vllm/spec_decode/batch_expansion.py
index 41f0aebf..df160134 100644
--- a/vllm/spec_decode/batch_expansion.py
+++ b/vllm/spec_decode/batch_expansion.py
@@ -293,6 +293,7 @@ class BatchExpansionTop1Scorer(SpeculativeScorer):
for data in new_seq_data_dict.values():
data.update_num_computed_tokens(data.get_len() - 1)
+ '''
if (seq_group_metadata.state is not None
and seq_group_metadata.state.generator is not None):
generator = torch.Generator(
@@ -301,6 +302,8 @@ class BatchExpansionTop1Scorer(SpeculativeScorer):
state = SequenceGroupState(generator=generator)
else:
state = None
+ '''
+ state = None
This is not the case on main
currently.
Signed-off-by: Alvant <alvasian@yandex.ru>
This is a simpler fix for the bug explained in #6733 where we were not properly testing the per-request seeds int he spec decoding case, @tdoublep and I had fixed it at the same time.