
@luccafong
Collaborator

@luccafong luccafong commented Feb 25, 2025

No description provided.


@mergify

mergify bot commented Feb 25, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @luccafong.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Signed-off-by: Lu Fang <fanglu@fb.com>
@benchislett
Collaborator

Hi @luccafong, could you take a look at #13626? I would appreciate it if you could highlight any key differences between this implementation and the existing PR.

@luccafong
Collaborator Author

> Hi @luccafong, could you take a look at #13626? I would appreciate it if you could highlight any key differences between this implementation and the existing PR.

Hi @benchislett, I think #13626 targets the EAGLE style of running forward on the same module, while this PR targets prediction by running forward on k MTP modules (k > 1), as described in the paper, so the two are quite different.
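
To make the distinction concrete, here is a toy sketch of the two drafting loops. It is not code from either PR: `DraftModule`, `draft_same_module`, `draft_k_modules`, and `make_module` are hypothetical names, and a "module" is reduced to a function from a token list to one predicted token, which ignores hidden states and everything else a real MTP or EAGLE head consumes.

```python
from typing import Callable, List

# Hypothetical stand-in: a draft module maps a token sequence to one predicted token.
DraftModule = Callable[[List[int]], int]


def draft_same_module(module: DraftModule, context: List[int], k: int) -> List[int]:
    """Run the same module forward k times, feeding each draft token back in."""
    tokens = list(context)
    drafts = []
    for _ in range(k):
        nxt = module(tokens)
        drafts.append(nxt)
        tokens.append(nxt)
    return drafts


def draft_k_modules(modules: List[DraftModule], context: List[int]) -> List[int]:
    """Run k distinct modules in order; the i-th draft token comes from the i-th module."""
    tokens = list(context)
    drafts = []
    for module in modules:
        nxt = module(tokens)
        drafts.append(nxt)
        tokens.append(nxt)
    return drafts


def make_module(offset: int) -> DraftModule:
    # Toy "model": predicts the last token plus a fixed offset.
    return lambda toks: toks[-1] + offset


same = draft_same_module(make_module(1), [10], k=3)
multi = draft_k_modules([make_module(1), make_module(2), make_module(3)], [10])
print(same, multi)
```

In the first loop every draft token comes from one set of weights; in the second, each position has its own module, which is the k > 1 case this PR is aimed at.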

@benchislett
Collaborator

I hope that we can orchestrate compatibility between these two features in the future so that either one is possible. I think there is a lot of overlap between the contributions of each feature.

@luccafong
Collaborator Author

cc @LiuXiaoxuanPKU for early review. This will need more cleanup and benchmarking before publishing.


  # Prepare inputs for the next step
- if step != num_steps - 1:
+ if step != num_steps - 1 and not self.mtp:
Collaborator

Is the multi-step logic omitted here, and self.mtp is just using TP1DraftModelRunner in is_fallback mode?
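
For reference, the effect of the changed condition (`if step != num_steps - 1 and not self.mtp`) can be checked in isolation with a toy loop. This is only a sketch of the control flow, not the runner's code: `steps_that_prepare_next_inputs` is a hypothetical name, and the forward pass and sampling are elided.

```python
from typing import List


def steps_that_prepare_next_inputs(num_steps: int, mtp: bool) -> List[int]:
    """Return the step indices at which the 'prepare inputs' block would run."""
    prepared = []
    for step in range(num_steps):
        # ... model forward and sampling would happen here ...
        if step != num_steps - 1 and not mtp:
            prepared.append(step)
    return prepared


without_mtp = steps_that_prepare_next_inputs(3, mtp=False)
with_mtp = steps_that_prepare_next_inputs(3, mtp=True)
print(without_mtp, with_mtp)
```

With the flag set, the block is never entered at any step, so whatever prepares inputs for subsequent steps in the MTP path has to live elsewhere.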

  outputs.append(output)

- if model_input.attn_metadata.num_prefills == 0 \
+ if not self.mtp and model_input.attn_metadata.num_prefills == 0 \
Collaborator

Why is this block skipped?

@mergify

mergify bot commented Mar 27, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @luccafong.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added needs-rebase and removed tpu Related to Google TPUs labels Mar 27, 2025
@mergify mergify bot added tpu Related to Google TPUs and removed tpu Related to Google TPUs labels Apr 9, 2025
@mergify mergify bot added the deepseek Related to DeepSeek models label Jul 2, 2025