[WIP] ngram spec #2886

Open
wants to merge 8 commits into
base: main


XiaotongJiang
Contributor

MVP passing unit tests with the torch backend.

@yukavio
Collaborator

yukavio commented Jan 15, 2025

Could you please share some performance results?

@merrymercy
Contributor

  1. Can you compare with Speculative decoding with lookahead #2790?
  2. Can you use flashinfer backend?

@XiaotongJiang
Contributor Author

> 1. Can you compare with Speculative decoding with lookahead #2790?
> 2. Can you use flashinfer backend?

I think #2790 actually implements a more general approach; this PR only supports the single-branch strategy mentioned in that PR. I'll close this one.

@michaelfeil
Contributor

@XiaotongJiang The other branch implements lookahead decoding; yours is closer to tok-1 proposer n-gram prompt lookup decoding.
Both are interesting and they are pretty different. It would be useful to implement a more generic version of lookahead decoding that works with prompt lookup decoding.
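
For readers unfamiliar with the term, here is a minimal sketch of how an n-gram prompt lookup proposer typically works in the single-branch case: take the last few tokens of the context, find an earlier occurrence of the same n-gram, and propose the tokens that followed it as the draft. This is not the code in this PR; the function name `propose_ngram_draft` and the parameters `max_ngram` and `num_draft_tokens` are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def propose_ngram_draft(
    token_ids: Sequence[int],
    max_ngram: int = 3,
    num_draft_tokens: int = 4,
) -> List[int]:
    """Propose draft tokens by matching the trailing n-gram earlier in the context.

    Hypothetical helper for illustration only; names and defaults are assumptions.
    """
    n_tokens = len(token_ids)
    # Try the longest n-gram first, since longer matches are more specific.
    for n in range(min(max_ngram, n_tokens - 1), 0, -1):
        suffix = list(token_ids[n_tokens - n:])
        # Scan backwards so the most recent earlier occurrence wins.
        # Starting at n_tokens - n - 1 skips the suffix itself.
        for start in range(n_tokens - n - 1, -1, -1):
            if list(token_ids[start:start + n]) == suffix:
                # A single branch: the tokens that followed the earlier match.
                return list(token_ids[start + n:start + n + num_draft_tokens])
    # No earlier occurrence found: propose nothing and fall back to normal decoding.
    return []


if __name__ == "__main__":
    context = [1, 2, 3, 4, 5, 1, 2, 3]
    print(propose_ngram_draft(context))
```

The target model would then verify the proposed tokens in one forward pass and keep the longest accepted prefix. Lookahead decoding differs in that it generates and verifies multiple candidate n-grams rather than reading a single continuation out of the existing context.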

XiaotongJiang reopened this Feb 5, 2025

4 participants