Lookahead Decoding Development Roadmap #13

Open
2 of 9 tasks
Viol2000 opened this issue Nov 24, 2023 · 6 comments

Comments

@Viol2000
Collaborator

Viol2000 commented Nov 24, 2023

Software Quality

Implementation

  • Support FlashAttention
  • Support Sampling
  • Support Batch>1
  • Lookahead window KV-Cache (May hurt accuracy)
  • Verification branch trie

New Models

@qspang

qspang commented Jan 16, 2024

Does this project support the vicuna model?

@Viol2000
Collaborator Author

> Does this project support the vicuna model?

Vicuna is already supported because it is based on LlamaForCausalLM.
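
To illustrate the point above, here is a minimal sketch of how one might check whether a checkpoint falls under the same architecture. It assumes the model ships a Hugging Face style `config.json` with an `architectures` field. The `SUPPORTED_ARCHITECTURES` set and the `is_supported` helper are hypothetical names for this example, not part of the project's API, and the set only contains the architecture named in this thread.

```python
import json

# Assumption: only the architecture mentioned in this thread is listed.
SUPPORTED_ARCHITECTURES = {"LlamaForCausalLM"}


def is_supported(config_json: str) -> bool:
    """Return True if the config declares an architecture in the supported set."""
    config = json.loads(config_json)
    declared = config.get("architectures") or []
    return any(name in SUPPORTED_ARCHITECTURES for name in declared)


# Hypothetical, trimmed-down config.json contents for illustration only.
llama_like = '{"architectures": ["LlamaForCausalLM"], "model_type": "llama"}'
other = '{"architectures": ["SomeOtherForCausalLM"]}'

print(is_supported(llama_like))  # True
print(is_supported(other))       # False
```

A model that passes this check is not guaranteed to work; the check only mirrors the reasoning given in the reply.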

@qspang

qspang commented Jan 16, 2024

Thank you for your reply! Do you mean that I can use the following code to observe the acceleration effect on the vicuna model?
USE_LADE=1 python applications/chatbot.py --model_path meta-llama/vicuna-7b-v.13 --debug #no chat, with lookahead
USE_LADE=0 python applications/chatbot.py --model_path meta-llama/vicuna-7b-v.13 --debug #no chat, without lookahead

@Viol2000
Collaborator Author

> Thank you for your reply! Do you mean that I can use the following code to observe the acceleration effect on the vicuna model?
> USE_LADE=1 python applications/chatbot.py --model_path meta-llama/vicuna-7b-v.13 --debug #no chat, with lookahead
> USE_LADE=0 python applications/chatbot.py --model_path meta-llama/vicuna-7b-v.13 --debug #no chat, without lookahead

Yes, but the model path should be lmsys/vicuna-7b-v1.3.
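
For anyone who prefers to script the comparison, the two commands with the corrected model path could be wrapped roughly as follows. This is only a sketch: `build_run` and `run_both` are hypothetical helper names, and it assumes the script is started from the repository root so that `applications/chatbot.py` resolves.

```python
import os
import subprocess
import sys

MODEL_PATH = "lmsys/vicuna-7b-v1.3"  # corrected path from this thread


def build_run(use_lade: bool, model_path: str = MODEL_PATH):
    """Return (argv, env) for one chatbot.py run, mirroring the shell commands."""
    argv = [sys.executable, "applications/chatbot.py",
            "--model_path", model_path, "--debug"]
    # USE_LADE=1 enables lookahead decoding, USE_LADE=0 disables it.
    env = dict(os.environ, USE_LADE="1" if use_lade else "0")
    return argv, env


def run_both():
    """Run once with lookahead, then once without (hypothetical wrapper)."""
    for use_lade in (True, False):
        argv, env = build_run(use_lade)
        subprocess.run(argv, env=env, check=True)
```

Calling `run_both()` launches the two runs back to back; nothing is executed on import.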

@qspang

qspang commented Jan 16, 2024

Got it! Thank you again for your reply!

@LiweiPE

LiweiPE commented Mar 6, 2024

Hi, I'm really interested in this decoding development. Is there any progress on integrating it with the Qwen model? Thanks.
