Multiple GPUs #47
Quick fix: in https://github.com/hao-ai-lab/LookaheadDecoding/blob/main/applications/eval_mtbench.py#L511, set DIST_WORKERS=0 on that line. I will do a refactor later to fix this thoroughly.
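To make the workaround concrete, here is a minimal sketch of the edit; the surrounding code at that line is an assumption, not a quote from the repository:

```python
# applications/eval_mtbench.py, around the linked line:
# temporarily force the number of distributed lookahead workers to zero
# so the script does not try to manage multi-GPU workers itself.
DIST_WORKERS = 0
```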
It is now running normally. Thanks to the author for his timely reply!!!
Btw, do you have code for testing accuracy like the Medusa project? https://github.com/FasterDecoding/Medusa/blob/v1.0-prerelease/medusa/eval/heads_accuracy.py
I did not implement such a function, but it should not be too hard to compute accuracy by dividing the number of accepted tokens per step by the number of speculated tokens per step.
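For reference, a minimal sketch of that calculation, assuming per-step counts are recorded during decoding; the variable names here are hypothetical and not taken from the repository:

```python
# Hypothetical per-step statistics collected during lookahead decoding:
# accepted[i]   = number of speculated tokens accepted at step i
# speculated[i] = number of tokens speculated (guessed) at step i
accepted = [3, 1, 4, 2]
speculated = [7, 7, 7, 7]

# Per-step acceptance rate and the overall acceptance rate.
per_step_rate = [a / s for a, s in zip(accepted, speculated)]
overall_rate = sum(accepted) / sum(speculated)

print("per-step:", [round(r, 2) for r in per_step_rate])
print("overall :", round(overall_rate, 3))
```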
When USE_LADE is set to 0, please set --use-pp=1 to use Hugging Face's pipeline parallelism, or use DeepSpeed for tensor parallelism as in the script I provided: https://github.com/hao-ai-lab/LookaheadDecoding/blob/main/applications/run_mtbench.sh#L33
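As a rough illustration of the DeepSpeed tensor-parallel path (a generic sketch of DeepSpeed inference usage, not the repository's actual code; the model path is a placeholder):

```python
# Launch with the DeepSpeed launcher, e.g. `deepspeed --num_gpus 2 this_script.py`.
import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "meta-llama/Llama-2-70b-chat-hf"  # placeholder model path
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)

# mp_size = number of GPUs the model weights are sharded over (tensor parallelism).
engine = deepspeed.init_inference(
    model,
    mp_size=2,
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)
model = engine.module  # use this sharded model for generation
```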
Fixed it! Thank you for your patient reply! If I use a smaller model such as llama2-7b-chat and set USE_LADE to 0, what is the impact of also setting --use-pp to 0?
I guess it will be placed on a single GPU. Check https://github.com/hao-ai-lab/LookaheadDecoding/blob/main/applications/eval_mtbench.py#L214 and https://github.com/hao-ai-lab/LookaheadDecoding/blob/main/applications/eval_mtbench.py#L250 for the model placement configuration.
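For context, a rough sketch of the two placement paths being contrasted; this is the generic Hugging Face pattern and an assumption about what those lines do, not the exact code behind the links:

```python
import torch
from transformers import AutoModelForCausalLM

def load_model(model_path: str, use_pp: bool):
    if use_pp:
        # --use-pp=1: let accelerate split the layers across all visible GPUs.
        return AutoModelForCausalLM.from_pretrained(
            model_path, torch_dtype=torch.float16, device_map="auto"
        )
    # --use-pp=0: the whole model goes on one GPU and must fit there.
    return AutoModelForCausalLM.from_pretrained(
        model_path, torch_dtype=torch.float16
    ).to("cuda:0")
```

In fp16, a 7B model is roughly 14 GB of weights, so with --use-pp=0 it should still fit comfortably on a single A800.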
OK, got it!
When I tested llama2-70b on a single A800 GPU, I ran into insufficient GPU memory. How should I write the command if I want to test on two A800 GPUs? I tried this command:
But an error was reported: