When evaluating MMLU, the codebase supports vLLM inference, but it is very slow: a single task takes about 20 minutes, whereas in my experience all tasks together normally finish in roughly 20 minutes.
Thank you for your question!
This is a known issue. The current architecture implements the BaseInference class for both deepspeed and vllm in the same Python file, so importing the deepspeed-related dependencies prevents vllm from starting properly. As a workaround, I set distributed_executor_backend="ray" when starting vllm, which does significantly hurt efficiency.
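For reference, here is a minimal sketch (not the repo's actual code) of how the executor backend is selected when constructing a vLLM engine in recent vLLM versions; the model name and sampling settings are placeholders:

```python
from vllm import LLM, SamplingParams

# Current workaround: force the Ray executor so vLLM can start even after
# deepspeed has been imported in the same process. Ray adds scheduling
# overhead, which is where much of the slowdown comes from.
llm = LLM(
    model="meta-llama/Llama-2-7b-hf",        # placeholder model
    distributed_executor_backend="ray",      # workaround used today
    tensor_parallel_size=1,
)

# Once the backends are decoupled, the default multiprocessing executor
# can be used instead, which typically starts and runs faster:
# llm = LLM(model="meta-llama/Llama-2-7b-hf", distributed_executor_backend="mp")

outputs = llm.generate(["The capital of France is"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```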
In the next version we will refactor the framework to completely decouple the two backends and restore vllm's full inference speed.
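One possible shape of that decoupling is lazy, per-backend imports, so that loading one backend never pulls in the other's dependencies. The sketch below is illustrative only; the class and function names are hypothetical and do not reflect the project's planned layout:

```python
from abc import ABC, abstractmethod


class BaseInference(ABC):
    """Backend-agnostic interface; no deepspeed or vllm imports at module level."""

    @abstractmethod
    def generate(self, prompts: list[str]) -> list[str]:
        ...


def build_vllm_backend(model: str) -> BaseInference:
    # vllm is imported only here, so deepspeed never enters this code path.
    from vllm import LLM, SamplingParams

    class VLLMInference(BaseInference):
        def __init__(self) -> None:
            self.llm = LLM(model=model)  # default executor, no Ray required

        def generate(self, prompts: list[str]) -> list[str]:
            outputs = self.llm.generate(prompts, SamplingParams(max_tokens=64))
            return [o.outputs[0].text for o in outputs]

    return VLLMInference()


def build_deepspeed_backend(model: str) -> BaseInference:
    # deepspeed is imported only here, keeping it out of the vLLM code path.
    import deepspeed  # noqa: F401

    raise NotImplementedError("DeepSpeed backend omitted from this sketch")
```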