[Bad Case]: MiniCPM3-4B performance under sglang is not ideal #266

wyy007 · 2024-11-27T09:03:16Z

Description

Test environment:
OS: Ubuntu 22.04.4 LTS, kernel 5.15.0-122-generic
Driver: ROCm 6.1.2
CPU: Intel(R) Xeon(R) w5-3435X, 16 cores
Memory: 256 GB, 4800 MT/s
GPU: AMD 7900XTX (full-spec card), FP16: 123 TFLOPS, VRAM: 24 GB, memory bandwidth: 964 GB/s

Case Explanation

Three models were tested: MiniCPM3-4B, MiniCPM3-4B-GPTQ-Int4, and Qwen2.5-7B-Instruct-GPTQ-Int8. The launch commands are as follows:
MiniCPM3-4B: HIP_VISIBLE_DEVICES=1 python3 -m sglang.launch_server --model-path /root/.cache/modelscope/MiniCPM3-4B --port 30000 --mem-fraction-static 0.8 --kv-cache-dtype auto --attention-backend triton --sampling-backend pytorch --trust-remote-code --schedule-conservativeness 0.1 --disable-cuda-graph --enable-torch-compile

MiniCPM3-4B-GPTQ-Int4: HIP_VISIBLE_DEVICES=1 python3 -m sglang.launch_server --model-path /root/.cache/modelscope/MiniCPM3-4B-GPTQ-Int4 --port 30000 --mem-fraction-static 0.8 --kv-cache-dtype auto --attention-backend triton --sampling-backend pytorch --trust-remote-code --schedule-conservativeness 0.1 --disable-cuda-graph --enable-torch-compile --quantization gptq --disable-mla

Qwen2.5-7B-Instruct-GPTQ-Int8: HIP_VISIBLE_DEVICES=1 python3 -m sglang.launch_server --model-path /root/.cache/modelscope/Qwen2.5-7B-Instruct-GPTQ-Int8 --port 30000 --mem-fraction-static 0.8 --kv-cache-dtype auto --attention-backend triton --sampling-backend pytorch --trust-remote-code --schedule-conservativeness 0.1 --disable-cuda-graph --enable-torch-compile --quantization gptq --disable-mla
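
For easier comparison, the three launch commands can be rebuilt from one shared flag list so that the per-model differences stand out. This is only an illustrative sketch: every flag and path is copied from the commands in this post, while the helper names (`COMMON_FLAGS`, `GPTQ_FLAGS`, `launch_command`, `RUNS`) are made up for the example and are not part of sglang.

```python
import shlex

# Flags shared by all three launch commands in the post.
COMMON_FLAGS = [
    "--port", "30000",
    "--mem-fraction-static", "0.8",
    "--kv-cache-dtype", "auto",
    "--attention-backend", "triton",
    "--sampling-backend", "pytorch",
    "--trust-remote-code",
    "--schedule-conservativeness", "0.1",
    "--disable-cuda-graph",
    "--enable-torch-compile",
]

# Extra flags that only the two GPTQ runs pass in the post.
GPTQ_FLAGS = ["--quantization", "gptq", "--disable-mla"]

# Model directory used in the post.
MODEL_ROOT = "/root/.cache/modelscope"

# Model name -> extra flags (hypothetical table, mirrors the post).
RUNS = {
    "MiniCPM3-4B": [],
    "MiniCPM3-4B-GPTQ-Int4": GPTQ_FLAGS,
    "Qwen2.5-7B-Instruct-GPTQ-Int8": GPTQ_FLAGS,
}


def launch_command(model_name, extra_flags=()):
    """Return the shell command line for one model, as a single string."""
    argv = [
        "python3", "-m", "sglang.launch_server",
        "--model-path", f"{MODEL_ROOT}/{model_name}",
        *COMMON_FLAGS,
        *extra_flags,
    ]
    return "HIP_VISIBLE_DEVICES=1 " + shlex.join(argv)


commands = {name: launch_command(name, extra) for name, extra in RUNS.items()}

if __name__ == "__main__":
    for name, cmd in commands.items():
        print(f"{name}:\n  {cmd}\n")
```

Laid out this way, the only difference between the runs besides the model path is that the two GPTQ models add `--quantization gptq --disable-mla`, while the unquantized MiniCPM3-4B run passes neither flag.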

Judging from the performance results, MiniCPM3-4B performs worse than Qwen2.5-7B-Instruct-GPTQ-Int8. Could you help figure out where the problem is?
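
As a rough sanity check on what to expect, single-stream decoding is often limited by how fast the weights can be read from VRAM, so a crude ceiling is memory bandwidth divided by weight bytes. The sketch below is a back-of-the-envelope estimate, not a measurement. It assumes the 964 figure from the environment description means GB/s, uses approximate parameter counts (about 4e9 for MiniCPM3-4B and about 7.6e9 for Qwen2.5-7B, both assumptions), and ignores KV-cache reads, quantization scales, and kernel efficiency.

```python
# Assumed: the "964G" in the environment description is 964 GB/s.
BANDWIDTH_BYTES_PER_S = 964e9

# (approximate parameter count, bytes per weight): all values are assumptions.
MODELS = {
    "MiniCPM3-4B (FP16)": (4.0e9, 2.0),
    "MiniCPM3-4B-GPTQ-Int4": (4.0e9, 0.5),
    "Qwen2.5-7B-Instruct-GPTQ-Int8": (7.6e9, 1.0),
}


def decode_ceiling_tokens_per_s(params, bytes_per_param,
                                bandwidth=BANDWIDTH_BYTES_PER_S):
    """Upper bound on batch-1 decode speed if every weight is read once per token."""
    weight_bytes = params * bytes_per_param
    return bandwidth / weight_bytes


if __name__ == "__main__":
    for name, (params, bytes_per_param) in MODELS.items():
        ceiling = decode_ceiling_tokens_per_s(params, bytes_per_param)
        print(f"{name}: <= {ceiling:.0f} tokens/s (bandwidth-bound estimate)")
```

Under these assumptions the FP16 4B model and the Int8 7B model read a similar number of weight bytes per token, so similar batch-1 decode ceilings would not be surprising on their own. A large gap in measured results would point elsewhere, for example at the attention path or kernel support on ROCm.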

wyy007 added the badcase Bad cases label Nov 27, 2024

wyy007 changed the title ~~[Bad Case]: MiniCPM3-4B performance under sglang is not~~ [Bad Case]: MiniCPM3-4B performance under sglang is not ideal Nov 27, 2024
