[Performance]: Deploying a LoRA model with the 0.9.0rc2 image on 910B cards is very slow #1686

@xuhengzzzy

Description

Proposal to improve performance

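Deploying a LoRA model with the 0.9.0rc2 image is very slow: only about 5 tokens/s, whereas deploying just the base model reaches 30 tokens/s.

Test speed with only the base model deployed:

Image

Test speed after deploying the LoRA:

Image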
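For anyone trying to reproduce the comparison, here is a minimal sketch of how a tokens/s figure and the resulting slowdown could be computed. It is not the script used for the screenshots above: `generate` is a hypothetical zero-argument callable that performs one request and returns the generated tokens, and the 30 and 5 tokens/s values are the approximate numbers reported in this issue.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(generate: Callable[[], Sequence], runs: int = 3) -> float:
    """Average throughput of `generate` over `runs` calls.

    `generate` is a hypothetical callable (an assumption, not part of any
    library): it runs one request and returns the generated tokens.
    """
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(runs):
        total_tokens += len(generate())
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed


def slowdown(base_tps: float, lora_tps: float) -> float:
    """How many times slower the LoRA deployment is than the base model."""
    return base_tps / lora_tps


# Approximate figures reported in this issue: 30 tokens/s base, 5 tokens/s with LoRA.
print(f"LoRA slowdown: {slowdown(30.0, 5.0):.1f}x")
```

Running the same `generate` wrapper against the base-model deployment and the LoRA deployment, with identical prompts and output lengths, should make the two numbers directly comparable.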

Report of performance regression

No response

Misc discussion on performance

No response

Your current environment (if you think it is necessary)

The output of `python collect_env.py`
