fix log info #15

Merged · 6 commits · Dec 20, 2023

2 changes: 1 addition & 1 deletion lmdeploy/infer_compare.py

```diff
@@ -74,6 +74,6 @@ def gen_transformers(query: str, tokenizer, model) -> str:
     cost = (end - start)
     throughput = round(count / cost)
 
-    print(f"{engine} cost {cost:.2f}s {throughput} tokens/s")
+    print(f"{engine} 耗时 {cost:.2f} {throughput} 字/秒")
```
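The change swaps the English log line for a Chinese one ("耗时" = "took", "字/秒" = "characters per second"). For context, `throughput` here is token count divided by wall-clock seconds. A minimal sketch of that measurement pattern, where the `generate` callable and the token counting are hypothetical stand-ins rather than the script's actual helpers:

```python
import time

def timed_generate(generate, tokenizer, query: str, engine: str) -> str:
    """Time one generation call and log its throughput (hypothetical sketch)."""
    start = time.time()
    output = generate(query)               # stand-in for the real inference call
    end = time.time()

    count = len(tokenizer.encode(output))  # tokens produced
    cost = end - start                     # elapsed seconds
    throughput = round(count / cost)       # tokens per second

    print(f"{engine} cost {cost:.2f}s {throughput} tokens/s")
    return output
```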


10 changes: 8 additions & 2 deletions lmdeploy/lmdeploy.md

````diff
@@ -146,14 +146,20 @@ lmdeploy supports loading Huggingface model weights directly; currently three kinds of
 An example:
 
 ```bash
+# requires a network environment that can reach Huggingface
 lmdeploy chat turbomind internlm/internlm-chat-20b-4bit --model-name internlm-chat-20b
 
 lmdeploy chat turbomind Qwen/Qwen-7B-Chat --model-name qwen-7b
 ```
 
 The two commands above show how to load a Huggingface model directly: the first loads a version quantized with lmdeploy, and the second loads an ordinary LLM model.
 
-The command above starts a local chat interface where you can converse with the LLM through Bash.
+We can also launch a local Huggingface model directly, as shown below.
+
+```bash
+lmdeploy chat turbomind /share/temp/model_repos/internlm-chat-7b/ --model-name internlm-chat-7b
+```
+
+Each of the commands above starts a local chat interface where you can converse with the LLM through Bash.
 
 #### 2.1.2 Offline conversion
````
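Beyond the interactive Bash chat, the same model can be driven programmatically. A minimal sketch, assuming a recent lmdeploy release that ships the high-level `pipeline` API (this API is not part of the docs changed in this PR):

```python
from lmdeploy import pipeline  # high-level inference API in recent lmdeploy releases

# Point at the same local Huggingface-format checkout used in the docs above.
pipe = pipeline("/share/temp/model_repos/internlm-chat-7b/")

# A list of prompts is batched; each result carries the generated text.
responses = pipe(["Hello! Please introduce yourself."])
print(responses[0].text)
```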