Hardware requirements for deploying CodeShell-7B-Chat? #56

toohandsome · 2023-11-21T02:23:09Z

I want to set up CodeShell-7B-Chat inside my company for roughly 200 to 300 users. How much RAM and what kind of GPU would I need?

ironmanlj · 2023-11-21T07:22:44Z

For what it's worth, I deployed a CodeShell-7B-Chat instance on a GPU, a V100. It used 18-19 GB of VRAM and barely touched the CPU.

qianma819 · 2023-11-23T01:47:27Z

I deployed with TGI on a 4070, since I can't afford a better card. Using the parameters from the documentation:
docker run --gpus 'all' --shm-size 1g -p 9090:80 -v $HOME/models:/data \
  --env LOG_LEVEL="info,text_generation_router=debug" \
  ghcr.nju.edu.cn/huggingface/text-generation-inference:1.0.3 \
  --model-id /data/CodeShell-7B-Chat --num-shard 1 \
  --max-total-tokens 5000 --max-input-length 4096 \
  --max-stop-sequences 12 --trust-remote-code
it fails with: ERROR shard-manager:text_generation_launcher:Shard complete standard error output:
1. Is the 4070's 12 GB of VRAM not enough?
2. How do I configure the host here? Do I need extra parameters? For the CodeShell-7B-Chat-int4 model, the command ./server -m ./models/codeshell-chat-q4_0.gguf --host 127.0.0.1 --port 8080 makes the configuration obvious at a glance.

ironmanlj · 2023-11-23T05:33:43Z

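1. 12 GB of VRAM is not enough to run a 6B model. From what I tried, you need at least 16-18 GB.
2. Because it is deployed with Docker, the -p parameter is the port mapping: it maps port 80 in the container to port 9090 on the server. As for why the port inside the container is 80, that is probably just the default.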

qianma819 · 2023-11-23T06:09:21Z

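1. The error message does indicate that GPU memory is insufficient. Apparently max_split_size_mb can be changed, but I couldn't find where it is set. For a CUDA out-of-memory error, reducing the batch size should also help; do you know where the batch size is configured?
2. With the Docker deployment, I connect through the VS Code plugin, so I need to configure the server's IP. Can that be set through a docker parameter?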
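
A note on where max_split_size_mb lives: in PyTorch it is an option of the CUDA caching allocator, passed through the PYTORCH_CUDA_ALLOC_CONF environment variable. The sketch below only shows the format of that value. The number 128 and the helper name cuda_alloc_conf are illustrative assumptions, not values recommended anywhere in this thread. The setting can reduce fragmentation, but it does not make a model fit that is simply larger than the available VRAM.

```python
import os


def cuda_alloc_conf(max_split_size_mb: int) -> str:
    """Format the value expected by PYTORCH_CUDA_ALLOC_CONF (hypothetical helper)."""
    return f"max_split_size_mb:{max_split_size_mb}"


# Assumption: 128 is an arbitrary example value.
# The variable must be set before the first CUDA allocation,
# so the safest place is before importing torch.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = cuda_alloc_conf(128)
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

For a container, the same variable would presumably be passed with docker run's --env option, in the same way LOG_LEVEL is passed in the command above.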

ironmanlj · 2023-11-24T05:53:19Z

I'm not sure about the batch size. As for the VS Code plugin, it works with the Docker deployment too: in the plugin settings, change the IP to that of the machine hosting the model and set the port to your mapped port (9090 in the command above), and it will be able to connect.
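
To check the mapped port without the plugin, a request can be sent to the server directly. This is a minimal sketch, assuming the container runs TGI and exposes its /generate endpoint on host port 9090 as in the command above. The address 192.0.2.10 is a placeholder documentation IP and build_generate_request is a hypothetical helper name.

```python
import json
import urllib.request


def build_generate_request(host: str, port: int, prompt: str,
                           max_new_tokens: int = 32) -> urllib.request.Request:
    """Build a POST request for TGI's /generate endpoint on the mapped host port."""
    url = f"http://{host}:{port}/generate"
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder server IP; 9090 is the host side of "-p 9090:80".
req = build_generate_request("192.0.2.10", 9090, "def add(a, b):")
print(req.full_url)

# Sending it requires a running server, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["generated_text"])
```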

wxfvf · 2023-11-28T13:01:31Z

I deployed with Docker and it crashed outright with 24 GB. How can a 6B model use this much memory?

jump2 · 2023-12-01T06:05:45Z

My Docker deployment with 8 GB of VRAM and 16 GB of RAM won't start:
2023-12-01T06:01:48.153557Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2023-12-01T06:01:58.169072Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2023-12-01T06:02:09.732462Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output:

The argument trust_remote_code is to be used with Auto classes. It has no effect here and is ignored.
Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s] rank=0
2023-12-01T06:02:09.732513Z ERROR shard-manager: text_generation_launcher: Shard process was signaled to shutdown with signal 9 rank=0
2023-12-01T06:02:09.832288Z ERROR text_generation_launcher: Shard 0 failed to start
2023-12-01T06:02:09.832412Z INFO text_generation_launcher: Shutting down shards
It fails like this every time. For those of you who got it running, how much VRAM and RAM do you have?

MeJerry215 · 2023-12-12T02:14:08Z

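@wxfvf Check where the model is loaded to see whether it is loaded as float32 or float16. For a 6B model, fp16 takes 2 bytes per parameter, so you need at least 12 GB just to load it; fp32 takes 4 bytes per parameter, so at least 24 GB, which is why it ran out of memory right away. This model seems to default to fp32, unbelievably.

At the load call I set torch_dtype=torch.float16 and memory usage dropped. Otherwise, maybe your requests have too many tokens and the KV cache is taking up too much memory.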
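
The arithmetic above can be written out as a small sketch. It counts only the weights, uses the 6B parameter count quoted in this thread, and treats 1 GB as 10^9 bytes. The function name weight_memory_gb is made up for illustration. Activations, the KV cache, and framework overhead come on top of these numbers.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Memory for the weights alone, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("float32", "float16"):
    print(dtype, weight_memory_gb(6e9, dtype), "GB")
```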

wxfvf · 2023-12-12T02:43:24Z

At first, the official parameters from the VS Code plugin docs would not run at all. Adding --dtype bfloat16 still crashed. I then also reduced the token lengths with --max-total-tokens 4098 --max-input-length 2048, and it finally started. VRAM usage is around 18-19 GB.
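
Why the token limits matter can be sketched with the usual KV cache size formula: two tensors (key and value) per layer, each holding one vector per KV head per token. The layer count, head count, and head dimension below are a generic 7B-class shape chosen for illustration. They are assumptions, not CodeShell's actual configuration, and kv_cache_bytes is a hypothetical helper.

```python
def kv_cache_bytes(tokens: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2) -> int:
    """KV cache size for one sequence: key and value tensors for every layer."""
    return 2 * n_layers * n_kv_heads * head_dim * tokens * bytes_per_value


# Assumed shape (NOT read from the CodeShell config): 32 layers, 32 KV heads,
# head dimension 128, 16-bit values.
for tokens in (2048, 4096):
    gib = kv_cache_bytes(tokens, n_layers=32, n_kv_heads=32, head_dim=128) / 2**30
    print(tokens, "tokens:", gib, "GiB per sequence")
```

Under these assumed numbers the cache grows linearly with the token count, so halving the maximum length halves the per-sequence cache. A serving stack that batches several requests multiplies this again.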

zpjmj · 2024-03-09T13:26:36Z

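@wxfvf @MeJerry215
How do I run this on multiple GPUs? With a single GPU it worked using your parameters. With the official --gpus 'all' parameter and 2 GPUs, it runs out of VRAM immediately.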
