
Support --gpu-layers #45

Closed
lindeer opened this issue Nov 22, 2023 · 7 comments

Comments

@lindeer

lindeer commented Nov 22, 2023

Problem:

build/bin/main -m /app/ecr/models/qwen-7b-ggml/qwen7b-ggml.bin --tiktoken /app/ecr/models/qwen-7b-ggml/qwen.tiktoken -t 6 -i
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce GTX 1660, compute capability 7.5
Welcome to Qwen.cpp! Ask whatever you want. Type 'clear' to clear context. Type 'stop' to exit.

Prompt > 三国演义都有哪些人物?  (Which characters appear in Romance of the Three Kingdoms?)

CUDA error 2 at /home/ecr/projects/qwen.cpp/third_party/ggml/src/ggml-cuda.cu:7310: out of memory
current device: 0

It would be good to support setting a --gpu-layers value the way llama.cpp does, so the model can run on devices with insufficient VRAM.
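For reference, llama.cpp exposes this as -ngl / --n-gpu-layers. A minimal sketch of partial offloading (the model path and layer count here are placeholders, not from this issue):

# offload only 20 layers to the GPU; the remaining layers stay on the CPU,
# so a 6 GB card like the GTX 1660 is not exhausted
./main -m ./models/qwen7b.gguf -ngl 20 -t 6 -i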

@fann1993814

@lindeer
I'm not sure whether this PR will help you, since it has only been tested on Apple Metal.
But maybe you can give it a try? #41

@lindeer
Author

lindeer commented Jan 3, 2024

Same problem: #55

@lindeer
Author

lindeer commented Jan 3, 2024

Resolved by merging into llama.cpp.

@lindeer lindeer closed this as completed Jan 3, 2024
@cl886699

Resolved by merging into llama.cpp.

Hi, how did you do the merge? When I use llama.cpp to run Qwen inference, the output always has problems, and I can't figure out how to use multiple GPUs with this repository.

@lindeer
Author

lindeer commented Jan 26, 2024

@cl886699 What problem are you seeing? You normally only need to pass in a gpu-layers value when VRAM is insufficient; the option only matters when running on a GPU, and you need to compile llama.cpp as a GPU-enabled library.

@cl886699

@cl886699 What problem are you seeing? You normally only need to pass in a gpu-layers value when VRAM is insufficient; the option only matters when running on a GPU, and you need to compile llama.cpp as a GPU-enabled library.

I'm using llama.cpp build 1954 with a model converted by convert-hf-to-gguf.py. Inference fails like this, while the Yi model runs fine.
[screenshot of the inference error]
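For context, the conversion step in question usually looks roughly like this (a sketch; the model directory and output filename are assumptions, not taken from this thread):

# convert a Hugging Face checkpoint to GGUF with llama.cpp's converter
python convert-hf-to-gguf.py /path/to/Qwen-7B --outfile qwen-7b-f16.gguf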

@lindeer
Author

lindeer commented Feb 2, 2024

Just download an already-converted model from HF instead; your pipeline has too many intermediate steps, so there's no way to tell where the problem is. My guess is that you compiled llama.cpp without LLAMA_CUBLAS=on; only with that build option do you get a GPU-capable binary.
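For completeness, a minimal sketch of a GPU-enabled llama.cpp build from that era (assuming the CUDA toolkit is installed; newer llama.cpp versions have since renamed this flag):

# CMake build with cuBLAS support
cmake -B build -DLLAMA_CUBLAS=ON
cmake --build build --config Release

# or, with the Makefile build:
make LLAMA_CUBLAS=1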
