This repository has been archived by the owner on Dec 6, 2024. It is now read-only.
root@cs:/home# ./qwen.cpp/build/bin/main -m qwen72b-ggml.bin --tiktoken qwen-72b-raw/qwen.tiktoken -i
ggml_init_cublas: found 2 CUDA devices:
Device 0: NVIDIA A800 80GB PCIe, compute capability 8.0
Device 1: NVIDIA A800 80GB PCIe, compute capability 8.0
CUDA error 2 at /home/qwen.cpp/third_party/ggml/src/ggml-cuda.cu:7196: out of memory
current device: 0
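The above is the error output. I am running the quantized 72B model, whose model file is under 40 GB. A single 80 GB card was not enough, so I switched to two cards, but the run fails with an out-of-memory error on device 0 before the second card is used at all.
Could anyone offer some pointers?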
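
To help narrow this down, here is a minimal sketch for recording how much memory each GPU already has in use right before launching the binary. It assumes `nvidia-smi` is on the PATH. The helper names `parse_gpu_memory` and `query_gpu_memory` are hypothetical and are not part of qwen.cpp or ggml. It only reports memory usage and does not change how the model is split across devices.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'index, memory.used, memory.total' rows (MiB, no header, no units)."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, used, total = (field.strip() for field in line.split(","))
        rows.append((int(index), int(used), int(total)))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for per-GPU memory usage; assumes nvidia-smi is installed."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    for index, used, total in query_gpu_memory():
        print(f"GPU {index}: {used} MiB used of {total} MiB ({total - used} MiB free)")
```

If device 0 already shows a large amount of memory in use before the model loads, another process may be holding it, which would be worth ruling out first.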