
Baichuan2-7B-Chat model converted to ggml-model-q4_0.gguf, AI answer does not stop automatically when inference is made #5034

Closed
Lyzin opened this issue Jan 19, 2024 · 4 comments

Lyzin commented Jan 19, 2024

I converted the Baichuan2-7B-Chat model to ggml-model-q4_0.gguf and then ran inference with ./main, and found that the answer would not stop.

My system: macOS
Python version: 3.9.10

Here is the llama.cpp version I am using.

./main version:
(screenshot)

llama.cpp git commit id:
(screenshot)

Here are the steps I took to convert the model:

python convert-hf-to-gguf.py ./models/Baichuan2-7B-Chat

./quantize ./models/Baichuan2-7B-Chat/ggml-model-f16.gguf ./models/Baichuan2-7B-Chat/ggml-model-q4_0.gguf q4_0
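A common cause of this symptom (an assumption on my part, not verified for this particular conversion; the linked issue #3969 discusses the same pattern) is wrong or missing EOS token metadata in the converted GGUF: if the id that llama.cpp checks for never matches what the model actually emits, generation only stops at the -n token cap. A toy sketch of that failure mode, with a hypothetical `generate` loop standing in for llama.cpp's sampling loop:

```python
# Toy illustration (NOT llama.cpp code): why a wrong EOS id in the
# converted GGUF makes generation run until the token limit.
def generate(sample_next, eos_id, max_tokens):
    """Sample tokens until EOS is produced or max_tokens is reached."""
    out = []
    for _ in range(max_tokens):
        tok = sample_next()
        if tok == eos_id:  # stops here when the EOS metadata is correct
            break
        out.append(tok)
    return out

# Pretend the model emits token 2 ("</s>") after three words.
stream = iter([101, 102, 103, 2, 104, 105])
assert generate(lambda: next(stream), eos_id=2, max_tokens=256) == [101, 102, 103]

# With a wrong EOS id (e.g. bad tokenizer metadata from conversion),
# nothing ever matches, so generation only stops at the max_tokens cap.
stream = iter([101, 102, 103, 2, 104, 105])
assert len(generate(lambda: next(stream), eos_id=999, max_tokens=6)) == 6
```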

Here is how I ran inference with ./main:

./main -m ./models/Baichuan2-7B-Chat/ggml-model-q4_0.gguf -n 256 --repeat_penalty 1.0 -ngl 0 --color -i -r "User:" -f prompts/chat-with-baichuan.txt
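Note that `-r "User:"` is a reverse prompt (antiprompt): in interactive mode, llama.cpp hands control back to the user once the decoded output ends with that string, independently of EOS. A simplified sketch of that check (the helper name `hits_antiprompt` is mine, not llama.cpp's):

```python
# Toy sketch of the reverse-prompt stop: generation pauses as soon as
# the tail of the decoded output matches any configured antiprompt.
def hits_antiprompt(generated_text: str, antiprompts: list[str]) -> bool:
    """Return True when the output currently ends with a reverse prompt."""
    return any(generated_text.endswith(ap) for ap in antiprompts)

assert hits_antiprompt("Bob: Sure, here is the answer.\nUser:", ["User:"])
assert not hits_antiprompt("Bob: Sure, here is the answer.", ["User:"])
```

So if the model never emits EOS and also never produces the literal string "User:", neither stop condition fires and the output runs on.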

Here is the result of the inference; it keeps repeating the output:
(screenshot)

Is this a problem caused by the conversion?

Lyzin changed the title (capitalized "Baichuan2") on Jan 19, 2024
hiepxanh commented:

@Lyzin I believe it is the same issue as #3969.
Can you download the Phi-2 model and confirm the bug still happens?

Lyzin commented Jan 25, 2024

> @Lyzin I believe it is the same issue as #3969. Can you download the Phi-2 model and confirm the bug still happens?

I downloaded the Phi-2 model, requantized it, and the output stopped automatically.
llama.cpp version tag: b1966

./main -m ./models/phi-2/ggml-model-q4_0.gguf -n 512 --color -i -cml -ngl 0 -r "User:" -f prompts/chat-with-bob.txt

Here is the output of the AI:
(screenshot)

github-actions bot commented Mar 18, 2024

This issue is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label Mar 18, 2024

github-actions bot commented Apr 3, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as completed Apr 3, 2024