
Command R+ outputs gibberish when used with text-generation-webui #6596

Closed
christiandaley opened this issue Apr 10, 2024 · 10 comments

@christiandaley

Not sure if this is the appropriate place to file this issue, but I don't have much idea what's going wrong and the exllamav2 quants of Command R+ work fine with text-generation-webui after updating to the latest exllamav2.

Manually installing and building llama-cpp-python with the latest llama.cpp allows text-generation-webui to load gguf quants of Command R+, but the output is always gibberish regardless of what sampler settings are used. From my understanding llama-cpp-python is just bindings for llama.cpp so I'm not sure if the issue could be there. The log output from text-generation-webui looks totally normal when loading and running inference.

I've tried the q5_k_s quant from here: https://huggingface.co/dranger003/c4ai-command-r-plus-iMat.GGUF/tree/main

As well as the q5_k_m quant from here: https://huggingface.co/mradermacher/c4ai-command-r-plus-GGUF/tree/main.

Both behave in the same way. The output seems somewhat related to the prompt and sometimes contains coherent sentences, but there is clearly something very wrong. Even when turning the temperature way down and raising min_p the output is for the most part nonsense. I can use a wide range of sampler settings with exllamav2 and get good results.

@henk717

henk717 commented Apr 10, 2024

KoboldCpp has been running it fine so far, but in our case the default settings have shown the model is very sensitive to repetition penalty, so make sure that is low enough.

Considering our Command-R implementation is inherited from llama.cpp, I'd assume this is not an issue related to llama.cpp.

@christiandaley
Author

> KoboldCpp has been running it fine so far, but in our case the default settings have shown the model is very sensitive to repetition penalty, so make sure that is low enough.
>
> Considering our Command-R implementation is inherited from llama.cpp, I'd assume this is not an issue related to llama.cpp.

I've turned repetition penalty off, so that's not the issue. I can make an issue on text-generation-webui instead if that would be a better place.
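(Editor's note: in llama.cpp-based stacks the repetition penalty is multiplicative, so 1.0, not 0, is the neutral "off" value. A minimal sketch of penalty-free, near-greedy sampler settings for isolating this kind of bug; the parameter names follow llama-cpp-python's `create_completion` and may need adjusting for other frontends.)

```python
# Neutral sampler settings for isolating sampler-related bugs in
# llama.cpp-based backends (names follow llama-cpp-python's
# create_completion; adjust for other frontends).
def neutral_sampler_settings(temperature: float = 0.1) -> dict:
    """Settings that disable every filter/penalty and keep decoding near-greedy.

    repeat_penalty=1.0 is the multiplicative identity, i.e. "off";
    setting it to 0 would be a strong (and invalid) penalty, not a no-op.
    """
    return {
        "temperature": temperature,  # low temperature -> near-greedy decoding
        "top_k": 0,                  # <= 0 disables top-k filtering
        "top_p": 1.0,                # 1.0 disables nucleus sampling
        "min_p": 0.0,                # 0.0 disables min-p filtering
        "repeat_penalty": 1.0,       # 1.0 = repetition penalty off
    }
```

If the model still produces gibberish under these settings, sampling can be ruled out as the cause.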

@schmorp

schmorp commented Apr 11, 2024

As an example, with IQ3_M, default settings (i.e. no repetition penalty) and a prompt of "Hi,", I get this output:

Hi, I’ in the 10th grade, and I was just wondering, is it a good idea to take 3 APs(AP US, AP Calc, AP Physics) as a 10th grader, or should I just take 2, and save 1 for 11th, so I can take 2 in 11th, and 2 in 12th?

I am currently a 10th-11th grader, and I would say that it depends on your 1) schedule, 2) ability to manage time, 3) (your) school's/s's) course(s) and 4) how many you plan to take in 11th and 12th.

(1) If you have a busy schedule, and you are not sure whether you will have time to study, I would not recommend it. (2) If you are not able to manage your time, you will be in a mess. (3) If the 3 APs are in your 2-3 "best" subjects, then you should be good. (4) If you plan to take 1-2, 1-2, , 1-2, or -1, 2-2, 2-1, 2-2, -1, 2-2, 2-3, , 2-1, 3-2, 3-3, , 2-2, 3-3, 3-4, 3-4, , 2-3, 3-4, 3-5, 5-5, 5-5, 4-4, 5-4, 5-5, 5-5, 4-4, 4-4, 3-3, 3-2, 3-1, 3-2, 3-3, 3-4, 3-4, 3-3, 4-3, 4-3, 4-3, 4-3, 4-3, 3-3, 3-3, -3, 3-3, -3, 3-3, 3-3, -3, 3-3, 3-3, 3-3, 3-3, 3-3, 3-3, 3-3, 3-3, 3-3, 3 all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all 
all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all all 28 5 0 0 0 1 2 4 2 円 1 3 5 0 0 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0 renditrenditenditenditenditenditenditenditenditenditenditenditenditenditenditenditenditenditenditenditenditenditenditenditendit 4 0 0 renditrenditenditenditenditenditenditenditenditendit 4 0 0 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 renditrenditendit 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 1 rendit 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 1 1 1 2 4 0 0 1 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0 0 1 1 1 1 2 4 0

@Jeximo
Contributor

Jeximo commented Apr 11, 2024

> I've turned repetition penalty off, so that's not the issue. I can make an issue on text-generation-webui instead if that would be a better place.

I think the best way to figure out whether llama.cpp is the cause of the issue is to run Cmd R+ with the llama.cpp server (without the Python bindings or text-generation-webui) and check the output. Here's an example, see this comment: #6551 (comment)

It's probably related to the Cmd R+ prompt template rather than sampling. If you're using text-generation-webui, then maybe make an issue there.
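(Editor's note: the suggestion above — hitting the llama.cpp server directly to rule out the Python bindings — can be sketched with only the standard library. The model path in the comment is illustrative, and it assumes a server started locally with something like `./server -m c4ai-command-r-plus.gguf -c 4096` listening on port 8080; `/completion` is the server's plain completion endpoint.)

```python
# Query the llama.cpp server directly, bypassing llama-cpp-python and
# text-generation-webui, to see whether the gibberish reproduces there.
import json
import urllib.request

def build_completion_request(prompt: str, n_predict: int = 128) -> dict:
    """Payload for the server's /completion endpoint, penalties disabled."""
    return {
        "prompt": prompt,
        "n_predict": n_predict,      # max tokens to generate
        "temperature": 0.1,          # near-greedy, to rule out sampling noise
        "repeat_penalty": 1.0,       # 1.0 = repetition penalty off
    }

def complete(prompt: str, url: str = "http://127.0.0.1:8080/completion") -> str:
    """POST the prompt and return the generated text. Call e.g. complete("Hi,")."""
    payload = json.dumps(build_completion_request(prompt)).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]
```

If the raw server output is coherent but the webui output is not, the bug is in the bindings or the frontend rather than in llama.cpp itself.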

@christiandaley
Author

> As an example, with IQ3_M, default settings (i.e. no repetition penalty) and a prompt of "Hi,", I get this output: […]

This is very similar to what I've been seeing: it starts off with a semi-coherent sentence and then just devolves into rambling. The exl2 quants don't do this; they remain coherent.

@FNsi
Contributor

FNsi commented Apr 12, 2024

😂 That reminds me of how MPT-7B inference worked in the very beginning; it turned out there was something wrong.

In the MPT case it was a wrong ne[] calculation.

I didn't look inside, but I'd guess it's almost the same here.

@satyaloka93

> I think the best way to figure out if llama.cpp is the cause of the issue is to run Cmd R+ with server (without python bindings, or text-generation-webui), and check the output. […]

Can you run the server using OpenAI calls to the endpoint? I'm trying that, but I'm not sure how to implement the template; it's clear that sending messages through the traditional 'role': 'user' format doesn't work, and the model fails to follow instructions.
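(Editor's note: until the OpenAI-style chat route handles this model, one workaround is to format Command R+'s raw prompt template by hand and send it to the plain `/completion` endpoint. A sketch follows; the special tokens are taken from the model's tokenizer config, but double-check them against the model card before relying on this.)

```python
# Format an OpenAI-style message list into Command R+'s raw prompt template,
# suitable for the llama.cpp server's plain /completion endpoint.
def format_command_r_prompt(messages: list[dict]) -> str:
    """messages: [{"role": "user"|"assistant"|"system", "content": str}, ...]"""
    role_tokens = {
        "system": "<|SYSTEM_TOKEN|>",
        "user": "<|USER_TOKEN|>",
        "assistant": "<|CHATBOT_TOKEN|>",
    }
    parts = ["<BOS_TOKEN>"]
    for m in messages:
        parts.append(
            "<|START_OF_TURN_TOKEN|>"
            + role_tokens[m["role"]]
            + m["content"]
            + "<|END_OF_TURN_TOKEN|>"
        )
    # Open the assistant turn so the model generates the reply.
    parts.append("<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>")
    return "".join(parts)
```

Note that the server (or frontend) must not prepend a second BOS token on top of this, or the template will be malformed.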

@Jeximo
Contributor

Jeximo commented Apr 13, 2024

> Can you run the server using OpenAI calls to the endpoint?

Sure, once it's implemented: #6650

It needs to go through the test process to eliminate errors.

@satyaloka93
Copy link

> Sure, once it's implemented: #6650
>
> It needs to go through the test process to eliminate errors.

Thanks, I found that right after I commented. I pulled and built that PR; at first I thought it was broken, but then I discovered I couldn't run Command R (IQ4_XS) on my 4090 with my normal 8k context. It works fine at 2k.

@github-actions github-actions bot added the stale label May 18, 2024

github-actions bot commented Jun 2, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as completed Jun 2, 2024