Crafting prompts to get LLaMA models to generate interesting content #156
This is normal behavior. Try adding `-i -r "User"` to stop text generation and let you add your own text whenever it hits the reverse prompt. You probably also need to give the model more context to get the desired order of output; try something along these lines:
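Roughly, the kind of invocation meant here looks like this (the model path, prompt text, and exact reverse-prompt string are placeholders rather than the original example; the flags are the interactive-mode options from the llama.cpp README of that time):

```sh
# Interactive run: generation stops and control returns to the user
# every time the model emits the reverse prompt "User:".
./main -m ./models/7B/ggml-model-q4_0.bin -n 256 \
  -i -r "User:" \
  -p "Transcript of a dialog between a User and an Assistant.
User: Hello, who are you?
Assistant: I am an assistant, how can I help?
User:"
```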
Don't expect miracles from the 7B model. It has a good sense of humor though :)
I have some questions. Is there a way to create a model like 7B and feed it my catalog of books, so that I can ask questions about my books, for example? If yes, do you have any example?
You would need a GPU with tens of gigabytes of VRAM, and you would have to use another fork.
When you say "use another fork", do you mean that llama.cpp only works with Facebook's LLaMA weights and cannot be trained on another dataset?
llama.cpp is made only for inference; it doesn't have training functionality. It wouldn't make sense to do that on a CPU for a model of that size anyway. Meta didn't release LLaMA training code, but, AFAIK, there is at least one alternative implementation of the training code; you should use one of those.
Nothing is strange about that input, or about the model mimicking it and giving you that output. Nothing at all; this is how all language models act. I just answered this in another thread: as posted in that reply, issue #71 makes this much less usable, and until that issue is fixed, chat mode is basically unusable for more than about two questions. Also, please close threads if your issues are resolved.
llama.cpp is only for LLaMA, is written in C++, runs only on CPU, and is only for running the models (inference).
After issue #71 is fixed, you can do that, sure. Write out all the questions, and the answers to them, that exist for your catalog; I suggest you take your time doing that now. The more questions and answers you have, the more exact it'll be (see the sketch below for the kind of prompt file this produces).
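As a rough illustration (the file name and contents are invented here, not taken from the thread), such a prompt file is just a plain list of question/answer pairs ending with an open question, which can later be fed to llama.cpp with `-f catalog_prompt.txt`:

```text
Below are questions and answers about my book catalog.

Q: Which books in the catalog cover medieval history?
A: "A Short History of the Crusades" and "Life in a Castle".

Q: Who wrote "Life in a Castle"?
A: Maria Souza.

Q:
```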
Nice. So, if I understand correctly, LLaMA itself doesn't have all my content indexed for answering questions; instead, it is "trained" to understand my content using that input file (`-f prompts.txt`), so I can "train" it with my data and the 7B weights make it treat my content as sentences to be answered. Is that it?
You can "prime the model" by engineering prompts for it to respond to. This is not training but nudging the model into generating a narrative that is relevant to your problem. One way I prime ChatGPT is by starting with:
The model will then generate a description of what a rocket scientist does. It is primed to "think about rocket science". Then continue the rocket science narrative with the next question:
Now you get a rocket scientist's answer to how she would build a rocket to get to Mars. Note that ChatGPT has been fine-tuned to follow your instructions. LLaMA has not, therefore you need to help it by prompting it with the sort of answers you desire. |
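Written out as a prompt sequence (the wording here is only a sketch inferred from the description above, not the original prompts):

```text
Prompt 1: Describe what a rocket scientist does.
          -> the model replies with a description of rocket science;
             it is now "thinking about rocket science"

Prompt 2: How would she build a rocket to get to Mars?
          -> the model now answers in the voice of a rocket scientist
```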
Yeah, it's a language model that just predicts what comes next: https://en.wikipedia.org/wiki/Language_model
Nice. For a more realistic scenario: if I want to feed all of the Bible text into LLaMA, how does that .txt file need to be created so that it "understands" that context and I can ask questions about it? Since we can't train LLaMA, but can do the reverse process of feeding data into it, how can we put the full King James Bible version into it? And if I make a mobile app, for each question I would need to load the .txt with the "questions" to be fed into LLaMA, correct? And then a field for the user to type their question and capture the answer.
Be aware: it isn't going to search the Bible, it is instead going to generate potentially fictional Bible content. You need to carefully consider the ethical and religious consequences of such an app. With that caveat, prompt it with something like a definition of the King James Bible, then a definition of the Old Testament. This might get it primed to narrating/generating in the style of the King James Version and in the context of the Old Testament. Then ask it what the prophet Xanomander said about God, and you might get some pseudo-prophet output about God from the imaginary prophet Xanomander (see the sketch below). I would prototype this with ChatGPT: once you have a useful prompt which defines the King James Bible and the Old Testament, try ChatGPT's definitions as the prompt for LLaMA. In terms of integration, you'd hardcode your initial engineered prompt (i.e. ChatGPT's definitions) and then append the questions you want answered about the Bible.
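A minimal sketch of that prompt sequence (the wording is invented for illustration; the original prompts were fuller definitions):

```text
Prompt 1: The King James Bible is an English translation of the Christian
          Bible, first published in 1611.

Prompt 2: The Old Testament is the first part of the Bible, containing the
          books from Genesis to Malachi.

Prompt 3: And the prophet Xanomander spoke of God, saying:
```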
Hi, I understand, and as I said, it is an experiment. It won't "search the Bible" today, but can I feed the Bible content into it so that it learns about the Bible verses and I can ask questions about those verses? One real example: I put some Bible content into a file, bible.txt, and ran the model with that file as the prompt.
As you can see from the output, this works perfectly.
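For reference, a run like the one described above would look roughly like this (the file contents, model path, and flags are illustrative, not the exact ones from the comment):

```sh
# bible.txt might contain a few verses followed by an open question, e.g.:
#   In the beginning God created the heaven and the earth. [...]
#   Question: What did God create in the beginning?
#   Answer:
./main -m ./models/7B/ggml-model-q4_0.bin -f bible.txt -n 128
```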
What I need now is to understand how I can "train" on the "bible.txt" so that I can load it already trained, instead of this reverse form.
You can just directly ask it questions about specific chapters of the Bible. You can assume it knows the Bible, as it has read (i.e. been trained on) Gutenberg. Whether the answers are useful is another matter; see the sketch below for the kind of direct question meant here.
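For example (the prompt wording, model path, and token count are placeholders):

```sh
./main -m ./models/7B/ggml-model-q4_0.bin -n 128 \
  -p "Summarize the first chapter of Genesis in the King James Bible:"
```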
The Bible is only an example, man; I want to understand the underlying behaviour with ordinary data, since people may want to feed in their own content. I have some questions. First: how can I "train" on the "bible.txt" (or any other content) so that I can load it already trained, instead of this reverse form? Second: how can I put the prompt directly into the execution command instead of using interactive mode? Before, I was running it interactively; after, I want the prompt passed on the command line (see the sketch below).
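A rough sketch of the two invocations being contrasted (paths and prompt text are illustrative):

```sh
# Before: interactive mode, the prompt is typed in at runtime
./main -m ./models/7B/ggml-model-q4_0.bin -i

# After: the prompt is supplied directly on the command line, no interaction
./main -m ./models/7B/ggml-model-q4_0.bin -n 128 \
  -p "In the beginning God created the heaven and the earth. Question: What did God create in the beginning? Answer:"

# (or keep the prompt in a file and pass it with -f bible.txt)
```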
The prompt is the only content you can provide; the rest is up to the knowledge already stored in the model (e.g. Bibles or rockets). Either your user has to provide the prompt, or, if you want to prime the model to discuss a specific topic, you need to use the priming approach described above (for example a prompt file passed with `-f`). I think you need to read more about how pre-trained LLMs work. Have you used ChatGPT? Also, the command-line interface you are using is not suited to direct integration into an app; maybe wait until some proper API or library bindings exist.
Closing this, as the questions aren't really specific to llama.cpp.
I'd say it's worse than the phone. It should be clarified that this specific model is geared towards generating (a continuation of) content, not towards chat or adventure mode like, say, KoboldAI's OPT/GPT/Neo/FSD models (although there are efforts to run LLaMA there, and someone has written a transformer already, but I think people will be disappointed once they get to try it). It should also be clarified that many input characters need to be escaped if you don't want the model to just quit in the middle of interactive mode.
Hi,
I'm getting strange behaviour in the output: instead of stopping after a single answer, the model keeps generating further questions and answers on its own.
How can I get only one answer at a time?
Is there a more precise model than 7B?
Is there Portuguese (Brazil) language support for questions and answers?