Note the use of `--color` to distinguish between user input and generated text. Other parameters are explained in more detail in the [README](examples/main/README.md) for the `main` example program.
`examples/main/README.md` (+12, -2)
@@ -21,12 +21,20 @@ To get started right away, run the following command, making sure to use the correct path for the model you have:
```bash
./main -m models/7B/ggml-model.bin --prompt "Once upon a time"
```
The following command generates "infinite" text from a starting prompt (you can use `Ctrl-C` to stop it):

```bash
./main -m models/7B/ggml-model.bin --ignore-eos --n_predict -1 --keep -1 --prompt "Once upon a time"
```
For an interactive experience, try this command:
```bash
./main -m models/7B/ggml-model.bin -n -1 --color -r "User:" --in-prefix "" --prompt $'User: Hi\nAI: Hello. I am an AI chatbot. Would you like to talk?\nUser: Sure!\nAI: What would you like to talk about?\nUser:'
```

Note that the newline characters in the prompt string above only work on Linux. On Windows, you will have to use the `--file` option (see below) to load a multi-line prompt from a file instead.
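For example, the same chat prompt can be placed in a file and loaded with `--file`. This is a minimal sketch: the file is created here with `printf` in a POSIX shell (on Windows, any text editor works), and `prompt.txt` is just an example name:

```bash
# Write the multi-line chat prompt to a file; printf expands the \n escapes.
# No trailing newline, so the model continues directly after "User:".
printf 'User: Hi\nAI: Hello. I am an AI chatbot. Would you like to talk?\nUser: Sure!\nAI: What would you like to talk about?\nUser:' > prompt.txt

# Load the prompt from the file instead of passing it with --prompt
./main -m models/7B/ggml-model.bin -n -1 --color -r "User:" --in-prefix "" --file prompt.txt
```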
## Common Options
In this section, we cover the most commonly used options for running the `main` program with the LLaMA models:
@@ -84,6 +92,8 @@ Instruction mode is particularly useful when working with Alpaca models, which are designed to follow user instructions for specific tasks:
- `-ins, --instruct`: Enable instruction mode to leverage the capabilities of Alpaca models in completing tasks based on user-provided instructions.
Technical detail: the user's input is internally prefixed with the reverse prompt (or `### Instruction:` as the default), and followed by `### Response:` (except if you just press Return without any input, to keep generating a longer response).
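For illustration, with the default reverse prompt, a hypothetical input such as `Tell me a joke.` is wrapped roughly like this before being fed to the model (a sketch of the format described above, not literal program output):

```
### Instruction:

Tell me a joke.

### Response:
```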
By understanding and utilizing these interaction options, you can create engaging and dynamic experiences with the LLaMA models, tailoring the text generation process to your specific needs.
## Context Management
@@ -114,7 +124,7 @@ The following options are related to controlling the text generation process, influencing the diversity, creativity, and quality of the generated text:
The `--n_predict` option controls the number of tokens the model generates in response to the input prompt. By adjusting this value, you can influence the length of the generated text. A higher value will result in longer text, while a lower value will produce shorter text. A value of -1 will cause text to be generated without limit.
It is important to note that the generated text may be shorter than the specified number of tokens if an End-of-Sequence (EOS) token or a reverse prompt is encountered. In interactive mode, text generation will pause and control will be returned to the user. In non-interactive mode, the program will end. In both cases, text generation may stop before reaching the specified `n_predict` value. If you want the model to keep going without ever producing the End-of-Sequence token on its own, you can use the `--ignore-eos` parameter.
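For example, to cap a completion at 128 tokens (the model path and prompt follow the examples above and are placeholders):

```bash
# Generate at most 128 tokens; generation may still stop earlier
# if the model emits an EOS token
./main -m models/7B/ggml-model.bin --n_predict 128 --prompt "Once upon a time"
```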
### RNG Seed
@@ -126,7 +136,7 @@ The RNG seed is used to initialize the random number generator that influences the text generation process.
- `--temp N`: Adjust the randomness of the generated text (default: 0.8).
Temperature is a hyperparameter that controls the randomness of the generated text. It affects the probability distribution of the model's output tokens. A higher temperature (e.g., 1.5) makes the output more random and creative, while a lower temperature (e.g., 0.5) makes the output more focused, deterministic, and conservative. The default value is 0.8, which provides a balance between randomness and determinism. At the extreme, a temperature of 0 will always pick the most likely next token, leading to identical outputs in each run.
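As a quick illustration (model path and prompt are placeholders), compare a deterministic run against a more creative one:

```bash
# Greedy decoding: temperature 0 always picks the most likely token,
# so repeated runs with the same settings produce identical output
./main -m models/7B/ggml-model.bin --temp 0 --prompt "Once upon a time"

# A higher temperature flattens the token distribution, producing more
# varied output from run to run
./main -m models/7B/ggml-model.bin --temp 1.5 --prompt "Once upon a time"
```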