Add max_new_tokens to every generate call in src/README.md (#670)
[mixtral-8x7b-instruct-v0.1-int4-ov](https://huggingface.co/OpenVINO/mixtral-8x7b-instruct-v0.1-int4-ov/)
didn't have a `generation_config.json`, so generation continued
indefinitely: EOS_TOKEN_ID was read correctly, but it was never
encountered during generation.

Updated the docs so that every generate call sets max_new_tokens, either
as an argument or via the default generation config, e.g.
`pipe.set_generation_config({'max_new_tokens': 100, 'num_beam_groups': 3, ...})`.

tickets: CVS-146933 CVS-146324
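
For context, a minimal Python sketch of the two approaches described above, assuming `model_path` points to a local OpenVINO model directory (the dict-style call in the message is shorthand; this sketch uses the `GenerationConfig` object API):

```python
import openvino_genai as ov_genai

pipe = ov_genai.LLMPipeline(model_path, "CPU")

# Option 1: bound the output length per call.
print(pipe.generate("The Sun is yellow because", max_new_tokens=100))

# Option 2: set a default generation config once; later generate calls
# that omit max_new_tokens inherit the bound from it.
config = pipe.get_generation_config()
config.max_new_tokens = 100
pipe.set_generation_config(config)
print(pipe.generate("The Sun is yellow because"))
```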
pavel-esir authored Jul 24, 2024
1 parent 8934a0e commit 12f8e44
Showing 1 changed file: src/README.md (6 additions, 6 deletions)
@@ -42,15 +42,15 @@ A simple example:
```python
import openvino_genai as ov_genai
pipe = ov_genai.LLMPipeline(model_path, "CPU")
print(pipe.generate("The Sun is yellow because"))
print(pipe.generate("The Sun is yellow because", max_new_tokens=100))
```
Calling generate with custom generation config parameters, e.g. config for grouped beam search:
```python
import openvino_genai as ov_genai
pipe = ov_genai.LLMPipeline(model_path, "CPU")
-result = pipe.generate("The Sun is yellow because", max_new_tokens=30, num_beam_groups=3, num_beams=15, diversity_penalty=1.5)
+result = pipe.generate("The Sun is yellow because", max_new_tokens=100, num_beam_groups=3, num_beams=15, diversity_penalty=1.5)
print(result)
```
@@ -73,7 +73,7 @@ while True:
    prompt = input()
    if prompt == 'Stop!':
        break
-    print(pipe(prompt))
+    print(pipe(prompt, max_new_tokens=200))
pipe.finish_chat()
```

@@ -89,7 +89,7 @@ A simple example:
int main(int argc, char* argv[]) {
    std::string model_path = argv[1];
    ov::genai::LLMPipeline pipe(model_path, "CPU");
-    std::cout << pipe.generate("The Sun is yellow because");
+    std::cout << pipe.generate("The Sun is yellow because", ov::genai::max_new_tokens(256));
}
```
@@ -159,7 +159,7 @@ int main(int argc, char* argv[]) {
        // false means continue generation.
        return false;
    };
std::cout << pipe.generate("The Sun is yellow bacause", ov::genai::streamer(streamer));
std::cout << pipe.generate("The Sun is yellow bacause", ov::genai::streamer(streamer), ov::genai::max_new_tokens(200));
}
```

@@ -192,7 +192,7 @@ int main(int argc, char* argv[]) {

    std::string model_path = argv[1];
    ov::genai::LLMPipeline pipe(model_path, "CPU");
std::cout << pipe.generate("The Sun is yellow because", ov::genai::streamer(custom_streamer));
std::cout << pipe.generate("The Sun is yellow because", ov::genai::streamer(custom_streamer), ov::genai::max_new_tokens(200));
}
```
