Skip to content

Latest commit

 

History

History
38 lines (23 loc) · 2.54 KB

File metadata and controls

38 lines (23 loc) · 2.54 KB

Text generation Python sample that supports most popular models like LLaMA 3

This example showcases inference of text-generation Large Language Models (LLMs): chatglm, LLaMA, Qwen and other models with the same signature. The application doesn't have many configuration options to encourage the reader to explore and modify the source code. It's only possible to change the device for inference to a different one, GPU for example, from the command line interface. The sample fearures openvino_genai.LLMPipeline and configures it to use multiple beam grops. There is also a Jupyter notebook which provides an example of LLM-powered Chatbot in Python.

Download and convert the model and tokenizers

The --upgrade-strategy eager option is needed to ensure optimum-intel is upgraded to the latest version.

Install ../../export-requirements.txt to convert a model.

pip install --upgrade-strategy eager -r ../../export-requirements.txt
optimum-cli export openvino --trust-remote-code --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 TinyLlama-1.1B-Chat-v1.0

Run

Install deployment-requirements.txt via pip install -r ../../deployment-requirements.txt and then, run a sample:

python beam_search_causal_lm.py TinyLlama-1.1B-Chat-v1.0 "Why is the Sun yellow?"

Discrete GPUs (dGPUs) usually provide better performance compared to CPUs. It is recommended to run larger models on a dGPU with 32GB+ RAM. For example, the model meta-llama/Llama-2-13b-chat-hf can benefit from being run on a dGPU. Modify the source code to change the device for inference to the GPU.

See https://github.com/openvinotoolkit/openvino.genai/blob/master/src/README.md#supported-models for the list of supported models.

Troubleshooting

Unicode characters encoding error on Windows

Example error:

UnicodeEncodeError: 'charmap' codec can't encode character '\u25aa' in position 0: character maps to <undefined>

If you encounter the error described in the example when sample is printing output to the Windows console, it is likely due to the default Windows encoding not supporting certain Unicode characters. To resolve this:

  1. Enable Unicode characters for Windows cmd - open Region settings from Control panel. Administrative->Change system locale->Beta: Use Unicode UTF-8 for worldwide language support->OK. Reboot.
  2. Enable UTF-8 mode by setting environment variable PYTHONIOENCODING="utf8".