This demo shows how to use the gpt-2 model for inference to perform interactive conditional text prediction, where content is generated based on text provided by the user.
On startup, the demo application reads command line parameters and loads a model into the OpenVINO™ Runtime plugin. It then encodes a prompt, received either via command line arguments or interactive user input, and uses it to predict the output sequence.
The list of models supported by the demo is in the <omz_dir>/demos/gpt2_text_prediction_demo/python/models.lst file.
This file can be used as a parameter for Model Downloader and Converter to download and, if necessary, convert models to OpenVINO IR format (*.xml + *.bin).
An example of using the Model Downloader:
omz_downloader --list models.lst
An example of using the Model Converter:
omz_converter --list models.lst
Supported models:
- gpt-2
NOTE: Refer to the tables Intel's Pre-Trained Models Device Support and Public Pre-Trained Models Device Support for details on model inference support on different devices.
Running the application with the -h option yields the following usage message:
usage: gpt2_text_prediction_demo.py [-h] -m MODEL -v VOCAB --merges MERGES
                                    [-i INPUT]
                                    [--max_sample_token_num MAX_SAMPLE_TOKEN_NUM]
                                    [--top_k TOP_K] [--top_p TOP_P]
                                    [-d DEVICE]

Options:
  -h, --help            Show this help message and exit.
  -m MODEL, --model MODEL
                        Required. Path to an .xml file with a trained model
  -v VOCAB, --vocab VOCAB
                        Required. Path to the vocabulary file with tokens
  --merges MERGES       Required. Path to the merges file
  -i INPUT, --input INPUT
                        Optional. Input prompt
  --max_sample_token_num MAX_SAMPLE_TOKEN_NUM
                        Optional. Maximum number of tokens in generated sample
  --top_k TOP_K         Optional. Number of tokens with the highest
                        probability which will be kept for generation
  --top_p TOP_P         Optional. Maximum probability, tokens with such a
                        probability and lower will be kept for generation
  -d DEVICE, --device DEVICE
                        Optional. Target device to perform inference
                        on. Default value is CPU
  --dynamic_shape       Run model with dynamic input sequence. If not
                        provided, input sequence will be padded to
                        max_seq_len
  --max_seq_len MAX_SEQ_LEN
                        Optional. Maximum sequence length for processing. Default value is 1024
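The --top_k and --top_p options restrict which tokens may be chosen at each generation step. The following is a minimal sketch of how such filtering is commonly done, written in plain Python over a list of probabilities. The function names are hypothetical, and the top_p logic assumes the usual nucleus-sampling interpretation (keep the most probable tokens until their cumulative probability reaches top_p); the demo's own implementation may differ.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def _renormalize(probs, kept):
    """Zero out tokens outside `kept` and rescale the rest to sum to 1."""
    filtered = [p if i in kept else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]


def top_k_filter(probs, k):
    """Keep only the k most probable token ids (hypothetical helper)."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _renormalize(probs, set(order[:k]))


def top_p_filter(probs, p):
    """Keep the smallest set of most probable tokens whose cumulative
    probability reaches p (assumed nucleus-sampling interpretation)."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return _renormalize(probs, kept)
```

In this sketch a token would then be drawn from the filtered distribution, for example with random.choices(range(len(probs)), weights=filtered).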
The application reads and encodes text from the input string, then transforms the encoded tokens and uses them as model input.
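One of the transformations implied by the options is padding: without --dynamic_shape, the input sequence is padded to max_seq_len. A minimal sketch of that step might look like the following. The function name and the pad_id value are hypothetical, and truncating over-long input to the most recent tokens is an assumption made for illustration, not something the text above states.

```python
def prepare_input(token_ids, max_seq_len, dynamic_shape=False, pad_id=0):
    """Return token ids shaped for the model (hypothetical helper).

    pad_id is a placeholder; a real model would need its own padding token.
    """
    if max_seq_len < 1:
        raise ValueError("max_seq_len must be positive")
    # Assumption: if the input is too long, keep only the most recent tokens.
    ids = list(token_ids)[-max_seq_len:]
    if dynamic_shape:
        # Dynamic input: pass the sequence through at its actual length.
        return ids
    # Static input: right-pad up to max_seq_len.
    return ids + [pad_id] * (max_seq_len - len(ids))
```

For example, prepare_input([1, 2, 3], 5) pads the three tokens with two pad_id entries, while passing dynamic_shape=True returns them unchanged.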
For each input string, the application outputs predicted text that continues it.
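Continuing a prompt is typically done one token at a time: each predicted token is appended to the sequence, which is then fed back in. The sketch below illustrates that loop under stated assumptions. The names generate and toy_next_token are hypothetical, toy_next_token is a stand-in for real model inference, and stopping at an end-of-sequence token is an assumed behavior rather than a documented one.

```python
def generate(prompt_ids, next_token_fn, max_sample_token_num, eos_id=None):
    """Extend prompt_ids by up to max_sample_token_num predicted tokens."""
    ids = list(prompt_ids)
    for _ in range(max_sample_token_num):
        token = next_token_fn(ids)  # in the demo this would be model inference
        if eos_id is not None and token == eos_id:
            break  # assumed: stop early on an end-of-sequence token
        ids.append(token)
    return ids


def toy_next_token(ids):
    """Stand-in for a model: predicts the last token id plus one."""
    return ids[-1] + 1
```

With the toy predictor, generate([1, 2], toy_next_token, 3) extends the prompt by three tokens; a real run would decode the resulting ids back to text with the tokenizer.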
You can use the following command to try the demo (assuming the model comes from the Open Model Zoo and was downloaded and converted with the Model Downloader and Model Converter):
python3 gpt2_text_prediction_demo.py \
    --model=<path_to_model>/gpt-2.xml \
    --vocab=<models_dir>/models/public/gpt-2/gpt2/vocab.json \
    --merges=<models_dir>/models/public/gpt-2/gpt2/merges.txt