Add python samples #21

Conversation

@Wovchena commented Jun 5, 2024

No description provided.

@Wovchena changed the title from "Add beam_search_causal_lm.py" to "Add python samples" on Jun 7, 2024
Wovchena and others added 18 commits June 7, 2024 13:51
LLMs return logits with probabilities for each token; these probabilities
can be converted to tokens/words with different techniques: greedy
decoding, beam search decoding, random sampling, etc.

This requires writing user-unfriendly post-processing even for the
simplest scenario of greedy decoding. To make life easier, we
combined all decoding scenarios into a single function call, where the
decoding method and its parameters are specified by arguments.
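
For illustration, here is a minimal, self-contained sketch (not part of this PR) of the kind of manual post-processing the single generate-style call is meant to hide: converting one decoding step's logits into a token id with greedy decoding and with random sampling, using NumPy with synthetic logits.
```
import numpy as np

rng = np.random.default_rng(0)
vocab_size = 8
logits = rng.normal(size=vocab_size)  # stand-in for one decoding step's logits

# Greedy decoding: take the most probable token.
greedy_token = int(np.argmax(logits))

# Random sampling: draw a token from the softmax distribution.
probs = np.exp(logits - logits.max())
probs /= probs.sum()
sampled_token = int(rng.choice(vocab_size, p=probs))

print(greedy_token, sampled_token)
```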

In this PR we provide a user-friendly API for text generation inspired
by the `generate` method from the Hugging Face Transformers library.

- [x] enable calling tokenizers/detokenizers from LLMPipeline
- [ ] add callback for streaming mode - done partially, need to improve
- [x] rewrote the samples to the current approach:
[causal_lm/cpp/generate_pipeline/generate_sample.cpp#L73-L83](https://github.com/pavel-esir/openvino.genai/blob/generate_pipeline/text_generation/causal_lm/cpp/generate_pipeline/generate_sample.cpp#L73-L83)
- [x] Multibatch greedy decoding
- [ ] Speculative decoding
- [ ] Grouped Beam Search decoding: ready for batch 1, need to rebase
multibatch support after merging
openvinotoolkit#349
- [x] Random sampling

Example 1: Greedy search generation
```
LLMPipeline pipe(model_path, device);

// Tries to load the config from generation_config.json;
// if not found, default values for greedy search are used.
GenerationConfig config = pipe.generation_config();

std::cout << pipe(prompt, config.max_new_tokens(20));
```
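
Since this PR adds the Python samples, a rough Python analogue of Example 1 might look like the sketch below. The `openvino_genai.LLMPipeline` constructor and the `generate(..., max_new_tokens=...)` call are assumptions modeled on the C++ API above, not verbatim from the merged samples.
```
# Hypothetical Python analogue of Example 1 (greedy search); names and
# signatures are assumptions, not taken verbatim from this PR's samples.
import sys

import openvino_genai

model_path, prompt = sys.argv[1], sys.argv[2]
pipe = openvino_genai.LLMPipeline(model_path, "CPU")
print(pipe.generate(prompt, max_new_tokens=20))
```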

Example 2: TextStreaming mode
```
LLMPipeline pipe(model_path, device);

GenerationConfig config = pipe.generation_config();

auto text_streamer = TextStreamer{pipe};
auto text_streamer_callback = [&text_streamer](std::vector<int64_t>&& tokens, LLMPipeline& pipe){
    text_streamer.put(tokens[0]);
};

pipe(prompt, config.max_new_tokens(20).set_callback(text_streamer_callback));
text_streamer.end();
```
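
A possible Python counterpart of the streaming example, again only a sketch under the assumption that `generate` accepts a `streamer` callback which receives decoded text chunks as they are produced:
```
# Hypothetical Python analogue of Example 2 (streaming); the streamer
# callback and its signature are assumptions, not verbatim from this PR.
import sys

import openvino_genai

def streamer(subword):
    # Print each decoded chunk as soon as it arrives.
    print(subword, end="", flush=True)

model_path, prompt = sys.argv[1], sys.argv[2]
pipe = openvino_genai.LLMPipeline(model_path, "CPU")
pipe.generate(prompt, max_new_tokens=20, streamer=streamer)
print()
```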

CVS-132907 CVS-137920

---------

Co-authored-by: Wovchena <vladimir.zlobin@intel.com>
Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
Co-authored-by: Alexander Suvorov <alexander.suvorov@intel.com>
Co-authored-by: Yaroslav Tarkan <yaroslav.tarkan@intel.com>
Co-authored-by: Xiake Sun <xiake.sun@intel.com>
Co-authored-by: wenyi5608 <93560477+wenyi5608@users.noreply.github.com>
Co-authored-by: Ekaterina Aidova <ekaterina.aidova@intel.com>
Co-authored-by: guozhong wang <guozhong.wang@intel.com>
Co-authored-by: Chen Peter <peter.chen@intel.com>
@Wovchena closed this Jun 10, 2024