Merge releases/2024/3 into master #731

Wovchena · 2024-08-02T13:39:34Z

No description provided.

Workaround Python_VERSION_MAJOR and MINOR not being set by replacing Python3 with Python. Disable generation of some of the COMPONENTs not needed for GenAI. There are still unwanted empty archives, but they are generated unconditionally by rapidjson.

…envinotoolkit#604) That allows LLMPipeline to create ContinuousBatchingPipeline as a backend. There's also a constructor accepting ireq, which could be used if the model was already transformed appropriately for ContinuousBatchingPipeline. But that seems likely to be misleading, and it is simpler just to throw if such a constructor is called with the ContinuousBatchingPipeline backend.

) Port of PR: openvinotoolkit#615

Updated default configurations based on results from CVS-143530. (cherry picked from commit f460002)

Remove unwanted archives

Co-authored-by: Yaroslav Tarkan <yaroslav.tarkan@intel.com>

…#642) OpenVINOGenAITargets.cmake was excluded from packaging because CPACK_COMPONENTS_ALL is custom now and doesn't install the Unspecified component.

Co-authored-by: Pavel Esir <pavel.esir@gmail.com>

…oop for greedy sampling (openvinotoolkit#607) Searching for max element in a custom loop gives better performance than using std::max_element

Cherry picked from master

@Wovchena

@Wovchena, retarget to OV 24.3 release branch

- Added Readme for python tests
- Added `--model_ids` option to run selectively only on specific models

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>

Symbols that cause errors:
- `\u0643`
- `\u25aa`

… optional plugin_config in tokenizer (openvinotoolkit#669) This improves performance of the CB lib when tested within OVMS. Already merged to master: openvinotoolkit#651. This is a cherry-pick.

…oolkit#670) [mixtral-8x7b-instruct-v0.1-int4-ov](https://huggingface.co/OpenVINO/mixtral-8x7b-instruct-v0.1-int4-ov/) didn't have `generation_config.json`, so generation continued indefinitely. EOS_TOKEN_ID was read correctly, but it was never met during generation. Updated docs so that every generate call sets max_new_tokens, either in arguments or via the default generation config `pipe.set_generation_config({'max_new_tokens': 100, 'num_beam_groups': 3, ...})` tickets: CVS-146933 CVS-146324
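The failure mode described above can be illustrated with a minimal sketch of a decoding loop. This is not the openvino_genai implementation: `generate`, `next_token`, and the toy model are hypothetical names used only for illustration. It shows why a `max_new_tokens` cap guarantees termination even when the EOS token is never produced.

```python
from typing import Callable, List, Optional


def generate(next_token: Callable[[List[int]], int],
             prompt_ids: List[int],
             eos_token_id: Optional[int],
             max_new_tokens: int) -> List[int]:
    """Hypothetical decoding loop with two stop conditions."""
    generated: List[int] = []
    # Hard cap: the loop always terminates after max_new_tokens steps.
    while len(generated) < max_new_tokens:
        token = next_token(prompt_ids + generated)
        generated.append(token)
        # Normal stop condition; it may never fire if EOS is never sampled.
        if eos_token_id is not None and token == eos_token_id:
            break
    return generated


def never_eos(ids: List[int]) -> int:
    """Toy stand-in for a model that never emits EOS (assumed id 2)."""
    return 7


out = generate(never_eos, [1, 5], eos_token_id=2, max_new_tokens=100)
print(len(out))  # 100
```

Without the cap, the loop above would depend entirely on the EOS check, which is the situation the docs update is meant to avoid.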

Co-authored-by: Anastasiia Pnevskaia <anastasiia.pnevskaia@intel.com>

Port of openvinotoolkit#639

…_counters

- Added performance metrics and updated Readme with description how to use them
- Added cpp and python sample for benchmarking

Sample to calculate and visualize performance metrics.

```
import openvino_genai as ov_genai
import tqdm
import pandas as pd
import matplotlib.pylab as pl

pipe = ov_genai.LLMPipeline('TinyLlama-1.1B-Chat-v1.0/')
config = ov_genai.GenerationConfig(max_new_tokens=15)

metrics_df = pd.DataFrame(columns=['batch_size', 'throughput', 'ttft', 'tpot', 'std_throughput', 'std_ttft', 'std_tpot'])
num_iter = 3

for batch_size in tqdm.tqdm([1, 2, 4, 16, 32, 64, 128]):
    prompts = ["The Sky is blue because"] * batch_size
    res = pipe.generate(prompts, config)
    metrics = res.perf_metrics
    for _ in range(num_iter - 1):
        res = pipe.generate(prompts, config)
        metrics += res.perf_metrics
    metrics_df = metrics_df._append({
        'throughput': metrics.get_throughput().mean, 'ttft': metrics.get_ttft().mean, 'tpot': metrics.get_tpot().mean,
        'std_throughput': metrics.get_throughput().std, 'std_ttft': metrics.get_ttft().std, 'std_tpot': metrics.get_tpot().std,
        'batch_size': batch_size,
    }, ignore_index=True)

fig, axes = pl.subplots(nrows=3, ncols=1, figsize=(6, 8), sharex=True)
axes[0].plot(metrics_df['batch_size'], metrics_df['throughput'], '-o')
axes[1].plot(metrics_df['batch_size'], metrics_df['ttft'], '-o')
axes[2].plot(metrics_df['batch_size'], metrics_df['tpot'], '-o')
axes[0].set_ylabel('Throughput'), axes[1].set_ylabel('TTFT'), axes[2].set_ylabel('TPOT')
axes[2].set_xlabel('Batch Size')
axes[0].grid(True), axes[1].grid(True), axes[2].grid(True)
pl.tight_layout()
```

![image](https://github.com/user-attachments/assets/021a94b4-fc75-4b5f-90e6-60db471a3810)

ticket: CVS-132859
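For readers unfamiliar with the metric names, here is a small sketch of how TTFT (time to first token), TPOT (time per output token), and throughput could be derived from raw token timestamps. It assumes the usual definitions of these terms and seconds as the unit. The names `MeanStd`, `mean_std`, and `summarize` are hypothetical, and the exact definitions and units used by `perf_metrics` in the library may differ.

```python
import statistics
from typing import Dict, List, NamedTuple


class MeanStd(NamedTuple):
    """Hypothetical mean/std container, loosely mirroring the .mean/.std fields used above."""
    mean: float
    std: float


def mean_std(values: List[float]) -> MeanStd:
    # Sample standard deviation needs at least two values.
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    return MeanStd(statistics.fmean(values), std)


def summarize(start: float, token_times: List[float]) -> Dict[str, object]:
    """token_times: absolute timestamps (seconds) at which each new token arrived."""
    ttft = token_times[0] - start                       # time to first token
    gaps = [b - a for a, b in zip(token_times, token_times[1:])]
    tpot = mean_std(gaps) if gaps else MeanStd(0.0, 0.0)  # time per later token
    total = token_times[-1] - start
    return {"ttft": ttft, "tpot": tpot, "throughput": len(token_times) / total}


stats = summarize(start=0.0, token_times=[0.5, 0.6, 0.7, 0.8])
print(stats)
```

In this toy run the first token arrives after 0.5 s and later tokens about 0.1 s apart, so TTFT dominates the total time while TPOT stays small, which is the kind of split the plots in the sample are meant to expose.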

Removing dockerfile from release branch due to process requirements.

Docstring for generation time metrics Ticket: CVS-132859

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>

src/cpp/src/llm_pipeline_static.cpp

Wovchena and others added 30 commits July 15, 2024 13:48

Workaround (openvinotoolkit#618)

e4637b3

Workaround Python_VERSION_MAJOR and MINOR not being set by replacing Python3 with Python. Disable generation of some of the COMPONENTs not needed for GenAI. There are still unwanted empty archives, but they are generated unconditionally by rapidjson.

Revert to python3

423c8e3

Revert to python3 (openvinotoolkit#622)

8ad336c

Fix cmake Python var name (openvinotoolkit#624)

1b1b2f0

Clear beam search info when generate() is finished. (openvinotoolkit#630

f0c2677

) Port of PR: openvinotoolkit#615

Update nncf_utils.py (openvinotoolkit#616) (openvinotoolkit#633)

73badf6

Updated default configurations based on results from CVS-143530. (cherry picked from commit f460002)

Workaround cmake packaging (openvinotoolkit#634)

25655e3

Remove unwanted archives

Save licensing_genai into docs to align with OpenVINO (openvinotoolki…

754f6d7

…t#637)

Update submodule (openvinotoolkit#638)

e5247e0

Add Llama3 (openvinotoolkit#620)

2d1fa3b

Co-authored-by: Yaroslav Tarkan <yaroslav.tarkan@intel.com>

nightly->rc1 (openvinotoolkit#621)

489a87d

Add OpenVINOGenAITargets to core_genai_dev COMPONENT (openvinotoolkit…

67f0467

…#642) OpenVINOGenAITargets.cmake was excluded from packaging because CPACK_COMPONENTS_ALL is custom now and doesn't install the Unspecified component.

Apply todo, initialize detokenizer's cache (openvinotoolkit#647)

1969160

Cherry-pick static LLM pipeline changes (openvinotoolkit#654)

0e0f6a9

Co-authored-by: Pavel Esir <pavel.esir@gmail.com>

[Continuous batching] Replace standard max_element call with custom l…

cb100cb

…oop for greedy sampling (openvinotoolkit#607) Searching for max element in a custom loop gives better performance than using std::max_element
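The idea of the change can be sketched as a single-pass argmax loop. The actual change is in the C++ continuous batching code and replaces `std::max_element`; the Python below only shows the shape of such a loop under that assumption, uses the hypothetical name `argmax_loop`, and makes no performance claim of its own.

```python
from typing import Sequence


def argmax_loop(logits: Sequence[float]) -> int:
    """Return the index of the largest logit using one explicit pass."""
    if not logits:
        raise ValueError("logits must not be empty")
    best_idx = 0
    best_val = logits[0]
    for i in range(1, len(logits)):
        value = logits[i]
        # Strict comparison keeps the first maximum on ties.
        if value > best_val:
            best_val = value
            best_idx = i
    return best_idx


# Greedy sampling picks the token id with the highest logit.
token_id = argmax_loop([0.1, 2.5, -1.0, 2.5])
print(token_id)  # 1
```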

wip

f0e4190

add detokenization metric; refactor split to perf_counter & perf_metrics

7cab496

refactor structure, add python sample

bb1113c

Cherry-pick custom max_element loop (openvinotoolkit#662)

7bf42f1

Cherry picked from master

add more precise durations

0a8f0d9

Add note for pybind ov::Tensor issue (openvinotoolkit#659)

bad01b9

[OV 24.3]Fix multinomial sample CMakeList (openvinotoolkit#658)

cb0da0a

@Wovchena, retarget to OV 24.3 release branch

add Readme for tests (openvinotoolkit#664)

bc92248

- Added Readme for python tests
- Added `--model_ids` option to run selectively only on specific models

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>

add cpp Readme, ensured correct batch processing, add PerfMetrics to …

90320f4

…Readme

use MeanStdPair

aeec730

[2024.3] Fix symbol encode error (openvinotoolkit#629)

56eeafc

Symbols that cause errors:
- `\u0643`
- `\u25aa`
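The commit does not say which codec was involved or how the fix was made. As an assumption for illustration only, the sketch below reproduces the kind of error these two symbols trigger under a narrow legacy codec such as `cp1252`, and one possible way to degrade gracefully. `safe_encode` is a hypothetical helper, not part of the project.

```python
def safe_encode(text: str, encoding: str = "cp1252") -> bytes:
    """Encode text, replacing unencodable characters with '?' instead of raising."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return text.encode(encoding, errors="replace")


problem = "\u0643\u25aa"  # the two symbols listed above

try:
    problem.encode("cp1252")
    raised = False
except UnicodeEncodeError:
    raised = True

print(raised)                   # True: cp1252 cannot represent either symbol
print(safe_encode(problem))     # b'??'
print(problem.encode("utf-8"))  # UTF-8 encodes both symbols without error
```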

[release branch] Add infer request queue for tokenizers and allow for…

8934a0e

… optional plugin_config in tokenizer (openvinotoolkit#669) This improves performance of the CB lib when tested within OVMS. Already merged to master: openvinotoolkit#651. This is a cherry-pick.

Add CB naive chat (openvinotoolkit#644)

f9e45e1

Co-authored-by: Anastasiia Pnevskaia <anastasiia.pnevskaia@intel.com>

popovaan and others added 21 commits July 26, 2024 06:51

Prefix caching. (openvinotoolkit#675)

406393f

Port of openvinotoolkit#639

Merge remote-tracking branch 'upstream/releases/2024/3' into add_perf…

c45aed5

…_counters

resolve conflicts

be2fdaf

apply comments

b00bcd8

use getter and cache evaluate results

60e7188

update Readme's

e553ef5

StaticLLMPipeline dangling models hotfix (openvinotoolkit#693)

3bfbab5

Remove Dockerfile (openvinotoolkit#700)

06c57b7

Removing dockerfile from release branch due to process requirements.

StaticLLMPipeline - align u4 zero points (openvinotoolkit#705)

e286469

Disable broken test (openvinotoolkit#707)

2a80828

update optimum commit for releases/2024/3 (openvinotoolkit#711)

d89cdcb

change commit for optimum

2428a3a

Merge branch 'releases/2024/3' into ea/upd_opt_commit

1473e7f

change commit for optimum (openvinotoolkit#714)

8cb12b2

Add perf metric docstrings (openvinotoolkit#713)

2f778f3

Docstring for generation time metrics Ticket: CVS-132859

rc1->rc2 (openvinotoolkit#695)

2dc6b64

Docs for version compatibility (openvinotoolkit#692)

3bfdd3f

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>

update requirements.txt (openvinotoolkit#721)

a295fe1

Merge branch 'releases/2024/3' into merge-releases/2024/3-into-master

4743003

fix merge

b30a262

Wovchena requested a review from TolyaTalamanov August 2, 2024 13:39

ilya-lavrenov approved these changes Aug 2, 2024

ilya-lavrenov added this to the 2024.3 milestone Aug 2, 2024

fix merge

fb80ce7

TolyaTalamanov reviewed Aug 2, 2024

src/cpp/src/llm_pipeline_static.cpp

ilya-lavrenov added this pull request to the merge queue Aug 5, 2024

ilya-lavrenov self-assigned this Aug 5, 2024

Merged via the queue into openvinotoolkit:master with commit dc9ef33 Aug 5, 2024
27 checks passed
