Merge releases/2024/3 into master (#731)
Co-authored-by: Alina Kladieva <alina.kladieva@intel.com>
Co-authored-by: Anastasiia Pnevskaia <anastasiia.pnevskaia@intel.com>
Co-authored-by: Nikita Malinin <nikita.malinin@intel.com>
Co-authored-by: Yaroslav Tarkan <yaroslav.tarkan@intel.com>
Co-authored-by: Anatoliy Talamanov <anatoliy.talamanov@intel.com>
Co-authored-by: Pavel Esir <pavel.esir@gmail.com>
Co-authored-by: Miłosz Żeglarski <milosz.zeglarski@intel.com>
Co-authored-by: Pavel Esir <pavel.esir@intel.com>
Co-authored-by: Alexander Suvorov <alexander.suvorov@intel.com>
Co-authored-by: Xiake Sun <xiake.sun@intel.com>
Co-authored-by: Damian Kalinowski <damian.kalinowski@intel.com>
Co-authored-by: Andrei Kochin <andrei.kochin@intel.com>
Co-authored-by: Ekaterina Aidova <ekaterina.aidova@intel.com>
Co-authored-by: guozhong wang <guozhong.wang@intel.com>
15 people authored Aug 5, 2024
1 parent 3304798 commit dc9ef33
Showing 2 changed files with 6 additions and 4 deletions.
2 changes: 1 addition & 1 deletion samples/python/chat_sample/README.md
@@ -41,4 +41,4 @@ If you encounter an exception indicating a missing "chat template" when launching
The following template can be used as a default, but it may not work properly with every model:
```
"chat_template": "{% for message in messages %}{% if (message['role'] == 'user') %}{{'<|im_start|>user\n' + message['content'] + '<|im_end|>\n<|im_start|>assistant\n'}}{% elif (message['role'] == 'assistant') %}{{message['content'] + '<|im_end|>\n'}}{% endif %}{% endfor %}",
```
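For illustration, the prompt string this default template produces can be sketched with a small hand-rolled renderer. This is a Python sketch that mirrors the Jinja logic of the template above; `render_chat` is a hypothetical helper, not part of the sample.

```python
# Toy renderer mirroring the default "chat_template" above: each user turn is
# wrapped in <|im_start|>user ... <|im_end|> and immediately opens an assistant
# turn; each assistant turn is closed with <|im_end|>.
def render_chat(messages):
    prompt = ""
    for message in messages:
        if message["role"] == "user":
            prompt += ("<|im_start|>user\n" + message["content"]
                       + "<|im_end|>\n<|im_start|>assistant\n")
        elif message["role"] == "assistant":
            prompt += message["content"] + "<|im_end|>\n"
    return prompt

print(render_chat([
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there!"},
    {"role": "user", "content": "How are you?"},
]))
```

Note that the template ends every rendered prompt with an open `<|im_start|>assistant\n` turn, which is what cues the model to generate the next reply.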
8 changes: 5 additions & 3 deletions src/cpp/src/llm_pipeline_static.cpp
@@ -161,8 +161,10 @@ StaticLLMPipeline::StaticLLMPipeline(
     */
     ov::Core core;
     // (1) Read the template model - this will be kvcache model
-    auto kvcache_model = core.read_model(path / "openvino_model.xml");
-    // (2) TODO: Expose KV-cache input and output layers from kvcache model
+    m_kvcache_model = core.read_model(path / "openvino_model.xml");
+    // (2) Expose KV-cache input and output layers from kvcache model
+    ov::pass::StatefulToStateless().run_on_model(m_kvcache_model);
+    align_u4_zp_constants(m_kvcache_model);
     // (3) Clone the model - this will be prefill
     m_prefill_model = m_kvcache_model->clone();
     m_prefill_model->set_friendly_name(m_kvcache_model->get_friendly_name() + "_prefill");
@@ -179,7 +181,7 @@ StaticLLMPipeline(
         m_prefill_model, device, extract_config_or_default(config, "PREFILL_CONFIG")
     ).create_infer_request();
     m_kvcache_request = core.compile_model(
-        kvcache_model, device, extract_config_or_default(config, "GENERATE_CONFIG")
+        m_kvcache_model, device, extract_config_or_default(config, "GENERATE_CONFIG")
     ).create_infer_request();
     // (7) Initialize tensors
     prepare_for_new_conversation();
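The change stores the kvcache model in the `m_kvcache_model` member rather than a local, so the prefill clone and the compiled generate request both derive from the same transformed model. The read-once / transform / clone-for-prefill flow can be sketched abstractly; this is a toy Python model of the ordering, with `Model` and `StaticLLMPipeline` as stand-in classes, not the OpenVINO API.

```python
import copy

class Model:
    """Stand-in for a model object: a friendly name plus the passes applied to it."""
    def __init__(self, name):
        self.name = name
        self.passes = []

    def clone(self):
        # Deep copy, like cloning a model graph: the clone keeps the passes
        # already applied but diverges from the original afterwards.
        return copy.deepcopy(self)

class StaticLLMPipeline:
    def __init__(self, path):
        # (1) Read the template model - this will be the kvcache model
        #     (kept as a member so later compilation uses the transformed graph).
        self.kvcache_model = Model("openvino_model")
        # (2) Expose KV-cache inputs/outputs and fix up constants
        #     (StatefulToStateless + align_u4_zp_constants in the real code).
        self.kvcache_model.passes.append("StatefulToStateless")
        self.kvcache_model.passes.append("align_u4_zp_constants")
        # (3) Clone the transformed model - this will be prefill.
        self.prefill_model = self.kvcache_model.clone()
        self.prefill_model.name += "_prefill"

pipe = StaticLLMPipeline("model_dir")
```

The key ordering point the sketch captures: the clone happens after the transformation passes, so both the prefill and generate models see the exposed KV-cache layers.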
