From d85c47f13b18e0070120caa86f72a8c6b629bc86 Mon Sep 17 00:00:00 2001
From: Jing Xu
Date: Mon, 13 May 2024 10:57:50 +0900
Subject: [PATCH] update 2.1.30 llm.html (#2876)

---
 xpu/2.1.30+xpu/_sources/tutorials/llm.rst.txt | 2 +-
 xpu/2.1.30+xpu/tutorials/llm.html             | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/xpu/2.1.30+xpu/_sources/tutorials/llm.rst.txt b/xpu/2.1.30+xpu/_sources/tutorials/llm.rst.txt
index a13deac7a..b21ce5314 100644
--- a/xpu/2.1.30+xpu/_sources/tutorials/llm.rst.txt
+++ b/xpu/2.1.30+xpu/_sources/tutorials/llm.rst.txt
@@ -48,7 +48,7 @@ Optimized Models

 *Note*: The above verified models (including other models in the same model family, like "codellama/CodeLlama-7b-hf" from LLAMA family) are well supported with all optimizations like indirect access KV cache, fused ROPE, and prepacked TPP Linear (fp16). For other LLMs families, we are working in progress to cover those optimizations, which will expand the model list above.

-Check `LLM best known practice `_ for instructions to install/setup environment and example scripts..
+Check `LLM best known practice `_ for instructions to install/setup environment and example scripts..

 Optimization Methodologies
 --------------------------

diff --git a/xpu/2.1.30+xpu/tutorials/llm.html b/xpu/2.1.30+xpu/tutorials/llm.html
index 5d4b9db88..27866c623 100644
--- a/xpu/2.1.30+xpu/tutorials/llm.html
+++ b/xpu/2.1.30+xpu/tutorials/llm.html
@@ -163,7 +163,7 @@

Optimized Models
LLM best known practice for instructions to install/setup environment and example scripts..

+

Check LLM best known practice for instructions to install/setup environment and example scripts..

Optimization Methodologies

@@ -260,4 +260,4 @@

Weight Only Quantization INT4