update 2.1.30 llm.html (#2876)
jingxu10 authored May 13, 2024
1 parent 9700007 commit d85c47f
Showing 2 changed files with 3 additions and 3 deletions.
xpu/2.1.30+xpu/_sources/tutorials/llm.rst.txt (2 changes: 1 addition & 1 deletion)
@@ -48,7 +48,7 @@ Optimized Models

*Note*: The above verified models (including other models in the same model family, like "codellama/CodeLlama-7b-hf" from LLAMA family) are well supported with all optimizations like indirect access KV cache, fused ROPE, and prepacked TPP Linear (fp16). For other LLMs families, we are working in progress to cover those optimizations, which will expand the model list above.

-Check `LLM best known practice <https://github.com/intel/intel-extension-for-pytorch/tree/v2.1.30%2Bxpu/examples/gpu/inference/python/llm>`_ for instructions to install/setup environment and example scripts..
+Check `LLM best known practice <https://github.com/intel/intel-extension-for-pytorch/tree/release/xpu/2.1.30/examples/gpu/inference/python/llm>`_ for instructions to install/setup environment and example scripts..

Optimization Methodologies
--------------------------
xpu/2.1.30+xpu/tutorials/llm.html (4 changes: 2 additions & 2 deletions)
@@ -163,7 +163,7 @@ <h2>Optimized Models<a class="headerlink" href="#optimized-models" title="Permal
</tbody>
</table>
<p><em>Note</em>: The above verified models (including other models in the same model family, like “codellama/CodeLlama-7b-hf” from LLAMA family) are well supported with all optimizations like indirect access KV cache, fused ROPE, and prepacked TPP Linear (fp16). For other LLMs families, we are working in progress to cover those optimizations, which will expand the model list above.</p>
-<p>Check <a class="reference external" href="https://github.com/intel/intel-extension-for-pytorch/tree/v2.1.30%2Bxpu/examples/gpu/inference/python/llm">LLM best known practice</a> for instructions to install/setup environment and example scripts..</p>
+<p>Check <a class="reference external" href="https://github.com/intel/intel-extension-for-pytorch/tree/release/xpu/2.1.30/examples/gpu/inference/python/llm">LLM best known practice</a> for instructions to install/setup environment and example scripts..</p>
</section>
<section id="optimization-methodologies">
<h2>Optimization Methodologies<a class="headerlink" href="#optimization-methodologies" title="Permalink to this heading"></a></h2>
@@ -260,4 +260,4 @@ <h2>Weight Only Quantization INT4<a class="headerlink" href="#weight-only-quanti
</script>

</body>
-</html>
+</html>
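
For context, the link updated by this commit points at the LLM best-known-practice examples for the 2.1.30+xpu release. Below is a minimal sketch of the kind of FP16 LLM inference on an Intel GPU that those examples cover, using the public ipex.optimize API; the model name, prompt, and generation settings are illustrative assumptions, not taken from this commit or the linked examples.

import torch
import intel_extension_for_pytorch as ipex  # registers the "xpu" device and ipex.optimize
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint: one of the LLAMA-family models the doc lists as verified.
model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
model = model.eval().to("xpu")

# Apply IPEX kernel/graph optimizations for fp16 inference on the xpu device.
model = ipex.optimize(model, dtype=torch.float16)

inputs = tokenizer("What does fused ROPE do?", return_tensors="pt").to("xpu")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))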
