Commit 58b90f3

Update llama.cpp integration (#11864)
<!--
- **Description:** removed a redundant link, replaced it with Meta's LLaMA repo, and added resources for models' hardware requirements
- **Issue:** None
- **Dependencies:** None
- **Tag maintainer:** None
- **Twitter handle:** @ElliotAlladaye
-->
1 parent a228f34 commit 58b90f3

File tree

1 file changed: +8 −4 lines

1 file changed

+8
-4
lines changed

docs/docs/integrations/llms/llamacpp.ipynb (+8 −4)
@@ -6,9 +6,9 @@
 "source": [
 "# Llama.cpp\n",
 "\n",
-"[llama-cpp-python](https://github.com/abetlen/llama-cpp-python) is a Python binding for [llama.cpp](https://github.com/ggerganov/llama.cpp). \n",
+"[llama-cpp-python](https://github.com/abetlen/llama-cpp-python) is a Python binding for [llama.cpp](https://github.com/ggerganov/llama.cpp).\n",
 "\n",
-"It supports inference for [many LLMs](https://github.com/ggerganov/llama.cpp), which can be accessed on [HuggingFace](https://huggingface.co/TheBloke).\n",
+"It supports inference for [many LLMs](https://github.com/ggerganov/llama.cpp#description) models, which can be accessed on [HuggingFace](https://huggingface.co/TheBloke).\n",
 "\n",
 "This notebook goes over how to run `llama-cpp-python` within LangChain.\n",
 "\n",
@@ -54,7 +54,7 @@
 "source": [
 "### Installation with OpenBLAS / cuBLAS / CLBlast\n",
 "\n",
-"`lama.cpp` supports multiple BLAS backends for faster processing. Use the `FORCE_CMAKE=1` environment variable to force the use of cmake and install the pip package for the desired BLAS backend ([source](https://github.com/abetlen/llama-cpp-python#installation-with-openblas--cublas--clblast)).\n",
+"`llama.cpp` supports multiple BLAS backends for faster processing. Use the `FORCE_CMAKE=1` environment variable to force the use of cmake and install the pip package for the desired BLAS backend ([source](https://github.com/abetlen/llama-cpp-python#installation-with-openblas--cublas--clblast)).\n",
 "\n",
 "Example installation with cuBLAS backend:"
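The hunk cuts off just before the install cell it introduces. For reference, a sketch of that cell, assuming the cuBLAS flags documented in the llama-cpp-python README linked above:

# Notebook-style cell: build llama-cpp-python from source against cuBLAS.
# CMAKE_ARGS selects the BLAS backend; FORCE_CMAKE=1 forces a cmake source build.
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python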
@@ -177,7 +177,11 @@
 "\n",
 "You don't need an `API_TOKEN` as you will run the LLM locally.\n",
 "\n",
-"It is worth understanding which models are suitable to be used on the desired machine."
+"It is worth understanding which models are suitable to be used on the desired machine.\n",
+"\n",
+"[TheBloke's](https://huggingface.co/TheBloke) Hugging Face models have a `Provided files` section that exposes the RAM required to run models of different quantisation sizes and methods (eg: [Llama2-7B-Chat-GGUF](https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF#provided-files)).\n",
+"\n",
+"This [github issue](https://github.com/facebookresearch/llama/issues/425) is also relevant to find the right model for your machine."
 ]
 },
 {
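The paragraphs this hunk adds are about sizing a model to your hardware. As a hedged sketch of the rule of thumb behind those `Provided files` tables (the path is a placeholder): a GGUF file's size on disk roughly lower-bounds the RAM needed to load it, before any context-buffer overhead.

import os

# Placeholder path to a downloaded GGUF file.
model_path = "/path/to/llama-2-7b-chat.Q4_K_M.gguf"

# Rule of thumb: loading the model needs at least the file size in free RAM,
# plus headroom for the context buffer.
size_gb = os.path.getsize(model_path) / 1e9
print(f"{model_path}: {size_gb:.1f} GB on disk; budget at least that much RAM.")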
