- BPE pre-tokenization support has been added: https://github.com/ggerganov/llama.cpp/pull/6920
- MoE memory layout has been updated - reconvert models for `mmap` support and regenerate `imatrix` https://github.com/ggerganov/llama.cpp/pull/6387
- Model sharding instructions using `gguf-split` https://github.com/ggerganov/llama.cpp/discussions/6404
- Fix major bug in Metal batched inference https://github.com/ggerganov/llama.cpp/pull/6225
@@ -139,7 +140,6 @@ Typically finetunes of the base models below are supported as well.
@@ -712,7 +712,7 @@ Building the program with BLAS support may lead to some performance improvements
To obtain the official LLaMA 2 weights please see the <a href="#obtaining-and-using-the-facebook-llama-2-model">Obtaining and using the Facebook LLaMA 2 model</a> section. There is also a large selection of pre-quantized `gguf` models available on Hugging Face.
-
Note: `convert.py` does not support LLaMA 3, you can use `convert-hf-to-gguf.py` with LLaMA 3 downloaded from Hugging Face.
+
Note: `convert.py` does not support LLaMA 3, you can use `convert-hf-to-gguf.py` with LLaMA 3 downloaded from Hugging Face.
```bash
# obtain the official LLaMA model weights and place them in ./models
@@ -935,25 +935,35 @@ If your issue is with model generation quality, then please at least scan the fo
### Android
-
#### Building the Project using Android NDK
-
You can easily run `llama.cpp` on Android device with [termux](https://termux.dev/).
+
#### Build on Android using Termux
+
[Termux](https://github.com/termux/termux-app#installation) is a method to execute `llama.cpp` on an Android device (no root required).
+
```
+
apt update && apt upgrade -y
+
apt install git make cmake
+
```
-
First, install the essential packages for termux:
+
It's recommended to move your model inside the `~/` directory for best performance:
```
-
pkg install clang wget git cmake
+
cd storage/downloads
+
mv model.gguf ~/
```
-
Second, obtain the [Android NDK](https://developer.android.com/ndk) and then build with CMake:
-
You can execute the following commands on your computer to avoid downloading the NDK to your mobile. Of course, you can also do this in Termux.
+
[Get the code](https://github.com/ggerganov/llama.cpp#get-the-code) & [follow the Linux build instructions](https://github.com/ggerganov/llama.cpp#build) to build `llama.cpp`.
+
+
#### Building the Project using Android NDK
+
Obtain the [Android NDK](https://developer.android.com/ndk) and then build with CMake.
+
Execute the following commands on your computer to avoid downloading the NDK to your mobile. Alternatively, you can also do this in Termux:
Install [termux](https://termux.dev/) on your device and run `termux-setup-storage` to get access to your SD card.
+
+
Install [termux](https://github.com/termux/termux-app#installation) on your device and run `termux-setup-storage` to get access to your SD card (if Android 11+ then run the command twice).
+
Finally, copy these built `llama` binaries and the model file to your device storage. Because the file permissions in the Android sdcard cannot be changed, you can copy the executable files to the `/data/data/com.termux/files/home/bin` path, and then execute the following commands in Termux to add executable permission:
(Assumed that you have pushed the built executable files to the /sdcard/llama.cpp/bin path using `adb push`)
[Termux](https://github.com/termux/termux-app#installation) is an alternative to execute `llama.cpp` on an Android device (no root required).
-
```
-
apt update && apt upgrade -y
-
apt install git
-
```
-
-
It's recommended to move your model inside the `~/` directory for best performance:
-
```
-
cd storage/downloads
-
mv model.gguf ~/
-
```
-
-
[Follow the Linux build instructions](https://github.com/ggerganov/llama.cpp#build) to build `llama.cpp`.