- BPE pre-tokenization support has been added: https://github.com/ggerganov/llama.cpp/pull/6920
- MoE memory layout has been updated - reconvert models for `mmap` support and regenerate `imatrix`: https://github.com/ggerganov/llama.cpp/pull/6387
- Model sharding instructions using `gguf-split`: https://github.com/ggerganov/llama.cpp/discussions/6404
- Fix major bug in Metal batched inference: https://github.com/ggerganov/llama.cpp/pull/6225
To obtain the official LLaMA 2 weights please see the <a href="#obtaining-and-using-the-facebook-llama-2-model">Obtaining and using the Facebook LLaMA 2 model</a> section. There is also a large selection of pre-quantized `gguf` models available on Hugging Face.
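One way to fetch such a pre-quantized file is with `huggingface-cli` (a sketch only: the repository and file names below are placeholders, not a recommendation, and `huggingface-cli` is assumed to be installed via `pip install huggingface_hub`):

```shell
# Download a single pre-quantized GGUF file from Hugging Face.
# <user>/<repo> and the .gguf filename are placeholders - substitute any GGUF repository.
huggingface-cli download <user>/<repo> model-Q4_K_M.gguf --local-dir ./models
```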
Note: `convert.py` does not support LLaMA 3; use `convert-hf-to-gguf.py` with LLaMA 3 downloaded from Hugging Face.
```bash
# obtain the official LLaMA model weights and place them in ./models
```
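The LLaMA 3 conversion step can be sketched as below (file names and paths are illustrative; check `python3 convert-hf-to-gguf.py --help` in your checkout for the exact flags):

```shell
# Convert a Hugging Face LLaMA 3 checkout to GGUF (f16), then quantize it.
# models/Meta-Llama-3-8B/ is an illustrative download path, not a required one.
python3 convert-hf-to-gguf.py models/Meta-Llama-3-8B/ \
    --outfile models/llama-3-8b-f16.gguf --outtype f16
./quantize models/llama-3-8b-f16.gguf models/llama-3-8b-Q4_K_M.gguf Q4_K_M
```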
### Android
#### Build on Android using Termux

[Termux](https://github.com/termux/termux-app#installation) is a method to execute `llama.cpp` on an Android device (no root required).
```
apt update && apt upgrade -y
apt install git make cmake
```

It's recommended to move your model inside the `~/` directory for best performance:
```
cd storage/downloads
mv model.gguf ~/
```

[Get the code](https://github.com/ggerganov/llama.cpp#get-the-code) & [follow the Linux build instructions](https://github.com/ggerganov/llama.cpp#build) to build `llama.cpp`.

#### Building the Project using Android NDK

Obtain the [Android NDK](https://developer.android.com/ndk) and then build with CMake.

Execute the following commands on your computer to avoid downloading the NDK to your mobile. Alternatively, you can also do this in Termux:
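A sketch of that NDK build, assuming a CMake-based cross-compile (adjust `ANDROID_ABI`, `ANDROID_PLATFORM`, and the `-march` flags to match your device's CPU):

```shell
# Cross-compile llama.cpp for Android using the NDK's CMake toolchain file.
# <your_ndk_directory> is a placeholder for the NDK installation path.
mkdir build-android
cd build-android
export NDK=<your_ndk_directory>
cmake -DCMAKE_TOOLCHAIN_FILE=$NDK/build/cmake/android.toolchain.cmake \
      -DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=android-23 \
      -DCMAKE_C_FLAGS=-march=armv8.4a+dotprod ..
make
```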
Install [termux](https://github.com/termux/termux-app#installation) on your device and run `termux-setup-storage` to get access to your SD card (if Android 11+ then run the command twice).

Finally, copy these built `llama` binaries and the model file to your device storage. Because the file permissions in the Android sdcard cannot be changed, you can copy the executable files to the `/data/data/com.termux/files/home/bin` path, and then execute the following commands in Termux to add executable permission:

(Assumed that you have pushed the built executable files to the /sdcard/llama.cpp/bin path using `adb push`)
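Those Termux commands can be sketched as follows (on-device paths taken from the text above; this will only run inside Termux on the device itself):

```shell
# In Termux: copy the binaries pushed via adb into Termux's home directory,
# then mark them executable (sdcard file permissions cannot be changed).
cp -r /sdcard/llama.cpp/bin /data/data/com.termux/files/home/
cd /data/data/com.termux/files/home/bin
chmod +x ./*
```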