
Commit b8ee340
Authored by zenixls2 and ggerganov
1 parent 9ecb30f

feature : support blis and other blas implementation (ggml-org#1536)

* feature: add blis support
* feature: allow all BLA_VENDOR to be assigned in cmake arguments. align with whisper.cpp pr 927
* fix: version detection for BLA_SIZEOF_INTEGER, recover min version of cmake
* Fix typo in INTEGER
* Fix: blas changes on ci

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

File tree

5 files changed, +105 −26 lines

.github/workflows/build.yml (+1 −1)

```diff
@@ -165,7 +165,7 @@ jobs:
           - build: 'clblast'
             defines: '-DLLAMA_CLBLAST=ON -DCMAKE_PREFIX_PATH="$env:RUNNER_TEMP/clblast"'
           - build: 'openblas'
-            defines: '-DLLAMA_OPENBLAS=ON -DBLAS_LIBRARIES="/LIBPATH:$env:RUNNER_TEMP/openblas/lib" -DOPENBLAS_INC="$env:RUNNER_TEMP/openblas/include"'
+            defines: '-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS -DBLAS_INCLUDE_DIRS="$env:RUNNER_TEMP/openblas/include"'

     steps:
       - name: Clone
```

BLIS.md (new file, +67)

BLIS Installation Manual
------------------------

BLIS is a portable software framework for high-performance BLAS-like dense linear algebra libraries. It has received awards and recognition, including the 2023 James H. Wilkinson Prize for Numerical Software and the 2020 SIAM Activity Group on Supercomputing Best Paper Prize. BLIS provides a new BLAS-like API along with compatibility layers for traditional BLAS and CBLAS routine calls, exposing both an object-based API and a typed API.

Project URL: https://github.com/flame/blis

### Prepare:

Compile BLIS:

```bash
git clone https://github.com/flame/blis
cd blis
./configure --enable-cblas -t openmp,pthreads auto
# installs to /usr/local/ by default
make -j
```

Install BLIS:

```bash
sudo make install
```

We recommend using OpenMP, since it makes it easier to control how many cores are used.

### llama.cpp compilation

Makefile:

```bash
make LLAMA_BLIS=1 -j
# make LLAMA_BLIS=1 benchmark-matmult
```

CMake:

```bash
mkdir build
cd build
cmake -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=FLAME ..
make -j
```

### llama.cpp execution

According to the BLIS documentation, we can set the following environment variables to modify the behavior of OpenMP:

```bash
export GOMP_CPU_AFFINITY="0-19"
export BLIS_NUM_THREADS=14
```

And then run the binaries as normal.
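As a concrete sketch (the core range and thread count are examples, not values mandated by this commit), the variables can be exported and verified before launching a binary such as `./main`:

```bash
# GOMP_CPU_AFFINITY pins GNU OpenMP threads to the listed cores;
# BLIS_NUM_THREADS caps BLIS's own thread pool. Tune both to your machine.
export GOMP_CPU_AFFINITY="0-19"
export BLIS_NUM_THREADS=14

# Confirm both settings are visible to child processes before running ./main:
env | grep -E 'GOMP_CPU_AFFINITY|BLIS_NUM_THREADS'
```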

### Intel specific issue

Some users may get an error message saying that `libimf.so` cannot be found. If so, please follow this [stackoverflow page](https://stackoverflow.com/questions/70687930/intel-oneapi-2022-libimf-so-no-such-file-or-directory-during-openmpi-compila).

### Reference:

1. https://github.com/flame/blis#getting-started
2. https://github.com/flame/blis/blob/master/docs/Multithreading.md

CMakeLists.txt (+16 −23)

```diff
@@ -65,7 +65,8 @@ endif()

 # 3rd party libs
 option(LLAMA_ACCELERATE             "llama: enable Accelerate framework" ON)
-option(LLAMA_OPENBLAS               "llama: use OpenBLAS" OFF)
+option(LLAMA_BLAS                   "llama: use BLAS" OFF)
+option(LLAMA_BLAS_VENDOR            "llama: BLA_VENDOR from https://cmake.org/cmake/help/latest/module/FindBLAS.html#blas-lapack-vendors" Generic)
 option(LLAMA_CUBLAS                 "llama: use cuBLAS" OFF)
 option(LLAMA_CLBLAST                "llama: use CLBlast" OFF)

@@ -145,36 +146,28 @@ if (APPLE AND LLAMA_ACCELERATE)
     endif()
 endif()

-if (LLAMA_OPENBLAS)
+if (LLAMA_BLAS)
     if (LLAMA_STATIC)
         set(BLA_STATIC ON)
     endif()
-
-    set(BLA_VENDOR OpenBLAS)
+    if (${CMAKE_VERSION} VERSION_GREATER_EQUAL 3.22)
+        set(BLA_SIZEOF_INTEGER 8)
+    endif()
+    set(BLA_VENDOR ${LLAMA_BLAS_VENDOR})
     find_package(BLAS)
     if (BLAS_FOUND)
-        message(STATUS "OpenBLAS found")
+        message(STATUS "BLAS found, Libraries: ${BLAS_LIBRARIES}")

+        add_compile_options(${BLAS_LINKER_FLAGS})
         add_compile_definitions(GGML_USE_OPENBLAS)
-        add_link_options(${BLAS_LIBRARIES})
-        set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} openblas)
-
-        # find header file
-        set(OPENBLAS_INCLUDE_SEARCH_PATHS
-            /usr/include
-            /usr/include/openblas
-            /usr/include/openblas-base
-            /usr/local/include
-            /usr/local/include/openblas
-            /usr/local/include/openblas-base
-            /opt/OpenBLAS/include
-            $ENV{OpenBLAS_HOME}
-            $ENV{OpenBLAS_HOME}/include
-            )
-        find_path(OPENBLAS_INC NAMES cblas.h PATHS ${OPENBLAS_INCLUDE_SEARCH_PATHS})
-        add_compile_options(-I${OPENBLAS_INC})
+        set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} ${BLAS_LIBRARIES})
+
+        message("${BLAS_LIBRARIES} ${BLAS_INCLUDE_DIRS}")
+        include_directories(${BLAS_INCLUDE_DIRS})
     else()
-        message(WARNING "OpenBLAS not found")
+        message(WARNING "BLAS not found, please refer to "
+                        "https://cmake.org/cmake/help/latest/module/FindBLAS.html#blas-lapack-vendors"
+                        " to set correct LLAMA_BLAS_VENDOR")
     endif()
 endif()
```

Makefile (+4)

```diff
@@ -122,6 +122,10 @@ ifdef LLAMA_OPENBLAS
 	LDFLAGS += -lopenblas
 endif
 endif
+ifdef LLAMA_BLIS
+	CFLAGS  += -DGGML_USE_OPENBLAS -I/usr/local/include/blis -I/usr/include/blis
+	LDFLAGS += -lblis -L/usr/local/lib
+endif
 ifdef LLAMA_CUBLAS
 	CFLAGS    += -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I$(CUDA_PATH)/targets/x86_64-linux/include
 	CXXFLAGS  += -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I$(CUDA_PATH)/targets/x86_64-linux/include
```

README.md (+17 −2)

```diff
@@ -56,7 +56,7 @@ The main goal of `llama.cpp` is to run the LLaMA model using 4-bit integer quant
 - Mixed F16 / F32 precision
 - 4-bit, 5-bit and 8-bit integer quantization support
 - Runs on the CPU
-- OpenBLAS support
+- Supports OpenBLAS/Apple BLAS/ARM Performance Lib/ATLAS/BLIS/Intel MKL/NVHPC/ACML/SCSL/SGIMATH and [more](https://cmake.org/cmake/help/latest/module/FindBLAS.html#blas-lapack-vendors) in BLAS
 - cuBLAS and CLBlast support

 The original implementation of `llama.cpp` was [hacked in an evening](https://github.com/ggerganov/llama.cpp/issues/33#issuecomment-1465108022).
```
````diff
@@ -274,10 +274,25 @@ Building the program with BLAS support may lead to some performance improvements
   ```bash
   mkdir build
   cd build
-  cmake .. -DLLAMA_OPENBLAS=ON
+  cmake .. -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS
   cmake --build . --config Release
   ```

+- BLIS
+
+  Check [BLIS.md](BLIS.md) for more information.
+
+- Intel MKL
+
+  By default, `LLAMA_BLAS_VENDOR` is set to `Generic`, so if you have already sourced the Intel environment script and pass `-DLLAMA_BLAS=ON` to CMake, the MKL implementation of BLAS is selected automatically. You may also specify it explicitly:
+
+  ```bash
+  mkdir build
+  cd build
+  cmake .. -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
+  cmake --build . --config Release
+  ```
+
 - cuBLAS

   This provides BLAS acceleration using the CUDA cores of your Nvidia GPU. Make sure to have the CUDA toolkit installed. You can download it from your Linux distro's package manager or from here: [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads).
````
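For the Intel MKL bullet in the README diff above, "sourcing the Intel environment script" typically means running oneAPI's `setvars.sh` before configuring. As a hedged sketch (the install path is the common oneAPI default and may differ on your system):

```bash
# Assumed default oneAPI location; adjust the path for your install.
source /opt/intel/oneapi/setvars.sh

# With the environment sourced, FindBLAS can locate MKL even with the
# Generic vendor; specifying Intel10_64lp makes the choice explicit.
mkdir build && cd build
cmake .. -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp
cmake --build . --config Release
```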
