```
Note: Because llama.cpp uses multiple CUDA streams for matrix multiplication, results [are not guaranteed to be reproducible](https://docs.nvidia.com/cuda/cublas/index.html#results-reproducibility). If you need reproducibility, set `GGML_CUDA_MAX_STREAMS` in the file `ggml-cuda.cu` to 1.
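The change above is a one-line edit; a minimal sketch, assuming `GGML_CUDA_MAX_STREAMS` is defined as a preprocessor constant in `ggml-cuda.cu`:

```c
// ggml-cuda.cu: allow only a single CUDA stream for matrix
// multiplication so cuBLAS results are reproducible across runs
// (at some cost in throughput).
#define GGML_CUDA_MAX_STREAMS 1
```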
The environment variable [`CUDA_VISIBLE_DEVICES`](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars) can be used to specify which GPU(s) will be used.
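For example, to expose only the first and third GPU to the process (the `./main` binary and model path below are placeholders for your own build and model):

```shell
# Devices 0 and 2 are visible to llama.cpp; inside the process
# they are renumbered as devices 0 and 1.
CUDA_VISIBLE_DEVICES=0,2 ./main -m models/7B/ggml-model-q4_0.gguf -p "Hello"
```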
- **CLBlast**
OpenCL acceleration is provided by the matrix multiplication kernels from the [CLBlast](https://github.com/CNugteren/CLBlast) project and custom kernels for ggml that can generate tokens on the GPU.
cmake --install . --prefix /some/path
```
Where `/some/path` is the directory in which the built library will be installed (default is `/usr/local`).
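If CLBlast was installed under a non-standard prefix, one way to let llama.cpp's CMake build locate it is the standard `CMAKE_PREFIX_PATH` variable (a sketch; any project-specific options for enabling CLBlast are not shown here):

```shell
# Tell CMake to also search /some/path when locating the
# CLBlast package during configuration.
cmake .. -DCMAKE_PREFIX_PATH=/some/path
```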