Updating build instructions to include BLAS support #1183

Merged
merged 10 commits on Apr 26, 2023

Conversation

@daniandtheweb (Contributor) commented Apr 25, 2023

This pull request adds clear instructions about how to build llama.cpp on every platform with and without BLAS support.
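For reference, a minimal sketch of the kind of commands the new instructions cover (the `LLAMA_OPENBLAS` flag name is the Makefile option used at the time; treat this as an outline rather than the exact README text):

```sh
# Plain build, no BLAS
make

# Build with OpenBLAS-accelerated prompt processing
make LLAMA_OPENBLAS=1
```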

@slaren (Member) commented Apr 25, 2023

Looks good. Could we also include a brief notice explaining that BLAS is only used when the batch size is at least 32, making clear that it only benefits prompt processing, not generation?

For cuBLAS, it may also be useful to point out that the CUDA toolkit can be obtained from https://developer.nvidia.com/cuda-downloads
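To illustrate the batch-size point: prompt processing only goes through BLAS when the batch holds at least 32 tokens, and the batch size is set with the `-b`/`--batch-size` option of the examples (a sketch; the model path is just a placeholder):

```sh
# Prompt evaluation can use BLAS here (batch size 512 >= 32);
# token-by-token generation afterwards does not.
./main -m ./models/7B/ggml-model-q4_0.bin -b 512 -p "Some long prompt ..."
```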

@daniandtheweb (Contributor Author)

Sure.
After installing the CUDA toolkit, can Windows users build without any problems? I don't have an Nvidia GPU, so I can't test it.

@slaren (Member) commented Apr 25, 2023

They also need Visual Studio Community, but yes, they should be able to build with the same cmake command line.
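For reference, that CMake invocation would look roughly like this on Windows once the CUDA toolkit and Visual Studio are installed (a sketch, not the verbatim README text; `LLAMA_CUBLAS` is the CMake option used at the time):

```sh
mkdir build
cd build
cmake .. -DLLAMA_CUBLAS=ON
cmake --build . --config Release
```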

Adding a clearer BLAS explanation and adding a link to download the CUDA toolkit.
@slaren (Member) commented Apr 25, 2023

Great! Just one more thing: I suppose it is worth mentioning that on Macs, BLAS is already supported by default through the Accelerate framework.
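In other words, on macOS nothing extra is needed; a plain build already links BLAS through Accelerate (a sketch; `LLAMA_NO_ACCELERATE` is assumed here to be the Makefile opt-out switch of that time):

```sh
# macOS: Accelerate (and therefore BLAS) is enabled by default
make

# Assumed opt-out switch if Accelerate should be disabled
make LLAMA_NO_ACCELERATE=1
```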

@daniandtheweb (Contributor Author) commented Apr 25, 2023

Ok, I'll just add that info then.

Specifying that BLAS is already supported on Macs using the Accelerate Framework.
@daniandtheweb marked this pull request as ready for review on April 25, 2023, 23:33
@SlyEcho (Collaborator) commented Apr 26, 2023

> BLAS is only used when the batch size is at least 32

I don't think it's an issue any more; the default is now 512, is it not?

@slaren (Member) commented Apr 26, 2023

The batch still has to be at least 32 tokens; BLAS won't be used with smaller prompts.

@SlyEcho (Collaborator) commented Apr 26, 2023

BTW, @daniandtheweb, yesterday I gave some messy instructions on how to build llama.cpp with OpenBLAS on Windows: #1153

@daniandtheweb (Contributor Author) commented Apr 26, 2023

@SlyEcho Writing the Windows build instructions clearly seems to add quite a lot of space to the README. If that is not a problem, I can just add all the needed instructions.

@daniandtheweb (Contributor Author)

@SlyEcho I'm finishing a revised version of the build instructions to include Make on Windows. Is there any reason why you recommended the Fortran version of w64devkit?

@SlyEcho (Collaborator) commented Apr 26, 2023

It's only needed when you want to build OpenBLAS yourself, although that may not be necessary. Why would you want to build it yourself? Because then the library is smaller, as it is optimized only for your machine.

@daniandtheweb (Contributor Author) commented Apr 26, 2023

So is it OK to link the Fortran version for the normal build as well, or would it be better to link the vanilla version for that?

@SlyEcho (Collaborator) commented Apr 26, 2023

It doesn't really hurt anything. The reason I recommend w64devkit is that you just download and extract it and it's ready to use. No need to install anything, nothing to clean up; just delete it when you don't need it anymore.
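Roughly, the w64devkit workflow for an OpenBLAS build looks like this (directory names below are illustrative placeholders, not exact paths from #1153):

```sh
# Download and extract w64devkit, then start its shell (w64devkit.exe).
# Copy the prebuilt OpenBLAS headers and static library into the toolchain
# so the compiler and linker can find them (paths are placeholders):
cp OpenBLAS/include/*.h       w64devkit/x86_64-w64-mingw32/include/
cp OpenBLAS/lib/libopenblas.a w64devkit/x86_64-w64-mingw32/lib/

# Then, from the llama.cpp directory inside the w64devkit shell:
make LLAMA_OPENBLAS=1
```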

Added the instructions to build with Make on Windows
@daniandtheweb (Contributor Author)

I think it should be quite clear like this.

@slaren (Member) commented Apr 26, 2023

Please fix the failing EditorConfig check; there are some lines with trailing whitespace.
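For reference, one quick way to find those lines locally before pushing (a generic check, not a project-specific script; the base branch name is assumed to be master):

```sh
# Warn about trailing whitespace introduced relative to the base branch
git diff --check master...HEAD

# Or simply list lines ending in spaces in the edited file
grep -nE ' +$' README.md
```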
