Updating build instructions to include BLAS support #1183

Merged
merged 10 commits on Apr 26, 2023

Conversation

@daniandtheweb (Contributor) commented Apr 25, 2023

This pull request adds clear instructions about how to build llama.cpp on every platform with and without BLAS support.
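For reference, a minimal sketch of the kind of commands the new instructions cover (the `LLAMA_OPENBLAS` flag name is the Makefile option used at the time; treat this as an outline rather than the exact README text):

```sh
# Plain build, no BLAS
make

# Build with OpenBLAS-accelerated prompt processing
make LLAMA_OPENBLAS=1
```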

@slaren (Member) commented Apr 25, 2023

Looks good. Could we also include a brief notice explaining that BLAS is only used when the batch size is at least 32, making clear that it only benefits prompt processing, not generation?

For cuBLAS, it may also be useful to point out that the CUDA toolkit can be obtained from https://developer.nvidia.com/cuda-downloads
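To illustrate the batch-size point: prompt processing only goes through BLAS when the batch holds at least 32 tokens, and the batch size is set with the `-b`/`--batch-size` option of the examples (a sketch; the model path is just a placeholder):

```sh
# Prompt evaluation can use BLAS here (batch size 512 >= 32);
# token-by-token generation afterwards does not.
./main -m ./models/7B/ggml-model-q4_0.bin -b 512 -p "Some long prompt ..."
```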

@daniandtheweb (Contributor Author)

Sure.
After installing the CUDA toolkit, can Windows users build without any problems? I don't have an Nvidia GPU, so I can't test it.

@slaren (Member) commented Apr 25, 2023

They also need Visual Studio Community, but yes, they should be able to build with the same cmake command line.
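For reference, that CMake invocation would look roughly like this on Windows once the CUDA toolkit and Visual Studio are installed (a sketch, not the verbatim README text; `LLAMA_CUBLAS` is the CMake option used at the time):

```sh
mkdir build
cd build
cmake .. -DLLAMA_CUBLAS=ON
cmake --build . --config Release
```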

Adding a clearer BLAS explanation and adding a link to download the CUDA toolkit.
@slaren (Member) commented Apr 25, 2023

Great! Just one more thing: I suppose it is worth mentioning that on Macs, BLAS is already supported by default through the Accelerate framework.
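In other words, on macOS nothing extra is needed; a plain build already links BLAS through Accelerate (a sketch; `LLAMA_NO_ACCELERATE` is assumed here to be the Makefile opt-out switch of that time):

```sh
# macOS: Accelerate (and therefore BLAS) is enabled by default
make

# Assumed opt-out switch if Accelerate should be disabled
make LLAMA_NO_ACCELERATE=1
```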

@daniandtheweb (Contributor Author) commented Apr 25, 2023

Ok, I'll just add that info then.

Specifying that BLAS is already supported on Macs using the Accelerate Framework.
@daniandtheweb marked this pull request as ready for review on April 25, 2023, 23:33
@SlyEcho (Collaborator) commented Apr 26, 2023

> BLAS is only used when the batch size is at least 32

I don't think it's an issue any more; the default is now 512, is it not?

@slaren (Member) commented Apr 26, 2023

The batch still has to be at least 32 tokens; BLAS won't be used with smaller prompts.

@SlyEcho (Collaborator) commented Apr 26, 2023

BTW, @daniandtheweb, yesterday I gave some messy instructions on how to build llama.cpp with OpenBLAS on Windows: #1153

@daniandtheweb (Contributor Author) commented Apr 26, 2023

@SlyEcho Writing the Windows build instructions clearly seems to add quite a lot of space to the README. If that is not a problem, I can just add all the needed instructions.

@daniandtheweb (Contributor Author)

@SlyEcho I'm finishing a revised version of the build instructions to include Make on Windows. Is there any reason why you recommended the Fortran version of w64devkit?

@SlyEcho (Collaborator) commented Apr 26, 2023

It's only needed when you want to build OpenBLAS yourself, although that may not be necessary. Why would you want to build it yourself? Because then the library is smaller, as it is optimized only for your machine.

@daniandtheweb (Contributor Author) commented Apr 26, 2023

So is it OK to link the Fortran version for the normal build as well, or would it be better to link the vanilla version for that?

@SlyEcho (Collaborator) commented Apr 26, 2023

It doesn't really hurt anything. The reason I recommend w64devkit is that you just download and extract it and it's ready to use. No need to install anything, nothing to clean up; just delete it when you don't need it anymore.
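Roughly, the w64devkit workflow for an OpenBLAS build looks like this (directory names below are illustrative placeholders, not exact paths from #1153):

```sh
# Download and extract w64devkit, then start its shell (w64devkit.exe).
# Copy the prebuilt OpenBLAS headers and static library into the toolchain
# so the compiler and linker can find them (paths are placeholders):
cp OpenBLAS/include/*.h       w64devkit/x86_64-w64-mingw32/include/
cp OpenBLAS/lib/libopenblas.a w64devkit/x86_64-w64-mingw32/lib/

# Then, from the llama.cpp directory inside the w64devkit shell:
make LLAMA_OPENBLAS=1
```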

Added the instructions to build with Make on Windows
@daniandtheweb (Contributor Author)

I think it should be quite clear like this.

@slaren (Member) commented Apr 26, 2023

Please fix the failing EditorConfig check; there are some lines with trailing whitespace.
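For reference, one quick way to find those lines locally before pushing (a generic check, not a project-specific script; the base branch name is assumed to be master):

```sh
# Warn about trailing whitespace introduced relative to the base branch
git diff --check master...HEAD

# Or simply list lines ending in spaces in the edited file
grep -nE ' +$' README.md
```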
