CUDA backend #2310
Merged
Conversation
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Kompute is no longer working.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
General:
- Proper implementation of gpuDeviceName()
- Make usingGPUDevice() consistent with Kompute impl
- Disable multi-GPU when selecting a specific device (currently: always)

For the bindings:
- Abort instead of segfaulting if multiple LLMs are loaded
- Implement GPU device selection by name/vendor

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
cebtenzzre force-pushed the add-cuda-support branch from c67f868 to 1a10587 on May 7, 2024 at 14:54
cebtenzzre force-pushed the add-cuda-support branch from 9474231 to 52370d9 on May 7, 2024 at 21:20
cebtenzzre force-pushed the add-cuda-support branch from 52370d9 to 2417105 on May 7, 2024 at 21:27
This file is part of the graphics driver and should not be bundled with GPT4All.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
cebtenzzre force-pushed the add-cuda-support branch from ea6e118 to 9e8f7c3 on May 7, 2024 at 22:31
llama.cpp itself is unconditionally built as a static library. Installing it with the GUI is pointless.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Can you create an offline installer with this change for Linux for testing, please?
manyoso reviewed on May 15, 2024
manyoso reviewed on May 15, 2024
manyoso requested changes on May 15, 2024
This will be a big release, so increment the minor version.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
manyoso approved these changes on May 15, 2024
cebtenzzre added a commit that referenced this pull request on May 15, 2024
I don't understand why this is needed, but it works.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
cebtenzzre added a commit that referenced this pull request on May 15, 2024
manyoso pushed a commit that referenced this pull request on May 15, 2024
cebtenzzre added a commit that referenced this pull request on May 15, 2024
cebtenzzre added a commit that referenced this pull request on May 21, 2024
This matters now that #2310 removed the default of "Release" in llama.cpp.cmake.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
cebtenzzre added a commit that referenced this pull request on May 28, 2024
n_ubatch defaults to 512, but as of the latest llama.cpp you cannot pass more than n_ubatch tokens to the embedding model without hitting an assertion failure.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
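Given that constraint, callers have to split embedding input so that no single call receives more than n_ubatch tokens. A minimal chunking sketch under that assumption (hypothetical helper working on a plain token list, not actual llama.cpp or GPT4All code):

```python
def chunk_tokens(tokens: list[int], n_ubatch: int = 512) -> list[list[int]]:
    """Split a token sequence into pieces of at most n_ubatch tokens each,
    so that every piece stays under the llama.cpp embedding limit
    described above. Purely illustrative."""
    if n_ubatch <= 0:
        raise ValueError("n_ubatch must be positive")
    return [tokens[i:i + n_ubatch] for i in range(0, len(tokens), n_ubatch)]
```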
cebtenzzre added a commit that referenced this pull request on May 29, 2024
This PR adds opt-in CUDA support in the GPT4All UI and python bindings using the llama.cpp CUDA backend.
CUDA-enabled devices will appear as e.g. "CUDA: Tesla P40" on supported platforms, alongside their Vulkan counterparts. When one is selected, the CUDA backend will be used instead of the Kompute backend.
When CUDA is not available (e.g. a compatible driver or GPU is not installed), CUDA devices simply do not appear. OOM will cause fallback to CPU, just like with Kompute.
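The backend-prefixed device list described above can be illustrated with a small sketch. The helper name and the "Vulkan:" prefix are assumptions for illustration; only the "CUDA:" prefix is stated in this PR. Note that when the CUDA runtime is unavailable, the CUDA name list is simply empty, so no CUDA entries appear:

```python
def list_gpu_devices(cuda_names: list[str], vulkan_names: list[str]) -> list[str]:
    """Combine CUDA and Vulkan devices into one display list, prefixing
    each entry with its backend (e.g. "CUDA: Tesla P40"). When CUDA is
    unavailable, cuda_names is empty and only Vulkan entries remain."""
    devices = ["CUDA: " + n for n in cuda_names]
    devices += ["Vulkan: " + n for n in vulkan_names]  # prefix is assumed
    return devices
```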
The CUDA runtime libraries (cudart and cublas) are installed to the lib/ directory on Linux and Windows. Care is taken to make sure the driver component of CUDA, nvcuda.dll/libcuda.so, is not installed, as it ships with the graphics driver.
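A rough sketch of that bundling rule (hypothetical helper; the real logic lives in the build's install rules): runtime components such as cudart and cublas are bundled, while the driver component must never be:

```python
def should_bundle(filename: str) -> bool:
    """Decide whether a CUDA library belongs in the installer's lib/ dir.
    Runtime components (cudart, cublas) are bundled; the driver component
    (nvcuda.dll / libcuda.so) is not, since it ships with the graphics
    driver. Hypothetical helper for illustration."""
    name = filename.lower()
    # Driver component: never bundle, regardless of platform naming.
    if "nvcuda" in name or name.startswith("libcuda.so"):
        return False
    # Runtime components named in the PR description.
    return "cudart" in name or "cublas" in name
```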
Other changes:
While I was working on packaging the CUDA libraries, I cleaned up a lot of junk that was installed but not needed: