docs: add numpy issue to troubleshooting
jaluma committed Aug 7, 2024
1 parent ca2b8da commit 80830a4
Showing 2 changed files with 22 additions and 5 deletions.
10 changes: 6 additions & 4 deletions fern/docs/pages/installation/installation.mdx
@@ -307,11 +307,12 @@ If you have all required dependencies properly configured running the
following powershell command should succeed.

```powershell
-$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
+$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
```
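
One way to confirm that the `numpy==1.26.0` pin in the command above took effect is to compare the installed version against the pinned string from inside the Poetry environment. The sketch below is illustrative and not part of this commit: the helper names `installed_version` and `matches_pin` are hypothetical, and it uses only the standard-library `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

PINNED_NUMPY = "1.26.0"  # the version pinned in the install command above


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(installed: Optional[str], pinned: str = PINNED_NUMPY) -> bool:
    """True only when a version is installed and equals the pinned string exactly."""
    return installed is not None and installed == pinned


if __name__ == "__main__":
    found = installed_version("numpy")
    status = "matches" if matches_pin(found) else "does not match"
    print(f"numpy installed: {found}; {status} the pin {PINNED_NUMPY}")
```

Saved under a file name of your choice and run with `poetry run python <file>`, this reports whether the environment ended up with the pinned NumPy release.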

If your installation was correct, you should see a message similar to the following next
-time you start the server `BLAS = 1`.
+time you start the server `BLAS = 1`. If there is some issue, please refer to the
+[troubleshooting](#/installation/getting-started/troubleshooting#guide-for-building-llama-cpp-with-cuda-support) section.

```console
llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
@@ -339,11 +340,12 @@ Some tips:
After that running the following command in the repository will install llama.cpp with GPU support:

```bash
-CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
+CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
```

If your installation was correct, you should see a message similar to the following next
-time you start the server `BLAS = 1`.
+time you start the server `BLAS = 1`. If there is some issue, please refer to the
+[troubleshooting](#/installation/getting-started/troubleshooting#guide-for-building-llama-cpp-with-cuda-support) section.

```
llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
17 changes: 16 additions & 1 deletion fern/docs/pages/installation/troubleshooting.mdx
@@ -46,4 +46,19 @@ huggingface:
embedding:
embed_dim: 384
```
-</Callout>
+</Callout>

+# Building Llama-cpp with NVIDIA GPU support
+
+## Out-of-memory error
+
+If you encounter an out-of-memory error while running `llama-cpp` with CUDA, you can try the following steps to resolve the issue:
+1. **Set the following environment variable:**
+```bash
+export TOKENIZERS_PARALLELISM=true
+```
+2. **Run PrivateGPT:**
+```bash
+poetry run python -m private_gpt
+```
+Thanks to [MarioRossiGithub](https://github.com/MarioRossiGithub) for providing this solution.
