
Commit 01b7ccd

fix(config): make tokenizer optional and include a troubleshooting doc (#1998)
* docs: add troubleshooting
* fix: pass HF token to setup script and prevent downloading the tokenizer when it is empty
* fix: improve log and disable specific tokenizer by default
* chore: change the HF_TOKEN environment variable to align with the default config
* fix: mypy
1 parent 15f73db commit 01b7ccd

File tree

* fern/docs.yml (+2)
* fern/docs/pages/installation/installation.mdx (+2)
* fern/docs/pages/installation/troubleshooting.mdx (+44, new file)
* private_gpt/components/llm/llm_component.py (+4 -4)
* scripts/setup (+10 -6)
* settings.yaml (+3 -2)

6 files changed: +65 -12 lines changed

fern/docs.yml

+2
```diff
@@ -41,6 +41,8 @@ navigation:
             path: ./docs/pages/installation/concepts.mdx
           - page: Installation
             path: ./docs/pages/installation/installation.mdx
+          - page: Troubleshooting
+            path: ./docs/pages/installation/troubleshooting.mdx
     # Manual of privateGPT: how to use it and configure it
     - tab: manual
       layout:
```

fern/docs/pages/installation/installation.mdx

+2
````diff
@@ -81,6 +81,8 @@ set PGPT_PROFILES=ollama
 make run
 ```
 
+Refer to the [troubleshooting](./troubleshooting) section for specific issues you might encounter.
+
 ### Local, Ollama-powered setup - RECOMMENDED
 
 **The easiest way to run PrivateGPT fully locally** is to depend on Ollama for the LLM. Ollama provides local LLM and Embeddings super easy to install and use, abstracting the complexity of GPU support. It's the recommended setup for local development.
````
fern/docs/pages/installation/troubleshooting.mdx

+44 (new file)
# Downloading Gated and Private Models

Many models are gated or private, requiring special access to use them. Follow these steps to gain access and set up your environment for using these models.

## Accessing Gated Models

1. **Request Access:**
   Follow the instructions provided [here](https://huggingface.co/docs/hub/en/models-gated) to request access to the gated model.

2. **Generate a Token:**
   Once you have access, generate a token by following the instructions [here](https://huggingface.co/docs/hub/en/security-tokens).

3. **Set the Token:**
   Add the generated token to your `settings.yaml` file:

   ```yaml
   huggingface:
     access_token: <your-token>
   ```

   Alternatively, set the `HF_TOKEN` environment variable:

   ```bash
   export HF_TOKEN=<your-token>
   ```

# Tokenizer Setup

PrivateGPT uses the `AutoTokenizer` library to tokenize input text accurately. It connects to HuggingFace's API to download the appropriate tokenizer for the specified model.

## Configuring the Tokenizer

1. **Specify the Model:**
   In your `settings.yaml` file, specify the model you want to use:

   ```yaml
   llm:
     tokenizer: mistralai/Mistral-7B-Instruct-v0.2
   ```

2. **Set Access Token for Gated Models:**
   If you are using a gated model, ensure the `access_token` is set as mentioned in the previous section.

This configuration ensures that PrivateGPT can download and use the correct tokenizer for the model you are working with.
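To make the Tokenizer Setup section above concrete, here is a minimal sketch of the download it describes, assuming the `transformers` package and a token read from `HF_TOKEN`; the model name and cache directory are illustrative values, not ones the doc prescribes:

```python
import os

from transformers import AutoTokenizer

# Illustrative values; substitute your own model and cache location.
tokenizer_name = "mistralai/Mistral-7B-Instruct-v0.2"
cache_dir = "models/cache"

# Gated repos reject anonymous downloads, so pass the token explicitly.
token = os.environ.get("HF_TOKEN") or None

tokenizer = AutoTokenizer.from_pretrained(
    pretrained_model_name_or_path=tokenizer_name,
    cache_dir=cache_dir,
    token=token,
)
print(tokenizer.tokenize("Hello, PrivateGPT!"))
```

If the repository is gated and the token is missing or lacks access, this call raises, which is exactly the failure path the improved warning in `llm_component.py` below reports.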

private_gpt/components/llm/llm_component.py

+4-4
```diff
@@ -35,10 +35,10 @@ def __init__(self, settings: Settings) -> None:
             )
         except Exception as e:
             logger.warning(
-                "Failed to download tokenizer %s. Falling back to "
-                "default tokenizer.",
-                settings.llm.tokenizer,
-                e,
+                f"Failed to download tokenizer {settings.llm.tokenizer}: {e!s}"
+                f"Please follow the instructions in the documentation to download it if needed: "
+                f"https://docs.privategpt.dev/installation/getting-started/troubleshooting#tokenizer-setup."
+                f"Falling back to default tokenizer."
             )
 
         logger.info("Initializing the LLM in mode=%s", llm_mode)
```
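The hunk above only touches the warning text. For orientation, the surrounding logic is roughly the shape sketched below; this is an approximation, not the verbatim file, and the guard on `settings.llm.tokenizer` plus the `set_global_tokenizer` registration through llama-index are assumptions drawn from the commit description:

```python
import logging

from llama_index.core.utils import set_global_tokenizer  # assumed registration hook
from transformers import AutoTokenizer

logger = logging.getLogger(__name__)


def configure_tokenizer(settings, models_cache_path) -> None:
    # If no tokenizer is configured, keep llama-index's built-in default.
    if not settings.llm.tokenizer:
        return
    try:
        set_global_tokenizer(
            AutoTokenizer.from_pretrained(
                pretrained_model_name_or_path=settings.llm.tokenizer,
                cache_dir=str(models_cache_path),
                token=settings.huggingface.access_token,
            )
        )
    except Exception as e:
        # Same spirit as the new warning above: log, point at the
        # troubleshooting doc, and fall back instead of failing startup.
        logger.warning(
            "Failed to download tokenizer %s: %s. Falling back to default tokenizer.",
            settings.llm.tokenizer,
            e,
        )
```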

scripts/setup

+10-6
```diff
@@ -24,6 +24,7 @@ snapshot_download(
     repo_id=settings().huggingface.embedding_hf_model_name,
     cache_dir=models_cache_path,
     local_dir=embedding_path,
+    token=settings().huggingface.access_token,
 )
 print("Embedding model downloaded!")
 
@@ -35,15 +36,18 @@ hf_hub_download(
     cache_dir=models_cache_path,
     local_dir=models_path,
     resume_download=resume_download,
+    token=settings().huggingface.access_token,
 )
 print("LLM model downloaded!")
 
 # Download Tokenizer
-print(f"Downloading tokenizer {settings().llm.tokenizer}")
-AutoTokenizer.from_pretrained(
-    pretrained_model_name_or_path=settings().llm.tokenizer,
-    cache_dir=models_cache_path,
-)
-print("Tokenizer downloaded!")
+if settings().llm.tokenizer:
+    print(f"Downloading tokenizer {settings().llm.tokenizer}")
+    AutoTokenizer.from_pretrained(
+        pretrained_model_name_or_path=settings().llm.tokenizer,
+        cache_dir=models_cache_path,
+        token=settings().huggingface.access_token,
+    )
+    print("Tokenizer downloaded!")
 
 print("Setup done")
```

settings.yaml

+3-2
```diff
@@ -40,7 +40,8 @@ llm:
   # Should be matching the selected model
   max_new_tokens: 512
   context_window: 3900
-  tokenizer: mistralai/Mistral-7B-Instruct-v0.2
+  # Select your tokenizer. Llama-index tokenizer is the default.
+  # tokenizer: mistralai/Mistral-7B-Instruct-v0.2
   temperature: 0.1 # The temperature of the model. Increasing the temperature will make the model answer more creatively. A value of 0.1 would be more factual. (Default: 0.1)
 
 rag:
@@ -76,7 +77,7 @@ embedding:
 
 huggingface:
   embedding_hf_model_name: BAAI/bge-small-en-v1.5
-  access_token: ${HUGGINGFACE_TOKEN:}
+  access_token: ${HF_TOKEN:}
 
 vectorstore:
   database: qdrant
```
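The `${HF_TOKEN:}` value is an environment-variable placeholder with an empty default, so an unset variable resolves to an empty `access_token` instead of an error. A rough illustration of how that kind of placeholder expands, not PrivateGPT's actual settings loader:

```python
import os
import re

# Matches ${VAR:default} placeholders such as ${HF_TOKEN:} (empty default).
_PLACEHOLDER = re.compile(r"\$\{(\w+):([^}]*)\}")


def expand_placeholders(text: str) -> str:
    """Replace ${VAR:default} with the environment value, or the default if unset."""
    return _PLACEHOLDER.sub(lambda m: os.environ.get(m.group(1), m.group(2)), text)


print(expand_placeholders("access_token: ${HF_TOKEN:}"))
# With HF_TOKEN unset this prints "access_token: " (an empty token,
# i.e. anonymous access to the Hub).
```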
