Improve Khoj First Run, Docker Setup and Documentation (#919)

## Improve
- Intelligently initialize a decent default set of chat model options
- Create non-interactive mode. Auto set default server configuration on first run via Docker

## Fix
- Make RapidOCR dependency optional as flaky requirements were causing Docker build failures
- Set default OpenAI text to image model correctly during initialization

## Details
Improve initialization flow during first run to remove need to configure Khoj:

- Set Google, Anthropic chat models too.
  Previously only Offline, OpenAI chat models could be set during init
- Add multiple chat models for each LLM provider.
  Interactively set a comma separated list of models for each provider
- Auto add default chat models for each provider in non-interactive mode if the `{OPENAI,GEMINI,ANTHROPIC}_API_KEY` env var is set
  - Used when server is run via Docker, as user input cannot be processed to configure server during first run
- Do not ask for `max_tokens`, `tokenizer` for offline models during initialization. Use better defaults inferred in code instead
- Explicitly set default chat model to use.
  If unset, it implicitly defaults to using the first chat model. Make it explicit to reduce this confusion

Resolves #882
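
The non-interactive behaviour described in the commit message can be sketched roughly as below. This is a minimal illustration, not the code from this commit: the function names and the model names in `DEFAULT_CHAT_MODELS` are hypothetical placeholders, and only the three environment variable names come from the message above.

```python
import os

# Hypothetical default models per provider API key env var.
# The model names are illustrative assumptions, not Khoj's actual defaults.
DEFAULT_CHAT_MODELS = {
    "OPENAI_API_KEY": ["gpt-4o-mini"],
    "GEMINI_API_KEY": ["gemini-1.5-flash"],
    "ANTHROPIC_API_KEY": ["claude-3-5-sonnet-20240620"],
}


def select_default_chat_models(env=None):
    """Return default chat models for every provider whose API key is set and non-empty."""
    env = os.environ if env is None else env
    return {
        key: list(models)
        for key, models in DEFAULT_CHAT_MODELS.items()
        if env.get(key)
    }


def pick_default_chat_model(selected):
    """Explicitly pick a default: the first model of the first configured provider."""
    for models in selected.values():
        if models:
            return models[0]
    return None  # No provider configured, so no default can be chosen.


if __name__ == "__main__":
    selected = select_default_chat_models()
    print("Providers configured:", list(selected))
    print("Default chat model:", pick_default_chat_model(selected))
```

Passing a plain dictionary instead of `os.environ` keeps the sketch easy to exercise without touching the real environment.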
khoj-ai · Sep 21, 2024 · f00e0e6
2 parents 0a56824 + a6c0b43
commit f00e0e6
Showing 15 changed files with 580 additions and 346 deletions.
diff --git a/docker-compose.yml b/docker-compose.yml
@@ -44,11 +44,20 @@ services:
       - KHOJ_DEBUG=False
       - KHOJ_ADMIN_EMAIL=username@example.com
       - KHOJ_ADMIN_PASSWORD=password
-      # Uncomment the following lines to make your instance publicly accessible.
-      # Replace the domain with your domain. Proceed with caution, especially if you are using anonymous mode.
+      # Uncomment lines below to use chat models by each provider.
+      # Ensure you set your provider specific API keys.
+      # ---
+      # - OPENAI_API_KEY=your_openai_api_key
+      # - GEMINI_API_KEY=your_gemini_api_key
+      # - ANTHROPIC_API_KEY=your_anthropic_api_key
+      # Uncomment the necessary lines below to make your instance publicly accessible.
+      # Replace the KHOJ_DOMAIN with either your domain or IP address (no http/https prefix).
+      # Proceed with caution, especially if you are using anonymous mode.
+      # ---
       # - KHOJ_NO_HTTPS=True
       # - KHOJ_DOMAIN=192.168.0.104
-    command: --host="0.0.0.0" --port=42110 -vv --anonymous-mode
+      # - KHOJ_DOMAIN=khoj.example.com
+    command: --host="0.0.0.0" --port=42110 -vv --anonymous-mode --non-interactive
 
 
 volumes:

diff --git a/documentation/assets/img/example_search_model_admin_settings.png b/documentation/assets/img/example_search_model_admin_settings.png
diff --git a/documentation/docs/advanced/admin.md b/documentation/docs/advanced/admin.md
@@ -0,0 +1,73 @@
+# Admin Panel
+> Describes the Khoj settings configurable via the admin panel
+
+## App Settings
+### Agents
+Add all the agents you want to use for your different use-cases like Writer, Researcher, Therapist etc.
+- `Personality`: This is a prompt to tell the chat model how to tune the personality of the agent.
+- `Chat model`: The chat model to use for the agent.
+- `Name`: The name of the agent. This field helps give the agent a unique identity across the app.
+- `Avatar`: URL to the agent's profile picture. It helps give the agent a unique visual identity across the app.
+- `Style color`, `Style icon`: These fields help give the agent a unique, visually identifiable identity across the app.
+- `Slug`: This is the agent name to use in urls.
+- `Public`: Check this if the agent is expected to be visible to all users on this Khoj server.
+- `Managed by admin`: Check this if the agent is managed by admin, not by any user.
+- `Creator`: The user who created the agent.
+- `Tools`: The list of tools available to this agent. Tools include notes, image, online. This field is not currently configurable and only supports all tools (i.e. `["*"]`).
+
+### Chat Model Options
+Add all the chat models you want to try, use and switch between for your different use-cases. For each chat model you add:
+- `Chat model`: The name of an [OpenAI](https://platform.openai.com/docs/models), [Anthropic](https://docs.anthropic.com/en/docs/about-claude/models#model-names), [Gemini](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-models) or [Offline](https://huggingface.co/models?pipeline_tag=text-generation&library=gguf) chat model.
+- `Model type`: The chat model provider like `OpenAI`, `Offline`.
+- `Vision enabled`: Set to `true` if your model supports vision. This is currently only supported for vision-capable OpenAI models like `gpt-4o`.
+- `Max prompt size`, `Subscribed max prompt size`: These are optional fields. They are used to truncate the context to the maximum context size that can be passed to the model. This can help with accuracy and cost-saving.<br />
+- `Tokenizer`: This is an optional field. It is used to accurately count tokens and truncate context passed to the chat model to stay within the model's max prompt size.
+  ![example configuration for chat model options](/img/example_chatmodel_option.png)
+
+### Server Chat Settings
+The server chat settings are used as:
+1. The default chat models for subscribed (`Advanced` field) and unsubscribed (`Default` field) users.
+2. The chat model for all intermediate steps like intent detection, web search etc. during chat response generation.
+
+If a server chat setting is not added, the first ChatModelOption in your config is used as the default chat model.
+
+To add a server chat setting:
+- Set your preferred default chat models in the `Default` fields of your [ServerChatSettings](http://localhost:42110/server/admin/database/serverchatsettings/)
+- The `Advanced` field doesn't need to be set when self-hosting. When unset, the `Default` chat model is used for all users and the intermediate steps.
+
+
+### OpenAI Processor Conversation Configs
+These settings configure chat model providers to be accessed over API.
+The name of this setting is kind of a misnomer, we know, it'll hopefully be changed at some point.
+For each chat model provider you [add](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add):
+- `Api key`: Set to your [OpenAI](https://platform.openai.com/api-keys), [Anthropic](https://console.anthropic.com/account/keys) or [Gemini](https://aistudio.google.com/app/apikey) API keys.
+- `Name`: Give the configuration any friendly name like `OpenAI`, `Gemini`, `Anthropic`.
+- `Api base url`: Set the API base URL. This is only relevant to set if you're using another OpenAI-compatible proxy server like [Ollama](/advanced/ollama) or [LMStudio](/advanced/lmstudio).
+  ![example configuration for openai processor](/img/example_openai_processor_config.png)
+
+### Search Model Configs
+Search models are used to generate vector embeddings of your documents for natural language search and chat. You can try and use any of the [embeddings models on HuggingFace](https://huggingface.co/models?pipeline_tag=sentence-similarity) to create these vector embeddings.
+
+<img src="/img/example_search_model_admin_settings.png" alt="Example Search Model Settings" style={{width: 500}} />
+
+### Text to Image Model Options
+Add text to image generation models with these settings. Khoj currently supports text to image models available via the OpenAI, Stability or Replicate API.
+- `api-key`: Set to your OpenAI, Stability or Replicate API key
+- `model`: Set the model name available over the selected model provider
+- `model-type`: Set to the appropriate model provider
+- `openai-config`: For image generation models available via an OpenAI (compatible) API, you can set the appropriate OpenAI Processor Conversation Settings instead of specifying the `api-key` field above
+
+### Speech to Text Model Options
+Add speech to text models with these settings. Khoj currently only supports the Whisper speech to text model, via the OpenAI API or Offline.
+
+### Voice Model Options
+Add text to speech models with these settings. Khoj currently supports models from [ElevenLabs](https://elevenlabs.io/).
+
+## User Data
+- Users, Entrys, Conversations, Subscriptions, Github configs, Notion configs, User search configs, User conversation configs, User voice configs
+
+## Miscellaneous Data
+- Process Locks: Persistent Locks for Automations
+- Client Applications:
+
+Client applications allow you to set up third-party applications that can query your Khoj server using a client application ID + secret. The secret goes in a bearer token.
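
As a rough illustration of the client application flow documented above, the sketch below builds a request that carries the secret as a bearer token. The documentation only states that the secret goes in a bearer token; passing the ID as a `client_id` query parameter, the `/api/chat` path and the `build_khoj_request` helper are all assumptions made for the example.

```python
import urllib.parse
import urllib.request


def build_khoj_request(base_url, path, client_id, client_secret, params=None):
    """Build (but do not send) a request authenticated as a client application.

    Assumption: the client ID travels as a `client_id` query parameter.
    The secret is sent in the Authorization header as a bearer token.
    """
    query = dict(params or {})
    query["client_id"] = client_id
    url = f"{base_url.rstrip('/')}{path}?{urllib.parse.urlencode(query)}"
    headers = {"Authorization": f"Bearer {client_secret}"}
    return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    # Hypothetical endpoint and credentials, for illustration only.
    request = build_khoj_request(
        "http://localhost:42110", "/api/chat", "my-app", "s3cret", {"q": "hello"}
    )
    print(request.full_url)
    print(request.get_header("Authorization"))
```

The request is only constructed here; sending it with `urllib.request.urlopen(request)` requires a running server and a client application registered in the admin panel.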
diff --git a/documentation/docs/advanced/authentication.md b/documentation/docs/advanced/authentication.md
@@ -7,7 +7,7 @@ This is only helpful for self-hosted users or teams. If you're using [Khoj Cloud
 By default, most of the instructions for self-hosting Khoj assume a single user, and so the default configuration is to run in anonymous mode. However, if you want to enable authentication, you can do so either with [Magic Links](#using-magic-links) or [Google OAuth](#using-google-oauth) as shown below. This can be helpful to make Khoj securely accessible to you and your team.
 
 :::tip[Note]
-Remove the `--anonymous-mode` flag in your start up command to enable authentication.
+Remove the `--anonymous-mode` flag from your khoj start up command or docker-compose file to enable authentication.
 :::
 
 ## Using Magic Links

diff --git a/documentation/docs/advanced/remote.md b/documentation/docs/advanced/remote.md
@@ -0,0 +1,20 @@
+# Remote Access
+
+By default, self-hosted Khoj is only accessible on the machine it is running on. To securely access it from a remote machine:
+- Set the `KHOJ_DOMAIN` environment variable to your remotely accessible IP or domain via shell or docker-compose.yml.
+  Examples: `KHOJ_DOMAIN=my.khoj-domain.com`, `KHOJ_DOMAIN=192.168.0.4`.
+- Ensure the Khoj Admin password and `KHOJ_DJANGO_SECRET_KEY` environment variable are securely set.
+- Set up [Authentication](/advanced/authentication).
+- Open access to the Khoj port (default: 42110) from your OS and Network firewall.
+
+:::warning[Use HTTPS certificate]
+To expose Khoj on a custom domain over the public internet, use of an SSL certificate is strongly recommended. You can use [Let's Encrypt](https://letsencrypt.org/) to get a free SSL certificate for your domain.
+
+To disable HTTPS, set the `KHOJ_NO_HTTPS` environment variable to `True`. This can be useful if Khoj is only accessible behind a secure, private network.
+:::
+
+:::info[Try Tailscale]
+You can use [Tailscale](https://tailscale.com/) for easy, secure access to your self-hosted Khoj over the network.
+1. Set `KHOJ_DOMAIN` to your machine's [tailscale ip](https://tailscale.com/kb/1452/connect-to-devices#identify-your-devices) or [fqdn on tailnet](https://tailscale.com/kb/1081/magicdns#fully-qualified-domain-names-vs-machine-names). E.g. `KHOJ_DOMAIN=100.4.2.0` or `KHOJ_DOMAIN=khoj.tailfe8c.ts.net`
+2. Access Khoj by opening `http://tailscale-ip-of-server:42110` or `http://fqdn-of-server:42110` from any device on your tailscale network
+:::