From 66d53f093750ede163fa4d95b3ddecfcb8fadd8a Mon Sep 17 00:00:00 2001 From: shelajev Date: Thu, 3 Jul 2025 16:11:43 +0300 Subject: [PATCH 1/4] docs: add documentation for Docker Model Runner provider MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit docs: add documentation for Docker Model Runner provider Signed-off-by: Oleg Šelajev --- .../docs/getting-started/providers.md | 89 ++++++++++++++++++- 1 file changed, 88 insertions(+), 1 deletion(-) diff --git a/documentation/docs/getting-started/providers.md b/documentation/docs/getting-started/providers.md index fd71cadc1132..a2752003cd16 100644 --- a/documentation/docs/getting-started/providers.md +++ b/documentation/docs/getting-started/providers.md @@ -24,6 +24,7 @@ Goose relies heavily on tool calling capabilities and currently works best with | [Anthropic](https://www.anthropic.com/) | Offers Claude, an advanced AI model for natural language tasks. | `ANTHROPIC_API_KEY`, `ANTHROPIC_HOST` (optional) | | [Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-services/openai/) | Access Azure-hosted OpenAI models, including GPT-4 and GPT-3.5. Supports both API key and Azure credential chain authentication. | `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_DEPLOYMENT_NAME`, `AZURE_OPENAI_API_KEY` (optional) | | [Databricks](https://www.databricks.com/) | Unified data analytics and AI platform for building and deploying models. | `DATABRICKS_HOST`, `DATABRICKS_TOKEN` | +| [Docker Model Runner](https://docs.docker.com/ai/model-runner/) | Local models running in Docker Desktop or Docker CE with OpenAI-compatible API endpoints. **Because this provider runs locally, you must first [download a model](/docs/getting-started/providers#docker).** | `OPENAI_HOST`, `OPENAI_BASE_PATH` | | [Gemini](https://ai.google.dev/gemini-api/docs) | Advanced LLMs by Google with multimodal capabilities (text, images). | `GOOGLE_API_KEY` | | [GCP Vertex AI](https://cloud.google.com/vertex-ai) | Google Cloud's Vertex AI platform, supporting Gemini and Claude models. **Credentials must be [configured in advance](https://cloud.google.com/vertex-ai/docs/authentication).** | `GCP_PROJECT_ID`, `GCP_LOCATION` and optional `GCP_MAX_RETRIES` (6), `GCP_INITIAL_RETRY_INTERVAL_MS` (5000), `GCP_BACKOFF_MULTIPLIER` (2.0), `GCP_MAX_RETRY_INTERVAL_MS` (320_000). | | [Groq](https://groq.com/) | High-performance inference hardware and tools for LLMs. | `GROQ_API_KEY` | @@ -292,7 +293,93 @@ To set up Google Gemini with Goose, follow these steps: ### Local LLMs -Ollama and Ramalama are both options to provide local LLMs, each which requires a bit more set up before you can use one of them with Goose. +Docker Model Runner, Ollama, and Ramalama are options to provide local LLMs, each which requires a bit more set up before you can use one of them with Goose. + +#### Docker + +1. [Get Docker](https://docs.docker.com/get-started/get-docker/) +2. [Enable Docker Model Runner](https://docs.docker.com/ai/model-runner/#enable-dmr-in-docker-desktop) +3. [Pull a model](https://docs.docker.com/ai/model-runner/#pull-a-model), for example, from Docker Hub [AI namespace](https://hub.docker.com/u/ai), [Unsloth](https://hub.docker.com/u/unsloth), or [from HuggingFace](https://www.docker.com/blog/docker-model-runner-on-hugging-face/) + +:::warning Limited Support for models without tool calling +Goose extensively uses tool calling, so models without it can only do chat completion. 
If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). +::: + +Example: + +```sh +docker model pull hf.co/unsloth/gemma-3n-e4b-it-gguf:q6_k +``` + +4. Configure Goose to use Docker Model Runner, using the OpenAI API compatible endpoint: + +```sh +goose configure +``` + +4. Choose to `Configure Providers` + +``` +┌ goose-configure +│ +◆ What would you like to configure? +│ ● Configure Providers (Change provider or update credentials) +│ ○ Toggle Extensions +│ ○ Add Extension +└ +``` + +5. Choose `OpenAI` as the model provider: + +``` +┌ goose-configure +│ +◇ What would you like to configure? +│ Configure Providers +│ +◆ Which model provider should we use? +│ ○ Anthropic +│ ○ Amazon Bedrock +│ ○ Claude Code +│ ● OpenAI (GPT-4 and other OpenAI models, including OpenAI compatible ones) +│ ○ OpenRouter +``` + +6. Configure Docker Model Runner endpoint as the `OPENAI_HOST`: +┌ goose-configure +│ +◇ What would you like to configure? +│ Configure Providers +│ +◇ Which model provider should we use? +│ OpenAI +│ +◆ Provider OpenAI requires OPENAI_HOST, please enter a value +│ https://api.openai.com (default) +└ + +The default value for the host-side port Docker Model Runner is 12434, so the `OPENAI_HOST` value could be: +`http://localhost:12434`. + +7. Configure the base path: + +``` +◆ Provider OpenAI requires OPENAI_BASE_PATH, please enter a value +│ v1/chat/completions (default) +└ +``` + +Docker model runner uses `/engines/llama.cpp/v1/chat/completions` for the base path. + +8. Finally configure the model available in Docker Model Runner to be used by Goose: `hf.co/unsloth/gemma-3n-e4b-it-gguf:q6_k` + +``` +│ +◇ Enter a model from that provider: +│ gpt-4o +│ +◒ Checking your configuration... └ Configuration saved successfully +``` #### Ollama From e6081218ef0f3dc095e889819390096a83ea2d82 Mon Sep 17 00:00:00 2001 From: angiejones Date: Sat, 26 Jul 2025 22:34:26 -0500 Subject: [PATCH 2/4] docs: maintainer review and updates for Docker Model Runner documentation --- .../docs/getting-started/providers.md | 352 +++++++++--------- 1 file changed, 184 insertions(+), 168 deletions(-) diff --git a/documentation/docs/getting-started/providers.md b/documentation/docs/getting-started/providers.md index a2752003cd16..493758996628 100644 --- a/documentation/docs/getting-started/providers.md +++ b/documentation/docs/getting-started/providers.md @@ -293,93 +293,13 @@ To set up Google Gemini with Goose, follow these steps: ### Local LLMs -Docker Model Runner, Ollama, and Ramalama are options to provide local LLMs, each which requires a bit more set up before you can use one of them with Goose. +Goose is a local AI agent, and by using a local LLM, you keep your data private, maintain full control over your environment, and can work entirely offline without relying on cloud access. -#### Docker +Here are some local providers we support: -1. [Get Docker](https://docs.docker.com/get-started/get-docker/) -2. [Enable Docker Model Runner](https://docs.docker.com/ai/model-runner/#enable-dmr-in-docker-desktop) -3. 
[Pull a model](https://docs.docker.com/ai/model-runner/#pull-a-model), for example, from Docker Hub [AI namespace](https://hub.docker.com/u/ai), [Unsloth](https://hub.docker.com/u/unsloth), or [from HuggingFace](https://www.docker.com/blog/docker-model-runner-on-hugging-face/) - -:::warning Limited Support for models without tool calling -Goose extensively uses tool calling, so models without it can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). -::: - -Example: - -```sh -docker model pull hf.co/unsloth/gemma-3n-e4b-it-gguf:q6_k -``` - -4. Configure Goose to use Docker Model Runner, using the OpenAI API compatible endpoint: - -```sh -goose configure -``` - -4. Choose to `Configure Providers` - -``` -┌ goose-configure -│ -◆ What would you like to configure? -│ ● Configure Providers (Change provider or update credentials) -│ ○ Toggle Extensions -│ ○ Add Extension -└ -``` - -5. Choose `OpenAI` as the model provider: - -``` -┌ goose-configure -│ -◇ What would you like to configure? -│ Configure Providers -│ -◆ Which model provider should we use? -│ ○ Anthropic -│ ○ Amazon Bedrock -│ ○ Claude Code -│ ● OpenAI (GPT-4 and other OpenAI models, including OpenAI compatible ones) -│ ○ OpenRouter -``` - -6. Configure Docker Model Runner endpoint as the `OPENAI_HOST`: -┌ goose-configure -│ -◇ What would you like to configure? -│ Configure Providers -│ -◇ Which model provider should we use? -│ OpenAI -│ -◆ Provider OpenAI requires OPENAI_HOST, please enter a value -│ https://api.openai.com (default) -└ - -The default value for the host-side port Docker Model Runner is 12434, so the `OPENAI_HOST` value could be: -`http://localhost:12434`. - -7. Configure the base path: - -``` -◆ Provider OpenAI requires OPENAI_BASE_PATH, please enter a value -│ v1/chat/completions (default) -└ -``` - -Docker model runner uses `/engines/llama.cpp/v1/chat/completions` for the base path. - -8. Finally configure the model available in Docker Model Runner to be used by Goose: `hf.co/unsloth/gemma-3n-e4b-it-gguf:q6_k` - -``` -│ -◇ Enter a model from that provider: -│ gpt-4o -│ -◒ Checking your configuration... └ Configuration saved successfully -``` +- [Ollama](#ollama) +- [Ramalama](#ramalama) +- [Docker Model Runner](#docker-model-runner) #### Ollama @@ -387,7 +307,7 @@ Docker model runner uses `/engines/llama.cpp/v1/chat/completions` for the base p 2. Run any [model supporting tool-calling](https://ollama.com/search?c=tools): :::warning Limited Support for models without tool calling -Goose extensively uses tool calling, so models without it (e.g. `DeepSeek-r1`) can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). As an alternative, you can use a [custom DeepSeek-r1 model](/docs/getting-started/providers#deepseek-r1) we've made specifically for Goose. +Goose extensively uses tool calling, so models without it can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). ::: Example: @@ -478,6 +398,111 @@ If you're running Ollama on a different server, you'll have to set `OLLAMA_HOST= └ Configuration saved successfully ``` +--- + +#### DeepSeek-R1 + +Ollama provides open source LLMs, such as `DeepSeek-r1`, that you can install and run locally. 
+Note that the native `DeepSeek-r1` model doesn't support tool calling, however, we have a [custom model](https://ollama.com/michaelneale/deepseek-r1-goose) you can use with Goose. + +:::warning +Note that this is a 70B model size and requires a powerful device to run smoothly. +::: + + +1. Download and install Ollama from [ollama.com](https://ollama.com/download). +2. In a terminal window, run the following command to install the custom DeepSeek-r1 model: + +```sh +ollama run michaelneale/deepseek-r1-goose +``` + + + + 3. Click `...` in the top-right corner. + 4. Navigate to `Advanced Settings` -> `Browse Models` -> and select `Ollama` from the list. + 5. Enter `michaelneale/deepseek-r1-goose` for the model name. + + + 3. In a separate terminal window, configure with Goose: + + ```sh + goose configure + ``` + + 4. Choose to `Configure Providers` + + ``` + ┌ goose-configure + │ + ◆ What would you like to configure? + │ ● Configure Providers (Change provider or update credentials) + │ ○ Toggle Extensions + │ ○ Add Extension + └ + ``` + + 5. Choose `Ollama` as the model provider + + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◆ Which model provider should we use? + │ ○ Anthropic + │ ○ Databricks + │ ○ Google Gemini + │ ○ Groq + │ ● Ollama (Local open source models) + │ ○ OpenAI + │ ○ OpenRouter + └ + ``` + + 5. Enter the host where your model is running + + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◇ Which model provider should we use? + │ Ollama + │ + ◆ Provider Ollama requires OLLAMA_HOST, please enter a value + │ http://localhost:11434 + └ + ``` + + 6. Enter the installed model from above + + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◇ Which model provider should we use? + │ Ollama + │ + ◇ Provider Ollama requires OLLAMA_HOST, please enter a value + │ http://localhost:11434 + │ + ◇ Enter a model from that provider: + │ michaelneale/deepseek-r1-goose + │ + ◇ Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together! + │ + └ Configuration saved successfully + ``` + + + +--- + #### Ramalama 1. [Download Ramalama](https://github.com/containers/ramalama?tab=readme-ov-file#install). @@ -574,106 +599,97 @@ For the Ollama provider, if you don't provide a host, we set it to `localhost:11 └ Configuration saved successfully ``` -### DeepSeek-R1 +--- -Ollama provides open source LLMs, such as `DeepSeek-r1`, that you can install and run locally. -Note that the native `DeepSeek-r1` model doesn't support tool calling, however, we have a [custom model](https://ollama.com/michaelneale/deepseek-r1-goose) you can use with Goose. +#### Docker Model Runner -:::warning -Note that this is a 70B model size and requires a powerful device to run smoothly. +1. [Get Docker](https://docs.docker.com/get-started/get-docker/) +2. [Enable Docker Model Runner](https://docs.docker.com/ai/model-runner/#enable-dmr-in-docker-desktop) +3. [Pull a model](https://docs.docker.com/ai/model-runner/#pull-a-model), for example, from Docker Hub [AI namespace](https://hub.docker.com/u/ai), [Unsloth](https://hub.docker.com/u/unsloth), or [from HuggingFace](https://www.docker.com/blog/docker-model-runner-on-hugging-face/) + +:::warning Limited Support for models without tool calling +Goose extensively uses tool calling, so models without it can only do chat completion. 
If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). ::: +Example: -1. Download and install Ollama from [ollama.com](https://ollama.com/download). -2. In a terminal window, run the following command to install the custom DeepSeek-r1 model: +```sh +docker model pull hf.co/unsloth/gemma-3n-e4b-it-gguf:q6_k +``` + +4. Configure Goose to use Docker Model Runner, using the OpenAI API compatible endpoint: ```sh -ollama run michaelneale/deepseek-r1-goose +goose configure ``` - - - 3. Click `...` in the top-right corner. - 4. Navigate to `Advanced Settings` -> `Browse Models` -> and select `Ollama` from the list. - 5. Enter `michaelneale/deepseek-r1-goose` for the model name. - - - 3. In a separate terminal window, configure with Goose: +4. Choose to `Configure Providers` - ```sh - goose configure - ``` +``` +┌ goose-configure +│ +◆ What would you like to configure? +│ ● Configure Providers (Change provider or update credentials) +│ ○ Toggle Extensions +│ ○ Add Extension +└ +``` - 4. Choose to `Configure Providers` +5. Choose `OpenAI` as the model provider: - ``` - ┌ goose-configure - │ - ◆ What would you like to configure? - │ ● Configure Providers (Change provider or update credentials) - │ ○ Toggle Extensions - │ ○ Add Extension - └ - ``` +``` +┌ goose-configure +│ +◇ What would you like to configure? +│ Configure Providers +│ +◆ Which model provider should we use? +│ ○ Anthropic +│ ○ Amazon Bedrock +│ ○ Claude Code +│ ● OpenAI (GPT-4 and other OpenAI models, including OpenAI compatible ones) +│ ○ OpenRouter +``` - 5. Choose `Ollama` as the model provider +6. Configure Docker Model Runner endpoint as the `OPENAI_HOST`: - ``` - ┌ goose-configure - │ - ◇ What would you like to configure? - │ Configure Providers - │ - ◆ Which model provider should we use? - │ ○ Anthropic - │ ○ Databricks - │ ○ Google Gemini - │ ○ Groq - │ ● Ollama (Local open source models) - │ ○ OpenAI - │ ○ OpenRouter - └ - ``` +``` +┌ goose-configure +│ +◇ What would you like to configure? +│ Configure Providers +│ +◇ Which model provider should we use? +│ OpenAI +│ +◆ Provider OpenAI requires OPENAI_HOST, please enter a value +│ https://api.openai.com (default) +└ +``` - 5. Enter the host where your model is running +The default value for the host-side port Docker Model Runner is 12434, so the `OPENAI_HOST` value could be: +`http://localhost:12434`. - ``` - ┌ goose-configure - │ - ◇ What would you like to configure? - │ Configure Providers - │ - ◇ Which model provider should we use? - │ Ollama - │ - ◆ Provider Ollama requires OLLAMA_HOST, please enter a value - │ http://localhost:11434 - └ - ``` +7. Configure the base path: - 6. Enter the installed model from above +``` +◆ Provider OpenAI requires OPENAI_BASE_PATH, please enter a value +│ v1/chat/completions (default) +└ +``` - ``` - ┌ goose-configure - │ - ◇ What would you like to configure? - │ Configure Providers - │ - ◇ Which model provider should we use? - │ Ollama - │ - ◇ Provider Ollama requires OLLAMA_HOST, please enter a value - │ http://localhost:11434 - │ - ◇ Enter a model from that provider: - │ michaelneale/deepseek-r1-goose - │ - ◇ Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together! - │ - └ Configuration saved successfully - ``` - - +Docker model runner uses `/engines/llama.cpp/v1/chat/completions` for the base path. + +8. 
Finally configure the model available in Docker Model Runner to be used by Goose: `hf.co/unsloth/gemma-3n-e4b-it-gguf:q6_k` + +``` +│ +◇ Enter a model from that provider: +│ gpt-4o +│ +◒ Checking your configuration... └ Configuration saved successfully +``` +--- ## Azure OpenAI Credential Chain From a35cb4a21996a45c561906624792fbeddec8f898 Mon Sep 17 00:00:00 2001 From: angiejones Date: Sat, 26 Jul 2025 22:43:27 -0500 Subject: [PATCH 3/4] reset providers.md to main branch version for manual integration --- .../docs/getting-started/providers.md | 312 +++++++----------- 1 file changed, 112 insertions(+), 200 deletions(-) diff --git a/documentation/docs/getting-started/providers.md b/documentation/docs/getting-started/providers.md index 493758996628..e787b9cb68ec 100644 --- a/documentation/docs/getting-started/providers.md +++ b/documentation/docs/getting-started/providers.md @@ -5,6 +5,7 @@ title: Configure LLM Provider import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; +import { PanelLeft } from 'lucide-react'; # Supported LLM Providers @@ -24,9 +25,9 @@ Goose relies heavily on tool calling capabilities and currently works best with | [Anthropic](https://www.anthropic.com/) | Offers Claude, an advanced AI model for natural language tasks. | `ANTHROPIC_API_KEY`, `ANTHROPIC_HOST` (optional) | | [Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-services/openai/) | Access Azure-hosted OpenAI models, including GPT-4 and GPT-3.5. Supports both API key and Azure credential chain authentication. | `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_DEPLOYMENT_NAME`, `AZURE_OPENAI_API_KEY` (optional) | | [Databricks](https://www.databricks.com/) | Unified data analytics and AI platform for building and deploying models. | `DATABRICKS_HOST`, `DATABRICKS_TOKEN` | -| [Docker Model Runner](https://docs.docker.com/ai/model-runner/) | Local models running in Docker Desktop or Docker CE with OpenAI-compatible API endpoints. **Because this provider runs locally, you must first [download a model](/docs/getting-started/providers#docker).** | `OPENAI_HOST`, `OPENAI_BASE_PATH` | | [Gemini](https://ai.google.dev/gemini-api/docs) | Advanced LLMs by Google with multimodal capabilities (text, images). | `GOOGLE_API_KEY` | -| [GCP Vertex AI](https://cloud.google.com/vertex-ai) | Google Cloud's Vertex AI platform, supporting Gemini and Claude models. **Credentials must be [configured in advance](https://cloud.google.com/vertex-ai/docs/authentication).** | `GCP_PROJECT_ID`, `GCP_LOCATION` and optional `GCP_MAX_RETRIES` (6), `GCP_INITIAL_RETRY_INTERVAL_MS` (5000), `GCP_BACKOFF_MULTIPLIER` (2.0), `GCP_MAX_RETRY_INTERVAL_MS` (320_000). | +| [GCP Vertex AI](https://cloud.google.com/vertex-ai) | Google Cloud's Vertex AI platform, supporting Gemini and Claude models. **Credentials must be [configured in advance](https://cloud.google.com/vertex-ai/docs/authentication).** | `GCP_PROJECT_ID`, `GCP_LOCATION` and optionally `GCP_MAX_RATE_LIMIT_RETRIES` (5), `GCP_MAX_OVERLOADED_RETRIES` (5), `GCP_INITIAL_RETRY_INTERVAL_MS` (5000), `GCP_BACKOFF_MULTIPLIER` (2.0), `GCP_MAX_RETRY_INTERVAL_MS` (320_000). | +| [GitHub Copilot](https://docs.github.com/en/copilot/using-github-copilot/ai-models) | Access to GitHub Copilot's chat models including gpt-4o, o1, o3-mini, and Claude models. Uses device code authentication flow for secure access. | Uses GitHub device code authentication flow (no API key needed) | | [Groq](https://groq.com/) | High-performance inference hardware and tools for LLMs. 
| `GROQ_API_KEY` | | [Ollama](https://ollama.com/) | Local model runner supporting Qwen, Llama, DeepSeek, and other open-source models. **Because this provider runs locally, you must first [download and run a model](/docs/getting-started/providers#local-llms).** | `OLLAMA_HOST` | | [Ramalama](https://ramalama.ai/) | Local model using native [OCI](https://opencontainers.org/) container runtimes, [CNCF](https://www.cncf.io/) tools, and supporting models as OCI artifacts. Ramalama API an compatible alternative to Ollama and can be used with the Goose Ollama provider. Supports Qwen, Llama, DeepSeek, and other open-source models. **Because this provider runs locally, you must first [download and run a model](/docs/getting-started/providers#local-llms).** | `OLLAMA_HOST` | @@ -57,16 +58,18 @@ To configure your chosen provider or see available options, run `goose configure **To update your LLM provider and API key:** - 1. Click the gear on the Goose Desktop toolbar - 1. Click `Advanced Settings` - 1. Under `Models`, click `Configure provider` - 1. Click `Configure` on the LLM provider to update - 1. Add additional configurations (API key, host, etc) then press `submit` + 1. Click the button in the top-left to open the sidebar + 2. Click the `Settings` button on the sidebar + 3. Click the `Models` tab + 4. Click `Configure Providers` + 5. Click `Configure` on the LLM provider to update + 6. Add additional configurations (API key, host, etc) then press `submit` **To change provider model** - 1. Click the gear on the Goose Desktop toolbar - 2. Click `Advanced Settings` - 3. Under `Models`, click `Switch models` + 1. Click the button in the top-left to open the sidebar + 2. Click the `Settings` button on the sidebar + 3. Click the `Models` tab + 4. Click `Switch models` 5. Select a Provider from drop down menu 6. Select a model from drop down menu 7. Press `Select Model` @@ -204,8 +207,8 @@ Goose supports using custom OpenAI-compatible endpoints, which is particularly u - 1. Click `...` in the upper right corner - 2. Click `Advanced Settings` + 1. Click the button in the top-left to open the sidebar + 2. Click the `Settings` button on the sidebar 3. Next to `Models`, click the `browse` link 4. Click the `configure` link in the upper right corner 5. Press the `+` button next to OpenAI @@ -252,10 +255,12 @@ To set up Google Gemini with Goose, follow these steps: **To update your LLM provider and API key:** - 1. Click on the three dots in the top-right corner. - 2. Select `Provider Settings` from the menu. - 2. Choose `Google Gemini` as provider from the list. - 3. Click Edit, enter your API key, and click `Set as Active`. + 1. Click the button in the top-left to open the sidebar. + 2. Click the `Settings` button on the sidebar. + 3. Click the `Models` tab. + 4. Click `Configure Providers` + 5. Choose `Google Gemini` as provider from the list. + 6. Click `Configure`, enter your API key, and click `Submit`. @@ -293,13 +298,7 @@ To set up Google Gemini with Goose, follow these steps: ### Local LLMs -Goose is a local AI agent, and by using a local LLM, you keep your data private, maintain full control over your environment, and can work entirely offline without relying on cloud access. - -Here are some local providers we support: - -- [Ollama](#ollama) -- [Ramalama](#ramalama) -- [Docker Model Runner](#docker-model-runner) +Ollama and Ramalama are both options to provide local LLMs, each which requires a bit more set up before you can use one of them with Goose. 
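+
+Before pointing Goose at a local server, it can help to confirm the endpoint is actually reachable. A minimal check from another terminal, assuming the default ports used in the steps below (11434 for Ollama, 8080 for Ramalama); the listing endpoints shown are the servers' usual routes and may differ in your setup:
+
+```sh
+# Ollama: list the models the local server currently has available
+curl http://localhost:11434/api/tags
+
+# Ramalama (llama.cpp-based server): OpenAI-compatible listing of the loaded model
+curl http://localhost:8080/v1/models
+```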
#### Ollama @@ -398,125 +397,24 @@ If you're running Ollama on a different server, you'll have to set `OLLAMA_HOST= └ Configuration saved successfully ``` ---- - -#### DeepSeek-R1 - -Ollama provides open source LLMs, such as `DeepSeek-r1`, that you can install and run locally. -Note that the native `DeepSeek-r1` model doesn't support tool calling, however, we have a [custom model](https://ollama.com/michaelneale/deepseek-r1-goose) you can use with Goose. - -:::warning -Note that this is a 70B model size and requires a powerful device to run smoothly. +:::tip Context Length +If you notice that Goose is having trouble using extensions or is ignoring [.goosehints](/docs/guides/using-goosehints), it is likely that the model's default context length of 4096 tokens is too low. Set the `OLLAMA_CONTEXT_LENGTH` environment variable to a [higher value](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-specify-the-context-window-size). ::: - -1. Download and install Ollama from [ollama.com](https://ollama.com/download). -2. In a terminal window, run the following command to install the custom DeepSeek-r1 model: - -```sh -ollama run michaelneale/deepseek-r1-goose -``` - - - - 3. Click `...` in the top-right corner. - 4. Navigate to `Advanced Settings` -> `Browse Models` -> and select `Ollama` from the list. - 5. Enter `michaelneale/deepseek-r1-goose` for the model name. - - - 3. In a separate terminal window, configure with Goose: - - ```sh - goose configure - ``` - - 4. Choose to `Configure Providers` - - ``` - ┌ goose-configure - │ - ◆ What would you like to configure? - │ ● Configure Providers (Change provider or update credentials) - │ ○ Toggle Extensions - │ ○ Add Extension - └ - ``` - - 5. Choose `Ollama` as the model provider - - ``` - ┌ goose-configure - │ - ◇ What would you like to configure? - │ Configure Providers - │ - ◆ Which model provider should we use? - │ ○ Anthropic - │ ○ Databricks - │ ○ Google Gemini - │ ○ Groq - │ ● Ollama (Local open source models) - │ ○ OpenAI - │ ○ OpenRouter - └ - ``` - - 5. Enter the host where your model is running - - ``` - ┌ goose-configure - │ - ◇ What would you like to configure? - │ Configure Providers - │ - ◇ Which model provider should we use? - │ Ollama - │ - ◆ Provider Ollama requires OLLAMA_HOST, please enter a value - │ http://localhost:11434 - └ - ``` - - 6. Enter the installed model from above - - ``` - ┌ goose-configure - │ - ◇ What would you like to configure? - │ Configure Providers - │ - ◇ Which model provider should we use? - │ Ollama - │ - ◇ Provider Ollama requires OLLAMA_HOST, please enter a value - │ http://localhost:11434 - │ - ◇ Enter a model from that provider: - │ michaelneale/deepseek-r1-goose - │ - ◇ Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together! - │ - └ Configuration saved successfully - ``` - - - ---- - #### Ramalama 1. [Download Ramalama](https://github.com/containers/ramalama?tab=readme-ov-file#install). 2. Run any Ollama [model supporting tool-calling](https://ollama.com/search?c=tools) or [GGUF format HuggingFace Model](https://huggingface.co/search/full-text?q=%22tools+support%22+%2B+%22gguf%22&type=model) : :::warning Limited Support for models without tool calling -Goose extensively uses tool calling, so models without it (e.g. `DeepSeek-r1`) can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). 
As an alternative, you can use a [custom DeepSeek-r1 model](/docs/getting-started/providers#deepseek-r1) we've made specifically for Goose. +Goose extensively uses tool calling, so models without it can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). ::: Example: ```sh # NOTE: the --runtime-args="--jinja" flag is required for Ramalama to work with the Goose Ollama provider. -ramalama serve --runtime-args="--jinja" ollama://qwen2.5 +ramalama serve --runtime-args="--jinja" --ctx-size=8192 ollama://qwen2.5 ``` 3. In a separate terminal window, configure with Goose: @@ -599,97 +497,111 @@ For the Ollama provider, if you don't provide a host, we set it to `localhost:11 └ Configuration saved successfully ``` ---- +:::tip Context Length +If you notice that Goose is having trouble using extensions or is ignoring [.goosehints](/docs/guides/using-goosehints), it is likely that the model's default context length of 2048 tokens is too low. Use `ramalama serve` to set the `--ctx-size, -c` option to a [higher value](https://github.com/containers/ramalama/blob/main/docs/ramalama-serve.1.md#--ctx-size--c). +::: -#### Docker Model Runner -1. [Get Docker](https://docs.docker.com/get-started/get-docker/) -2. [Enable Docker Model Runner](https://docs.docker.com/ai/model-runner/#enable-dmr-in-docker-desktop) -3. [Pull a model](https://docs.docker.com/ai/model-runner/#pull-a-model), for example, from Docker Hub [AI namespace](https://hub.docker.com/u/ai), [Unsloth](https://hub.docker.com/u/unsloth), or [from HuggingFace](https://www.docker.com/blog/docker-model-runner-on-hugging-face/) +### DeepSeek-R1 -:::warning Limited Support for models without tool calling -Goose extensively uses tool calling, so models without it can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). -::: +Ollama provides open source LLMs, such as `DeepSeek-r1`, that you can install and run locally. +Note that the native `DeepSeek-r1` model doesn't support tool calling, however, we have a [custom model](https://ollama.com/michaelneale/deepseek-r1-goose) you can use with Goose. -Example: +:::warning +Note that this is a 70B model size and requires a powerful device to run smoothly. +::: -```sh -docker model pull hf.co/unsloth/gemma-3n-e4b-it-gguf:q6_k -``` -4. Configure Goose to use Docker Model Runner, using the OpenAI API compatible endpoint: +1. Download and install Ollama from [ollama.com](https://ollama.com/download). +2. In a terminal window, run the following command to install the custom DeepSeek-r1 model: ```sh -goose configure -``` - -4. Choose to `Configure Providers` - -``` -┌ goose-configure -│ -◆ What would you like to configure? -│ ● Configure Providers (Change provider or update credentials) -│ ○ Toggle Extensions -│ ○ Add Extension -└ +ollama run michaelneale/deepseek-r1-goose ``` -5. Choose `OpenAI` as the model provider: + + + 3. Click the button in the top-left to open the sidebar. + 4. Click `Settings` -> `Models` -> `Configure Providers` -> and select `Ollama` from the list. + 5. Enter `michaelneale/deepseek-r1-goose` for the model name. + + + 3. In a separate terminal window, configure with Goose: -``` -┌ goose-configure -│ -◇ What would you like to configure? -│ Configure Providers -│ -◆ Which model provider should we use? 
-│ ○ Anthropic -│ ○ Amazon Bedrock -│ ○ Claude Code -│ ● OpenAI (GPT-4 and other OpenAI models, including OpenAI compatible ones) -│ ○ OpenRouter -``` + ```sh + goose configure + ``` -6. Configure Docker Model Runner endpoint as the `OPENAI_HOST`: + 4. Choose to `Configure Providers` -``` -┌ goose-configure -│ -◇ What would you like to configure? -│ Configure Providers -│ -◇ Which model provider should we use? -│ OpenAI -│ -◆ Provider OpenAI requires OPENAI_HOST, please enter a value -│ https://api.openai.com (default) -└ -``` + ``` + ┌ goose-configure + │ + ◆ What would you like to configure? + │ ● Configure Providers (Change provider or update credentials) + │ ○ Toggle Extensions + │ ○ Add Extension + └ + ``` -The default value for the host-side port Docker Model Runner is 12434, so the `OPENAI_HOST` value could be: -`http://localhost:12434`. + 5. Choose `Ollama` as the model provider -7. Configure the base path: + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◆ Which model provider should we use? + │ ○ Anthropic + │ ○ Databricks + │ ○ Google Gemini + │ ○ Groq + │ ● Ollama (Local open source models) + │ ○ OpenAI + │ ○ OpenRouter + └ + ``` -``` -◆ Provider OpenAI requires OPENAI_BASE_PATH, please enter a value -│ v1/chat/completions (default) -└ -``` + 5. Enter the host where your model is running -Docker model runner uses `/engines/llama.cpp/v1/chat/completions` for the base path. + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◇ Which model provider should we use? + │ Ollama + │ + ◆ Provider Ollama requires OLLAMA_HOST, please enter a value + │ http://localhost:11434 + └ + ``` -8. Finally configure the model available in Docker Model Runner to be used by Goose: `hf.co/unsloth/gemma-3n-e4b-it-gguf:q6_k` + 6. Enter the installed model from above -``` -│ -◇ Enter a model from that provider: -│ gpt-4o -│ -◒ Checking your configuration... └ Configuration saved successfully -``` ---- + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◇ Which model provider should we use? + │ Ollama + │ + ◇ Provider Ollama requires OLLAMA_HOST, please enter a value + │ http://localhost:11434 + │ + ◇ Enter a model from that provider: + │ michaelneale/deepseek-r1-goose + │ + ◇ Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together! + │ + └ Configuration saved successfully + ``` + + ## Azure OpenAI Credential Chain From f39b07a718a52c69eac2746185193c74e0dd8a77 Mon Sep 17 00:00:00 2001 From: angiejones Date: Sun, 27 Jul 2025 00:21:09 -0500 Subject: [PATCH 4/4] docs: add Docker Model Runner provider documentation - maintainer review complete --- .../docs/getting-started/providers.md | 592 ++++++++++-------- 1 file changed, 337 insertions(+), 255 deletions(-) diff --git a/documentation/docs/getting-started/providers.md b/documentation/docs/getting-started/providers.md index e787b9cb68ec..edf0b7639c0a 100644 --- a/documentation/docs/getting-started/providers.md +++ b/documentation/docs/getting-started/providers.md @@ -25,12 +25,13 @@ Goose relies heavily on tool calling capabilities and currently works best with | [Anthropic](https://www.anthropic.com/) | Offers Claude, an advanced AI model for natural language tasks. | `ANTHROPIC_API_KEY`, `ANTHROPIC_HOST` (optional) | | [Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-services/openai/) | Access Azure-hosted OpenAI models, including GPT-4 and GPT-3.5. 
Supports both API key and Azure credential chain authentication. | `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_DEPLOYMENT_NAME`, `AZURE_OPENAI_API_KEY` (optional) | | [Databricks](https://www.databricks.com/) | Unified data analytics and AI platform for building and deploying models. | `DATABRICKS_HOST`, `DATABRICKS_TOKEN` | +| [Docker Model Runner](https://docs.docker.com/ai/model-runner/) | Local models running in Docker Desktop or Docker CE with OpenAI-compatible API endpoints. **Because this provider runs locally, you must first [download a model](#local-llms).** | `OPENAI_HOST`, `OPENAI_BASE_PATH` | | [Gemini](https://ai.google.dev/gemini-api/docs) | Advanced LLMs by Google with multimodal capabilities (text, images). | `GOOGLE_API_KEY` | | [GCP Vertex AI](https://cloud.google.com/vertex-ai) | Google Cloud's Vertex AI platform, supporting Gemini and Claude models. **Credentials must be [configured in advance](https://cloud.google.com/vertex-ai/docs/authentication).** | `GCP_PROJECT_ID`, `GCP_LOCATION` and optionally `GCP_MAX_RATE_LIMIT_RETRIES` (5), `GCP_MAX_OVERLOADED_RETRIES` (5), `GCP_INITIAL_RETRY_INTERVAL_MS` (5000), `GCP_BACKOFF_MULTIPLIER` (2.0), `GCP_MAX_RETRY_INTERVAL_MS` (320_000). | | [GitHub Copilot](https://docs.github.com/en/copilot/using-github-copilot/ai-models) | Access to GitHub Copilot's chat models including gpt-4o, o1, o3-mini, and Claude models. Uses device code authentication flow for secure access. | Uses GitHub device code authentication flow (no API key needed) | | [Groq](https://groq.com/) | High-performance inference hardware and tools for LLMs. | `GROQ_API_KEY` | -| [Ollama](https://ollama.com/) | Local model runner supporting Qwen, Llama, DeepSeek, and other open-source models. **Because this provider runs locally, you must first [download and run a model](/docs/getting-started/providers#local-llms).** | `OLLAMA_HOST` | -| [Ramalama](https://ramalama.ai/) | Local model using native [OCI](https://opencontainers.org/) container runtimes, [CNCF](https://www.cncf.io/) tools, and supporting models as OCI artifacts. Ramalama API an compatible alternative to Ollama and can be used with the Goose Ollama provider. Supports Qwen, Llama, DeepSeek, and other open-source models. **Because this provider runs locally, you must first [download and run a model](/docs/getting-started/providers#local-llms).** | `OLLAMA_HOST` | +| [Ollama](https://ollama.com/) | Local model runner supporting Qwen, Llama, DeepSeek, and other open-source models. **Because this provider runs locally, you must first [download and run a model](#local-llms).** | `OLLAMA_HOST` | +| [Ramalama](https://ramalama.ai/) | Local model using native [OCI](https://opencontainers.org/) container runtimes, [CNCF](https://www.cncf.io/) tools, and supporting models as OCI artifacts. Ramalama API an compatible alternative to Ollama and can be used with the Goose Ollama provider. Supports Qwen, Llama, DeepSeek, and other open-source models. **Because this provider runs locally, you must first [download and run a model](#local-llms).** | `OLLAMA_HOST` | | [OpenAI](https://platform.openai.com/api-keys) | Provides gpt-4o, o1, and other advanced language models. Also supports OpenAI-compatible endpoints (e.g., self-hosted LLaMA, vLLM, KServe). 
**o1-mini and o1-preview are not supported because Goose uses tool calling.** | `OPENAI_API_KEY`, `OPENAI_HOST` (optional), `OPENAI_ORGANIZATION` (optional), `OPENAI_PROJECT` (optional), `OPENAI_CUSTOM_HEADERS` (optional) | | [OpenRouter](https://openrouter.ai/) | API gateway for unified access to various models with features like rate-limiting management. | `OPENROUTER_API_KEY` | | [Snowflake](https://docs.snowflake.com/user-guide/snowflake-cortex/aisql#choosing-a-model) | Access the latest models using Snowflake Cortex services, including Claude models. **Requires a Snowflake account and programmatic access token (PAT)**. | `SNOWFLAKE_HOST`, `SNOWFLAKE_TOKEN` | @@ -298,241 +299,321 @@ To set up Google Gemini with Goose, follow these steps: ### Local LLMs -Ollama and Ramalama are both options to provide local LLMs, each which requires a bit more set up before you can use one of them with Goose. - -#### Ollama - -1. [Download Ollama](https://ollama.com/download). -2. Run any [model supporting tool-calling](https://ollama.com/search?c=tools): - -:::warning Limited Support for models without tool calling -Goose extensively uses tool calling, so models without it can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). -::: - -Example: - -```sh -ollama run qwen2.5 -``` - -3. In a separate terminal window, configure with Goose: - -```sh -goose configure -``` - -4. Choose to `Configure Providers` - -``` -┌ goose-configure -│ -◆ What would you like to configure? -│ ● Configure Providers (Change provider or update credentials) -│ ○ Toggle Extensions -│ ○ Add Extension -└ -``` - -5. Choose `Ollama` as the model provider - -``` -┌ goose-configure -│ -◇ What would you like to configure? -│ Configure Providers -│ -◆ Which model provider should we use? -│ ○ Anthropic -│ ○ Databricks -│ ○ Google Gemini -│ ○ Groq -│ ● Ollama (Local open source models) -│ ○ OpenAI -│ ○ OpenRouter -└ -``` - -5. Enter the host where your model is running - -:::info Endpoint -For Ollama, if you don't provide a host, we set it to `localhost:11434`. -When constructing the URL, we prepend `http://` if the scheme is not `http` or `https`. -If you're running Ollama on a different server, you'll have to set `OLLAMA_HOST=http://{host}:{port}`. -::: - -``` -┌ goose-configure -│ -◇ What would you like to configure? -│ Configure Providers -│ -◇ Which model provider should we use? -│ Ollama -│ -◆ Provider Ollama requires OLLAMA_HOST, please enter a value -│ http://localhost:11434 -└ -``` - - -6. Enter the model you have running - -``` -┌ goose-configure -│ -◇ What would you like to configure? -│ Configure Providers -│ -◇ Which model provider should we use? -│ Ollama -│ -◇ Provider Ollama requires OLLAMA_HOST, please enter a value -│ http://localhost:11434 -│ -◇ Enter a model from that provider: -│ qwen2.5 -│ -◇ Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together! -│ -└ Configuration saved successfully -``` - -:::tip Context Length -If you notice that Goose is having trouble using extensions or is ignoring [.goosehints](/docs/guides/using-goosehints), it is likely that the model's default context length of 4096 tokens is too low. Set the `OLLAMA_CONTEXT_LENGTH` environment variable to a [higher value](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-specify-the-context-window-size). -::: - -#### Ramalama - -1. 
[Download Ramalama](https://github.com/containers/ramalama?tab=readme-ov-file#install). -2. Run any Ollama [model supporting tool-calling](https://ollama.com/search?c=tools) or [GGUF format HuggingFace Model](https://huggingface.co/search/full-text?q=%22tools+support%22+%2B+%22gguf%22&type=model) : +Goose is a local AI agent, and by using a local LLM, you keep your data private, maintain full control over your environment, and can work entirely offline without relying on cloud access. However, please note that local LLMs require a bit more set up before you can use one of them with Goose. :::warning Limited Support for models without tool calling Goose extensively uses tool calling, so models without it can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). ::: -Example: - -```sh -# NOTE: the --runtime-args="--jinja" flag is required for Ramalama to work with the Goose Ollama provider. -ramalama serve --runtime-args="--jinja" --ctx-size=8192 ollama://qwen2.5 -``` - -3. In a separate terminal window, configure with Goose: - -```sh -goose configure -``` - -4. Choose to `Configure Providers` - -``` -┌ goose-configure -│ -◆ What would you like to configure? -│ ● Configure Providers (Change provider or update credentials) -│ ○ Toggle Extensions -│ ○ Add Extension -└ -``` - -5. Choose `Ollama` as the model provider since Ramalama is API compatible and can use the Goose Ollama provider - -``` -┌ goose-configure -│ -◇ What would you like to configure? -│ Configure Providers -│ -◆ Which model provider should we use? -│ ○ Anthropic -│ ○ Databricks -│ ○ Google Gemini -│ ○ Groq -│ ● Ollama (Local open source models) -│ ○ OpenAI -│ ○ OpenRouter -└ -``` - -5. Enter the host where your model is running - -:::info Endpoint -For the Ollama provider, if you don't provide a host, we set it to `localhost:11434`. When constructing the URL, we preprend `http://` if the scheme is not `http` or `https`. Since Ramalama's default port to serve on is 8080, we set `OLLAMA_HOST=http://0.0.0.0:8080` -::: - -``` -┌ goose-configure -│ -◇ What would you like to configure? -│ Configure Providers -│ -◇ Which model provider should we use? -│ Ollama -│ -◆ Provider Ollama requires OLLAMA_HOST, please enter a value -│ http://0.0.0.0:8080 -└ -``` - - -6. Enter the model you have running - -``` -┌ goose-configure -│ -◇ What would you like to configure? -│ Configure Providers -│ -◇ Which model provider should we use? -│ Ollama -│ -◇ Provider Ollama requires OLLAMA_HOST, please enter a value -│ http://0.0.0.0:8080 -│ -◇ Enter a model from that provider: -│ qwen2.5 -│ -◇ Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together! -│ -└ Configuration saved successfully -``` - -:::tip Context Length -If you notice that Goose is having trouble using extensions or is ignoring [.goosehints](/docs/guides/using-goosehints), it is likely that the model's default context length of 2048 tokens is too low. Use `ramalama serve` to set the `--ctx-size, -c` option to a [higher value](https://github.com/containers/ramalama/blob/main/docs/ramalama-serve.1.md#--ctx-size--c). -::: - - -### DeepSeek-R1 - -Ollama provides open source LLMs, such as `DeepSeek-r1`, that you can install and run locally. -Note that the native `DeepSeek-r1` model doesn't support tool calling, however, we have a [custom model](https://ollama.com/michaelneale/deepseek-r1-goose) you can use with Goose. 
- -:::warning -Note that this is a 70B model size and requires a powerful device to run smoothly. -::: - +Here are some local providers we support: + + + + + + 1. [Download Ramalama](https://github.com/containers/ramalama?tab=readme-ov-file#install). + 2. In a terminal, run any Ollama [model supporting tool-calling](https://ollama.com/search?c=tools) or [GGUF format HuggingFace Model](https://huggingface.co/search/full-text?q=%22tools+support%22+%2B+%22gguf%22&type=model): + + The `--runtime-args="--jinja"` flag is required for Ramalama to work with the Goose Ollama provider. + + Example: + + ```sh + ramalama serve --runtime-args="--jinja" ollama://qwen2.5 + ``` + + 3. In a separate terminal window, configure with Goose: + + ```sh + goose configure + ``` + + 4. Choose to `Configure Providers` + + ``` + ┌ goose-configure + │ + ◆ What would you like to configure? + │ ● Configure Providers (Change provider or update credentials) + │ ○ Toggle Extensions + │ ○ Add Extension + └ + ``` + + 5. Choose `Ollama` as the model provider since Ramalama is API compatible and can use the Goose Ollama provider + + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◆ Which model provider should we use? + │ ○ Anthropic + │ ○ Databricks + │ ○ Google Gemini + │ ○ Groq + │ ● Ollama (Local open source models) + │ ○ OpenAI + │ ○ OpenRouter + └ + ``` + + 6. Enter the host where your model is running + + :::info Endpoint + For the Ollama provider, if you don't provide a host, we set it to `localhost:11434`. When constructing the URL, we preprend `http://` if the scheme is not `http` or `https`. Since Ramalama's default port to serve on is 8080, we set `OLLAMA_HOST=http://0.0.0.0:8080` + ::: + + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◇ Which model provider should we use? + │ Ollama + │ + ◆ Provider Ollama requires OLLAMA_HOST, please enter a value + │ http://0.0.0.0:8080 + └ + ``` + + + 7. Enter the model you have running + + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◇ Which model provider should we use? + │ Ollama + │ + ◇ Provider Ollama requires OLLAMA_HOST, please enter a value + │ http://0.0.0.0:8080 + │ + ◇ Enter a model from that provider: + │ qwen2.5 + │ + ◇ Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together! + │ + └ Configuration saved successfully + ``` + + :::tip Context Length + If you notice that Goose is having trouble using extensions or is ignoring [.goosehints](/docs/guides/using-goosehints), it is likely that the model's default context length of 2048 tokens is too low. Use `ramalama serve` to set the `--ctx-size, -c` option to a [higher value](https://github.com/containers/ramalama/blob/main/docs/ramalama-serve.1.md#--ctx-size--c). + ::: + + + + The native `DeepSeek-r1` model doesn't support tool calling, however, we have a [custom model](https://ollama.com/michaelneale/deepseek-r1-goose) you can use with Goose. + + :::warning + Note that this is a 70B model size and requires a powerful device to run smoothly. + ::: + + + 1. [Download Ollama](https://ollama.com/download). + 2. In a terminal window, run the following command to install the custom DeepSeek-r1 model: + + ```sh + ollama run michaelneale/deepseek-r1-goose + ``` + + 3. In a separate terminal window, configure with Goose: + + ```sh + goose configure + ``` + + 4. 
Choose to `Configure Providers` + + ``` + ┌ goose-configure + │ + ◆ What would you like to configure? + │ ● Configure Providers (Change provider or update credentials) + │ ○ Toggle Extensions + │ ○ Add Extension + └ + ``` + + 5. Choose `Ollama` as the model provider + + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◆ Which model provider should we use? + │ ○ Anthropic + │ ○ Databricks + │ ○ Google Gemini + │ ○ Groq + │ ● Ollama (Local open source models) + │ ○ OpenAI + │ ○ OpenRouter + └ + ``` + + 6. Enter the host where your model is running + + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◇ Which model provider should we use? + │ Ollama + │ + ◆ Provider Ollama requires OLLAMA_HOST, please enter a value + │ http://localhost:11434 + └ + ``` + + 7. Enter the installed model from above + + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◇ Which model provider should we use? + │ Ollama + │ + ◇ Provider Ollama requires OLLAMA_HOST, please enter a value + │ http://localhost:11434 + │ + ◇ Enter a model from that provider: + │ michaelneale/deepseek-r1-goose + │ + ◇ Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together! + │ + └ Configuration saved successfully + ``` + + + 1. [Download Ollama](https://ollama.com/download). + 2. In a terminal, run any [model supporting tool-calling](https://ollama.com/search?c=tools) + + Example: + + ```sh + ollama run qwen2.5 + ``` + + 3. In a separate terminal window, configure with Goose: + + ```sh + goose configure + ``` + + 4. Choose to `Configure Providers` + + ``` + ┌ goose-configure + │ + ◆ What would you like to configure? + │ ● Configure Providers (Change provider or update credentials) + │ ○ Toggle Extensions + │ ○ Add Extension + └ + ``` + + 5. Choose `Ollama` as the model provider + + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◆ Which model provider should we use? + │ ○ Anthropic + │ ○ Databricks + │ ○ Google Gemini + │ ○ Groq + │ ● Ollama (Local open source models) + │ ○ OpenAI + │ ○ OpenRouter + └ + ``` + + 6. Enter the host where your model is running + + :::info Endpoint + For Ollama, if you don't provide a host, we set it to `localhost:11434`. + When constructing the URL, we prepend `http://` if the scheme is not `http` or `https`. + If you're running Ollama on a different server, you'll have to set `OLLAMA_HOST=http://{host}:{port}`. + ::: + + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◇ Which model provider should we use? + │ Ollama + │ + ◆ Provider Ollama requires OLLAMA_HOST, please enter a value + │ http://localhost:11434 + └ + ``` + + + 7. Enter the model you have running + + ``` + ┌ goose-configure + │ + ◇ What would you like to configure? + │ Configure Providers + │ + ◇ Which model provider should we use? + │ Ollama + │ + ◇ Provider Ollama requires OLLAMA_HOST, please enter a value + │ http://localhost:11434 + │ + ◇ Enter a model from that provider: + │ qwen2.5 + │ + ◇ Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together! 
+ │ + └ Configuration saved successfully + ``` + + :::tip Context Length + If you notice that Goose is having trouble using extensions or is ignoring [.goosehints](/docs/guides/using-goosehints), it is likely that the model's default context length of 4096 tokens is too low. Set the `OLLAMA_CONTEXT_LENGTH` environment variable to a [higher value](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-specify-the-context-window-size). + ::: + + + + + + 1. [Get Docker](https://docs.docker.com/get-started/get-docker/) + 2. [Enable Docker Model Runner](https://docs.docker.com/ai/model-runner/#enable-dmr-in-docker-desktop) + 3. [Pull a model](https://docs.docker.com/ai/model-runner/#pull-a-model), for example, from Docker Hub [AI namespace](https://hub.docker.com/u/ai), [Unsloth](https://hub.docker.com/u/unsloth), or [from HuggingFace](https://www.docker.com/blog/docker-model-runner-on-hugging-face/) -1. Download and install Ollama from [ollama.com](https://ollama.com/download). -2. In a terminal window, run the following command to install the custom DeepSeek-r1 model: + Example: -```sh -ollama run michaelneale/deepseek-r1-goose -``` + ```sh + docker model pull hf.co/unsloth/gemma-3n-e4b-it-gguf:q6_k + ``` - - - 3. Click the button in the top-left to open the sidebar. - 4. Click `Settings` -> `Models` -> `Configure Providers` -> and select `Ollama` from the list. - 5. Enter `michaelneale/deepseek-r1-goose` for the model name. - - - 3. In a separate terminal window, configure with Goose: + 4. Configure Goose to use Docker Model Runner, using the OpenAI API compatible endpoint: ```sh goose configure ``` - 4. Choose to `Configure Providers` + 5. Choose to `Configure Providers` ``` ┌ goose-configure @@ -544,65 +625,66 @@ ollama run michaelneale/deepseek-r1-goose └ ``` - 5. Choose `Ollama` as the model provider + 6. Choose `OpenAI` as the model provider: ``` - ┌ goose-configure + ┌ goose-configure │ ◇ What would you like to configure? - │ Configure Providers + │ Configure Providers │ ◆ Which model provider should we use? - │ ○ Anthropic - │ ○ Databricks - │ ○ Google Gemini - │ ○ Groq - │ ● Ollama (Local open source models) - │ ○ OpenAI - │ ○ OpenRouter - └ + │ ○ Anthropic + │ ○ Amazon Bedrock + │ ○ Claude Code + │ ● OpenAI (GPT-4 and other OpenAI models, including OpenAI compatible ones) + │ ○ OpenRouter ``` - 5. Enter the host where your model is running + 7. Configure Docker Model Runner endpoint as the `OPENAI_HOST`: ``` - ┌ goose-configure + ┌ goose-configure │ ◇ What would you like to configure? - │ Configure Providers + │ Configure Providers │ ◇ Which model provider should we use? - │ Ollama + │ OpenAI │ - ◆ Provider Ollama requires OLLAMA_HOST, please enter a value - │ http://localhost:11434 + ◆ Provider OpenAI requires OPENAI_HOST, please enter a value + │ https://api.openai.com (default) └ ``` - 6. Enter the installed model from above + The default value for the host-side port Docker Model Runner is 12434, so the `OPENAI_HOST` value could be: + `http://localhost:12434`. + + 8. Configure the base path: + + ``` + ◆ Provider OpenAI requires OPENAI_BASE_PATH, please enter a value + │ v1/chat/completions (default) + └ + ``` + + Docker model runner uses `/engines/llama.cpp/v1/chat/completions` for the base path. + + 9. Finally configure the model available in Docker Model Runner to be used by Goose: `hf.co/unsloth/gemma-3n-e4b-it-gguf:q6_k` ``` - ┌ goose-configure - │ - ◇ What would you like to configure? - │ Configure Providers - │ - ◇ Which model provider should we use? 
- │ Ollama │ - ◇ Provider Ollama requires OLLAMA_HOST, please enter a value - │ http://localhost:11434 - │ ◇ Enter a model from that provider: - │ michaelneale/deepseek-r1-goose - │ - ◇ Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together! + │ gpt-4o │ + ◒ Checking your configuration... └ Configuration saved successfully ``` + + ## Azure OpenAI Credential Chain Goose supports two authentication methods for Azure OpenAI: