33 changes: 32 additions & 1 deletion documentation/docs/getting-started/providers.md
@@ -29,7 +29,7 @@ goose is compatible with a wide range of LLM providers, allowing you to choose a
| [ChatGPT Codex](https://chatgpt.com/codex) | Access GPT-5 Codex models optimized for code generation and understanding. **Requires a ChatGPT Plus/Pro subscription.** | No manual key. Uses browser-based OAuth authentication for both CLI and Desktop. |
| [Databricks](https://www.databricks.com/) | Unified data analytics and AI platform for building and deploying models. | `DATABRICKS_HOST`, `DATABRICKS_TOKEN` |
| [Docker Model Runner](https://docs.docker.com/ai/model-runner/) | Local models running in Docker Desktop or Docker CE with OpenAI-compatible API endpoints. **Because this provider runs locally, you must first [download a model](#local-llms).** | `OPENAI_HOST`, `OPENAI_BASE_PATH` |
-| [Gemini](https://ai.google.dev/gemini-api/docs) | Advanced LLMs by Google with multimodal capabilities (text, images). | `GOOGLE_API_KEY` |
+| [Gemini](https://ai.google.dev/gemini-api/docs) | Advanced LLMs by Google with multimodal capabilities (text, images). Gemini 3 models support configurable [thinking levels](#gemini-3-thinking-levels). | `GOOGLE_API_KEY`, `GEMINI3_THINKING_LEVEL` (optional) |
| [GCP Vertex AI](https://cloud.google.com/vertex-ai) | Google Cloud's Vertex AI platform, supporting Gemini and Claude models. **Credentials must be [configured in advance](https://cloud.google.com/vertex-ai/docs/authentication).** Filters for allowed models by organization policy (if configured). | `GCP_PROJECT_ID`, `GCP_LOCATION` and optionally `GCP_MAX_RATE_LIMIT_RETRIES` (5), `GCP_MAX_OVERLOADED_RETRIES` (5), `GCP_INITIAL_RETRY_INTERVAL_MS` (5000), `GCP_BACKOFF_MULTIPLIER` (2.0), `GCP_MAX_RETRY_INTERVAL_MS` (320_000). |
| [GitHub Copilot](https://docs.github.com/en/copilot/using-github-copilot/ai-models) | Access to AI models from OpenAI, Anthropic, Google, and other providers through GitHub's Copilot infrastructure. **GitHub account with Copilot access required.** | No manual key. Uses [device flow authentication](#github-copilot-authentication) for both CLI and Desktop. |
| [Groq](https://groq.com/) | High-performance inference hardware and tools for LLMs. | `GROQ_API_KEY` |
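
For a quick shell-based setup, the Gemini provider only needs the variables listed above — a minimal sketch (the key value is a placeholder):

```bash
# Configure the Gemini provider via environment variables
export GOOGLE_API_KEY="your-api-key"   # placeholder; substitute your own key
export GEMINI3_THINKING_LEVEL=high     # optional, applies to Gemini 3 models only
```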
@@ -1150,6 +1150,37 @@ Beyond single-model setups, goose supports [multi-model configurations](/docs/gu
- **Lead/Worker Model** - Automatic switching between a lead model for initial turns and a worker model for execution tasks
- **Planning Mode** - Manual planning phase using a dedicated model to create detailed project breakdowns before execution

## Gemini 3 Thinking Levels

Gemini 3 models support configurable thinking levels to balance response latency and reasoning depth:
- **Low** (default) - Faster responses, lighter reasoning
- **High** - Deeper reasoning, higher latency

<Tabs groupId="interface">
<TabItem value="ui" label="goose Desktop" default>
When you select a Gemini 3 model, a "Thinking Level" dropdown appears automatically. Choose your preference, and the setting persists across sessions.
</TabItem>

<TabItem value="cli" label="goose CLI">
**Interactive configuration:**

When you run `goose configure` and select a Gemini 3 model, you'll be prompted to choose a thinking level:

```
◆ Select thinking level for Gemini 3:
│ ● Low - Better latency, lighter reasoning
│ ○ High - Deeper reasoning, higher latency
```
</TabItem>
</Tabs>

:::info Priority Order
The thinking level is determined in this order (highest to lowest priority):
1. `request_params.thinking_level` in model configuration (via `GOOSE_PREDEFINED_MODELS`)
2. `GEMINI3_THINKING_LEVEL` environment variable
3. Default value: `low`
:::
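
For example, a minimal shell sketch of this precedence (the model name is illustrative; the `GOOSE_PREDEFINED_MODELS` entry follows the format documented in the environment variables guide):

```bash
# Priority 2: environment variable sets the global default for Gemini 3 models
export GEMINI3_THINKING_LEVEL=high

# Priority 1: request_params in the model configuration wins over the
# variable above, so gemini-3-pro runs with thinking_level "low" here
export GOOSE_PREDEFINED_MODELS='[
  {
    "name": "gemini-3-pro",
    "provider": "google",
    "request_params": {"thinking_level": "low"}
  }
]'
```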

---

If you have any questions or need help with a specific provider, feel free to reach out to us on [Discord](https://discord.gg/goose-oss) or on the [goose repo](https://github.com/block/goose).
10 changes: 10 additions & 0 deletions documentation/docs/guides/environment-variables.md
@@ -45,6 +45,7 @@ These variables are needed when using custom endpoints, enterprise deployments,
| `GOOSE_PROVIDER__TYPE` | The specific type/implementation of the provider | [See available providers](/docs/getting-started/providers#available-providers) | Derived from GOOSE_PROVIDER |
| `GOOSE_PROVIDER__HOST` | Custom API endpoint for the provider | URL (e.g., "https://api.openai.com") | Provider-specific default |
| `GOOSE_PROVIDER__API_KEY` | Authentication key for the provider | API key string | None |
| `GEMINI3_THINKING_LEVEL` | Sets the [thinking level](/docs/getting-started/providers#gemini-3-thinking-levels) for Gemini 3 models globally | `low`, `high` | `low` |

**Examples**

@@ -116,6 +117,15 @@ export GOOSE_PREDEFINED_MODELS='[
"context_limit": 500000
}
]'

# Gemini 3 with high thinking level
export GOOSE_PREDEFINED_MODELS='[
{
"name": "gemini-3-pro",
"provider": "google",
"request_params": {"thinking_level": "high"}
}
]'
```

Custom context limits and request parameters are applied when the model is used. Custom context limits are displayed in goose CLI's [token usage indicator](/docs/guides/sessions/smart-context-management#token-usage).