diff --git a/documentation/docs/getting-started/providers.md b/documentation/docs/getting-started/providers.md
index fac23ce56322..e787b9cb68ec 100644
--- a/documentation/docs/getting-started/providers.md
+++ b/documentation/docs/getting-started/providers.md
@@ -306,7 +306,7 @@ Ollama and Ramalama are both options to provide local LLMs, each which requires
 2. Run any [model supporting tool-calling](https://ollama.com/search?c=tools):
 
 :::warning Limited Support for models without tool calling
-Goose extensively uses tool calling, so models without it (e.g. `DeepSeek-r1`) can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). As an alternative, you can use a [custom DeepSeek-r1 model](/docs/getting-started/providers#deepseek-r1) we've made specifically for Goose.
+Goose extensively uses tool calling, so models without it can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions).
 :::
 
 Example:
@@ -397,20 +397,24 @@ If you're running Ollama on a different server, you'll have to set `OLLAMA_HOST=
 └ Configuration saved successfully
 ```
 
+:::tip Context Length
+If you notice that Goose is having trouble using extensions or is ignoring [.goosehints](/docs/guides/using-goosehints), it is likely that the model's default context length of 4096 tokens is too low. Set the `OLLAMA_CONTEXT_LENGTH` environment variable to a [higher value](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-specify-the-context-window-size).
+:::
+
 #### Ramalama
 
 1. [Download Ramalama](https://github.com/containers/ramalama?tab=readme-ov-file#install).
 2. Run any Ollama [model supporting tool-calling](https://ollama.com/search?c=tools) or [GGUF format HuggingFace Model](https://huggingface.co/search/full-text?q=%22tools+support%22+%2B+%22gguf%22&type=model):
 
 :::warning Limited Support for models without tool calling
-Goose extensively uses tool calling, so models without it (e.g. `DeepSeek-r1`) can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). As an alternative, you can use a [custom DeepSeek-r1 model](/docs/getting-started/providers#deepseek-r1) we've made specifically for Goose.
+Goose extensively uses tool calling, so models without it can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions).
 :::
 
 Example:
 
 ```sh
 # NOTE: the --runtime-args="--jinja" flag is required for Ramalama to work with the Goose Ollama provider.
-ramalama serve --runtime-args="--jinja" ollama://qwen2.5
+ramalama serve --runtime-args="--jinja" --ctx-size=8192 ollama://qwen2.5
 ```
 
 3. In a separate terminal window, configure with Goose:
@@ -493,6 +497,11 @@ For the Ollama provider, if you don't provide a host, we set it to `localhost:11
 └ Configuration saved successfully
 ```
 
+:::tip Context Length
+If you notice that Goose is having trouble using extensions or is ignoring [.goosehints](/docs/guides/using-goosehints), it is likely that the model's default context length of 2048 tokens is too low. Use `ramalama serve` to set the `--ctx-size, -c` option to a [higher value](https://github.com/containers/ramalama/blob/main/docs/ramalama-serve.1.md#--ctx-size--c).
+:::
+
+
 ### DeepSeek-R1
 
 Ollama provides open source LLMs, such as `DeepSeek-r1`, that you can install and run locally.
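
As a quick illustration of the Ollama context-length tip above, a minimal sketch of starting the server with a larger window; the `8192` value is only an example, not a recommendation (see the Ollama FAQ linked in the tip for supported settings):

```sh
# Illustrative sketch: start the Ollama server with a larger context window.
# 8192 is an example value; pick a size your model and hardware can handle.
OLLAMA_CONTEXT_LENGTH=8192 ollama serve
```

The Ramalama side already shows the equivalent in its `Example` block, where `--ctx-size=8192` is passed directly to `ramalama serve`.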