block · angiejones · Jul 24, 2025 · Jul 24, 2025 · Jul 23, 2025 · Jul 24, 2025
@@ -306,7 +306,7 @@ Ollama and Ramalama are both options to provide local LLMs, each which requires
 2. Run any [model supporting tool-calling](https://ollama.com/search?c=tools):
 
 :::warning Limited Support for models without tool calling
-Goose extensively uses tool calling, so models without it (e.g. `DeepSeek-r1`) can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). As an alternative, you can use a [custom DeepSeek-r1 model](/docs/getting-started/providers#deepseek-r1) we've made specifically for Goose.
+Goose extensively uses tool calling, so models without it can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions).
 :::
 
 Example:
@@ -397,20 +397,24 @@ If you're running Ollama on a different server, you'll have to set `OLLAMA_HOST=
 └  Configuration saved successfully
 ```
 
+:::tip Context Length
+If Goose has trouble using extensions or ignores [.goosehints](/docs/guides/using-goosehints), the model's default context length of 4096 tokens is likely too low. Set the `OLLAMA_CONTEXT_LENGTH` environment variable to a [higher value](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-specify-the-context-window-size).
+:::
+
 #### Ramalama
 
 1. [Download Ramalama](https://github.com/containers/ramalama?tab=readme-ov-file#install).
 2. Run any Ollama [model supporting tool-calling](https://ollama.com/search?c=tools) or [GGUF format HuggingFace Model](https://huggingface.co/search/full-text?q=%22tools+support%22+%2B+%22gguf%22&type=model) :
 
 :::warning Limited Support for models without tool calling
-Goose extensively uses tool calling, so models without it (e.g. `DeepSeek-r1`) can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). As an alternative, you can use a [custom DeepSeek-r1 model](/docs/getting-started/providers#deepseek-r1) we've made specifically for Goose.
+Goose extensively uses tool calling, so models without it can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions).
 :::
 
 Example:
 
 ```sh
 # NOTE: the --runtime-args="--jinja" flag is required for Ramalama to work with the Goose Ollama provider.
-ramalama serve --runtime-args="--jinja" ollama://qwen2.5
+ramalama serve --runtime-args="--jinja" --ctx-size=8192 ollama://qwen2.5
 ```
 
 3. In a separate terminal window, configure with Goose:
@@ -493,6 +497,11 @@ For the Ollama provider, if you don't provide a host, we set it to `localhost:11
 └  Configuration saved successfully
 ```
 
+:::tip Context Length
+If Goose has trouble using extensions or ignores [.goosehints](/docs/guides/using-goosehints), the model's default context length of 2048 tokens is likely too low. Pass the `--ctx-size` (or `-c`) option to `ramalama serve` with a [higher value](https://github.com/containers/ramalama/blob/main/docs/ramalama-serve.1.md#--ctx-size--c).
+:::
+
+
 ### DeepSeek-R1
 
 Ollama provides open source LLMs, such as `DeepSeek-r1`, that you can install and run locally.