feat(provider): auto-detect Ollama context limits #10758

Open
felipemadero wants to merge 2 commits into anomalyco:dev from felipemadero:feat/ollama-context-limit

Conversation


felipemadero commented Jan 27, 2026

Fixes #10759

Summary

  • Auto-detect Ollama servers by checking if GET / returns "Ollama is running"
  • Query model context limits via POST /api/show to get num_ctx from Modelfile parameters
  • Config limit.context takes priority if set by user
  • Falls back to 4096 (Ollama's default) if not specified
  • Fix compaction threshold for models without limit.output: reserve 10% of context instead of hardcoded 32000

This enables the context-percentage display in the status bar for Ollama models without manual configuration, and stops compaction from triggering immediately on small-context models.
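
For reference, a minimal sketch of the detection flow described above (this is not the PR's actual code; the helper names are illustrative, and the exact `/api/show` request/response shapes are simplified assumptions):

```ts
const OLLAMA_DEFAULT_CONTEXT = 4096

// Detect an Ollama server: the root endpoint replies with the plain text "Ollama is running".
async function isOllamaServer(baseURL: string): Promise<boolean> {
  try {
    const res = await fetch(new URL("/", baseURL))
    return res.ok && (await res.text()).includes("Ollama is running")
  } catch {
    return false
  }
}

// Ask /api/show for the model's Modelfile parameters and pull out num_ctx,
// falling back to Ollama's 4096 default when it is not set.
async function detectContextLimit(baseURL: string, model: string): Promise<number> {
  const res = await fetch(new URL("/api/show", baseURL), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model }),
  })
  if (!res.ok) return OLLAMA_DEFAULT_CONTEXT
  const info = (await res.json()) as { parameters?: string }
  // Modelfile parameters come back as lines like "num_ctx 16384".
  const match = info.parameters?.match(/^num_ctx\s+(\d+)/m)
  return match ? Number(match[1]) : OLLAMA_DEFAULT_CONTEXT
}
```

Per the summary, a user-configured `limit.context` would be checked first and only fall through to this detection when unset.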

Query Ollama API to get model context limits (num_ctx) for proper
context percentage display in status bar. Detects Ollama servers
by checking if root endpoint returns "Ollama is running", then
fetches model info via /api/show. Falls back to 4096 default.
@github-actions
Contributor

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

  1. Open an issue describing the bug/feature (if one doesn't exist)
  2. Add `Fixes #<number>` or `Closes #<number>` to this PR description

See CONTRIBUTING.md for details.

@github-actions
Contributor

The following comment was made by an LLM; it may be inaccurate:

Based on my search, I found the following potentially related PRs:

  1. PR #3726 - "Adding the auto-detection of ollama local with a variable for baseURL"

  2. PR #8359 - "feat(opencode): add auto model detection for OpenAI-compatible providers"

  3. PR #7422 - "fix(session): warn when context window may be too small for tool calling"

These are the closest matches, but none appear to be exact duplicates of PR #10758. You should review PR #3726 and #8359 for any overlap in Ollama auto-detection implementation logic.

When limit.output is 0, fall back to reserving 20% of context
(capped at OUTPUT_TOKEN_MAX) instead of hardcoded 32000.
This fixes compaction triggering immediately on small context
models like 16k Ollama models.
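
A rough sketch of that fallback (illustrative only; `OUTPUT_TOKEN_MAX` stands in for the project's existing cap, and its value here is an assumption):

```ts
// Assumed stand-in for the project's existing cap on reserved output tokens.
const OUTPUT_TOKEN_MAX = 32_000

function reservedOutputTokens(contextLimit: number, outputLimit: number): number {
  // If the model declares an output limit, reserve exactly that.
  if (outputLimit > 0) return outputLimit
  // Otherwise reserve 20% of the context window, capped at OUTPUT_TOKEN_MAX,
  // instead of a fixed 32000 reserve that exceeds small context windows
  // (e.g. 16k Ollama models) and makes compaction trigger immediately.
  return Math.min(Math.floor(contextLimit * 0.2), OUTPUT_TOKEN_MAX)
}
```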