Use Canonical Models to set context window sizes #6723
Conversation
```rust
let max_tokens = model_config.max_output_tokens();

let mut payload = json!({
    "model": model_config.model_name,
    "messages": anthropic_messages,
    "max_tokens": max_tokens,
});
```
Switching Anthropic to model_config.max_output_tokens() changes the fallback from the previous Claude-specific defaults (often 64k/32k) to the global 4096 when canonical output limits are missing, which can severely truncate responses for models not in the canonical registry. Consider preserving the prior Anthropic heuristic as a fallback when max_tokens is unset and no canonical output limit is available.
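A minimal sketch of that fallback ordering, assuming a hypothetical helper name, an illustrative model-name check, and the 64k/32k defaults mentioned above (none of this is code from the PR):

```rust
/// Hypothetical fallback preserving the old Anthropic heuristic: explicit
/// override first, then the canonical output limit, then a Claude-style
/// default. The model-name check and the 64k/32k split are assumptions.
fn anthropic_max_tokens_fallback(
    model_name: &str,
    explicit: Option<i32>,
    canonical: Option<i32>,
) -> i32 {
    explicit.or(canonical).unwrap_or_else(|| {
        if model_name.starts_with("claude") {
            64_000
        } else {
            32_000
        }
    })
}
```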
```diff
-        assert_eq!(gpt4_limit.unwrap().context_limit, 128_000);
-    }
-}
+mod tests {}
```
The previous config route tests were removed and replaced with an empty mod tests {}, even though this PR adds a new /config/canonical-model-info endpoint. Add coverage for at least the 200 path (known model) and the 404 path (unknown model) to prevent regressions.
Suggested change:
```diff
-mod tests {}
+mod tests {
+    use super::*;
+    use std::sync::Arc;
+
+    use axum::body::Body;
+    use axum::http::{Request, StatusCode};
+    use serde_json::json;
+    use tower::ServiceExt; // for `oneshot`
+
+    fn test_app() -> Router {
+        // Construct a test instance of the app router with a default AppState.
+        let state = Arc::new(AppState::default());
+        routes(state)
+    }
+
+    #[tokio::test]
+    async fn canonical_model_info_known_model_returns_200() {
+        let app = test_app();
+        // Use a model name that is expected to be treated as a known/valid model.
+        let body = serde_json::to_vec(&json!({ "model": "known-model" })).unwrap();
+        let request = Request::builder()
+            .method("POST")
+            .uri("/config/canonical-model-info")
+            .header("content-type", "application/json")
+            .body(Body::from(body))
+            .unwrap();
+        let response = app.oneshot(request).await.unwrap();
+        assert_eq!(response.status(), StatusCode::OK);
+    }
+
+    #[tokio::test]
+    async fn canonical_model_info_unknown_model_returns_404() {
+        let app = test_app();
+        // Use a clearly invalid/unknown model name to trigger the 404 path.
+        let body = serde_json::to_vec(&json!({ "model": "this-model-should-not-exist" })).unwrap();
+        let request = Request::builder()
+            .method("POST")
+            .uri("/config/canonical-model-info")
+            .header("content-type", "application/json")
+            .body(Body::from(body))
+            .unwrap();
+        let response = app.oneshot(request).await.unwrap();
+        assert_eq!(response.status(), StatusCode::NOT_FOUND);
+    }
+}
```
```rust
pub fn with_fast(
    mut self,
    fast_model_name: &str,
    provider_name: &str,
) -> Result<Self, ConfigError> {
    // Create a full ModelConfig for the fast model with proper canonical lookup
    let fast_config = ModelConfig::new(fast_model_name)?.with_canonical_limits(provider_name);
    self.fast_model_config = Some(Box::new(fast_config));
    Ok(self)
}
```
with_fast() builds the fast model config via ModelConfig::new(...), so any settings already applied to the main config (e.g., temperature, toolshim, toolshim_model, request_params, max_tokens overrides) won't carry over to the fast model when complete_fast() calls use_fast_model(). Consider deriving the fast config from self.clone() and only overriding the model identity/limits, or merging fast-model limits into a cloned base config, as in the sketch below.
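A sketch of that clone-and-override approach, assuming ModelConfig: Clone and the field names quoted elsewhere in this review (treat both as assumptions):

```rust
// Sketch only: derive the fast config from the fully-configured base so
// temperature, toolshim, and request params carry over, then reset the
// limit fields so the canonical lookup repopulates them for the fast model.
pub fn with_fast(
    mut self,
    fast_model_name: &str,
    provider_name: &str,
) -> Result<Self, ConfigError> {
    let mut fast_config = self.clone();
    fast_config.model_name = fast_model_name.to_string();
    fast_config.context_limit = None;
    fast_config.max_tokens = None;
    let fast_config = fast_config.with_canonical_limits(provider_name);
    self.fast_model_config = Some(Box::new(fast_config));
    Ok(self)
}
```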
```diff
-            config.model_name = fast_model.clone();
-            config
+        if let Some(fast_config) = &self.fast_model_config {
+            *fast_config.clone()
```
use_fast_model() returns the nested fast_model_config as-is, which can drop runtime/session configuration from the primary ModelConfig (e.g., temperature, toolshim, request params) and cause fast completions to behave differently beyond just the model name/limits. Consider returning a clone of self with only the model name and canonical limits replaced, or explicitly merging the fast config into the base config.
Suggested change:
```diff
-            *fast_config.clone()
+            // Start from the current configuration and override only the model identity
+            // and canonical limits from the fast model, preserving runtime/session settings.
+            let mut merged = self.clone();
+            merged.model_name = fast_config.model_name.clone();
+            merged.context_limit = fast_config.context_limit;
+            merged.max_tokens = fast_config.max_tokens;
+            merged
```
DOsinga left a comment:
have a look at some of what copilot says, but let's get this in
crates/goose-acp/src/server.rs (Outdated)
```diff
-    goose::model::ModelConfig::new(&model_id)?
+    let provider_name = config
+        .get_goose_provider()
+        .map_err(|_| anyhow::anyhow!("No provider configured"))?;
```
here and elsewhere do we need to map the error? we're just bubbling the other things up
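For reference, a sketch of plain bubbling, assuming the surrounding function returns anyhow::Result and the config error type implements std::error::Error + Send + Sync:

```rust
// `?` converts any std::error::Error + Send + Sync into anyhow::Error via
// anyhow's blanket From impl, so map_err is only needed when we want to
// substitute a message like "No provider configured".
let provider_name = config.get_goose_provider()?;
```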
```diff
 #[utoipa::path(
     post,
-    path = "/config/pricing",
-    request_body = PricingQuery,
+    path = "/config/canonical-model-info",
+    request_body = ModelInfoQuery,
     responses(
-        (status = 200, description = "Model pricing data retrieved successfully", body = PricingResponse)
+        (status = 200, description = "Model information retrieved successfully", body = ModelInfoResponse),
+        (status = 404, description = "Model not found in canonical registry")
     )
 )]
```
yeah, let's remove the 404 handler
```ts
let data: any;
switch (parseAs) {
  case 'arrayBuffer':
  case 'blob':
  case 'formData':
  case 'json':
  case 'text':
    data = await response[parseAs]();
    break;
```
response.json() will throw if the server returns HTTP 200 with an empty body and no Content-Length: 0 header, a case the removed fallback previously handled, turning a successful response into a client-side error. Consider restoring the empty-body-safe JSON parsing path (read as text, parse if non-empty, else {}), or broadening the empty-body detection to also handle empty text bodies.
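A sketch of the empty-body-safe path the comment describes (the function name is illustrative, not from the PR):

```ts
// Read the body as text first; parse only if non-empty, so an HTTP 200
// with an empty body yields {} instead of a JSON.parse error.
async function parseJsonSafe(response: Response): Promise<unknown> {
  const text = await response.text();
  return text.length > 0 ? JSON.parse(text) : {};
}
```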
```diff
 const prevInputCost =
-  (sessionInputTokens || localInputTokens) * (prevCostInfo.input_token_cost || 0);
+  ((sessionInputTokens || localInputTokens) * (prevCostInfo.input_token_cost || 0)) /
+  1_000_000;
 const prevOutputCost =
-  (sessionOutputTokens || localOutputTokens) * (prevCostInfo.output_token_cost || 0);
+  ((sessionOutputTokens || localOutputTokens) * (prevCostInfo.output_token_cost || 0)) /
+  1_000_000;
```
Using sessionInputTokens || localInputTokens (and the similar output-token assignments) treats 0 as "missing" and will incorrectly fall back when the true token count is zero. Use ?? (if the values can be null/undefined) or an explicit check (e.g., sessionInputTokens > 0 ? ... : ...) so zero is preserved.
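The difference in miniature (values are illustrative):

```ts
const sessionInputTokens: number | undefined = 0; // a real zero, not "missing"
const localInputTokens = 500;
const withOr = sessionInputTokens || localInputTokens; // 500 — the real zero is dropped
const withNullish = sessionInputTokens ?? localInputTokens; // 0 — the zero is preserved
```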
```diff
+    pub fn with_canonical_limits(mut self, provider_name: &str) -> Self {
+        if self.context_limit.is_none() || self.max_tokens.is_none() {
+            if let Some(canonical) = crate::providers::canonical::maybe_get_canonical_model(
+                provider_name,
+                &self.model_name,
+            ) {
+                if self.context_limit.is_none() {
+                    self.context_limit = Some(canonical.limit.context);
+                }
+                if self.max_tokens.is_none() {
+                    self.max_tokens = canonical.limit.output.map(|o| o as i32);
+                }
+            }
+        }
-        if let Ok(val) = std::env::var("GOOSE_CONTEXT_LIMIT") {
-            return Self::validate_context_limit(&val, "GOOSE_CONTEXT_LIMIT").map(Some);
-        }
-
-        // Get the model's limit
-        let model_limit = Self::get_model_specific_limit(model_name);
-
-        // If there's a fast_model, get its limit and use the minimum
-        if let Some(fast_model_name) = fast_model {
-            let fast_model_limit = Self::get_model_specific_limit(fast_model_name);
-
-            // Return the minimum of both limits (if both exist)
-            match (model_limit, fast_model_limit) {
-                (Some(m), Some(f)) => Ok(Some(m.min(f))),
-                (Some(m), None) => Ok(Some(m)),
-                (None, Some(f)) => Ok(Some(f)),
-                (None, None) => Ok(None),
-            }
+        // Try filling remaining gaps from predefined models
+        if self.context_limit.is_none() {
+            if let Some(pm) = find_predefined_model(&self.model_name) {
+                self.context_limit = pm.context_limit;
+            }
+        }
-        } else {
-            Ok(model_limit)
-        }
+
+        self
+    }
```
with_canonical_limits and max_output_tokens() now affect request shaping (e.g., the provider payload's max_tokens), but there are no unit tests covering the canonical-registry fill behavior. Add a focused test that uses a known bundled canonical model to assert that context/output limits get populated as expected.
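A sketch of such a test, assuming the "anthropic"/"claude-sonnet-4" pair exists in the bundled canonical registry and that the limit fields are visible to the test module (adjust both to a real bundled entry):

```rust
#[test]
fn with_canonical_limits_fills_missing_limits() {
    // The provider/model pair is an assumption; pick one known to be bundled.
    let config = ModelConfig::new("claude-sonnet-4")
        .expect("valid model name")
        .with_canonical_limits("anthropic");
    assert!(config.context_limit.is_some(), "context limit should be filled");
    assert!(config.max_tokens.is_some(), "output limit should be filled");
}
```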
```rust
.map(|&model_name| ModelInfo {
    name: model_name.to_string(),
    context_limit: ModelConfig::new_or_fail(model_name)
        .with_canonical_limits(name)
```
The canonical limits lookup is being called with the provider name (name) instead of the model name (model_name). This will always fail to find canonical model data because it's passing the provider name as both the provider and model parameters to with_canonical_limits.
Suggested change:
```diff
-        .with_canonical_limits(name)
+        .with_canonical_limits(model_name)
```
Summary
Replace context_limits across all providers with data from the canonical model registry when available.
Going to do one more push down this path afterwards and try to nuke the KNOWN_MODELS concept completely from both the server and client.