feat: add local inference provider with llama.cpp backend and HuggingFace model management (#6933)
Conversation
Pull request overview
This PR introduces support for running local LLMs via Candle-backed inference, including backend provider integration, REST endpoints for managing model downloads, and desktop UI to configure and select local models.
Changes:
- Add a new `local` provider in the core `goose` crate that loads quantized GGUF models with architecture-aware handling (Llama, Phi, Phi-3) and custom chat templates, plus support for streaming responses.
- Expose new server routes and OpenAPI definitions for listing local models, starting/canceling downloads, querying download progress, and deleting downloaded models, reusing the shared download manager.
- Extend the desktop UI and API client to manage local LLM downloads, display progress, select an active local model, and surface dedicated guidance when the `local` provider is selected.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `ui/desktop/src/components/settings/models/subcomponents/SwitchModelModal.tsx` | Adds a special informational panel when the local provider is selected, guiding users to the Models settings to download local models instead of showing generic provider error UI. |
| `ui/desktop/src/components/settings/models/ModelsSection.tsx` | Embeds the new LocalInferenceSettings card into the Models settings page so users can manage local LLMs alongside cloud providers. |
| `ui/desktop/src/components/settings/localInference/LocalInferenceSettings.tsx` | New React component for listing available local LLMs, starting/canceling downloads with progress visualization, and persisting the selected local model and provider in config. |
| `ui/desktop/src/components/settings/dictation/LocalModelManager.tsx` | Adjusts the dictation local model manager's "show all / recommended only" logic and toggle label to match the new local LLM UI behavior. |
| `ui/desktop/src/api/types.gen.ts` | Extends generated types with LocalLlmModel, LocalModelResponse, ModelTier, and request/response types for the /local-inference/models endpoints. |
| `ui/desktop/src/api/sdk.gen.ts` | Adds typed client functions (listLocalModels, downloadLocalModel, getLocalModelDownloadProgress, cancelLocalModelDownload, deleteLocalModel) for the new local inference endpoints. |
| `ui/desktop/src/api/index.ts` | Re-exports the new local inference client functions and types so the UI can consume them through the existing API barrel. |
| `ui/desktop/openapi.json` | Updates the desktop-copied OpenAPI spec with the new /local-inference/models paths, schemas for LocalLlmModel, LocalModelResponse, and ModelTier, and ties them into the existing components section. |
| `crates/goose/src/providers/mod.rs` | Registers the new local_inference provider module in the providers namespace. |
| `crates/goose/src/providers/local_inference.rs` | Implements the LocalInferenceProvider with model metadata, recommendation logic, GGUF loading for Llama/Phi/Phi-3, prompt templating, and both non-streaming and streaming completion APIs. |
| `crates/goose/src/providers/init.rs` | Adds LocalInferenceProvider to the provider registry so it can be created and used like other built-in providers. |
| `crates/goose/src/dictation/download_manager.rs` | Generalizes the download manager to optionally set an arbitrary config key/value on successful download completion, so it can be reused for both dictation and local LLM models. |
| `crates/goose/Cargo.toml` | Registers new examples (candle_quantized, test_local_provider) to exercise the local quantized model and provider behavior. |
| `crates/goose-server/src/routes/utils.rs` | Treats the local provider as always "configured" in provider status checks, bypassing API key requirements since local models are configured via file downloads. |
| `crates/goose-server/src/routes/mod.rs` | Wires the new local_inference route module into the main router so its endpoints are served. |
| `crates/goose-server/src/routes/local_inference.rs` | Defines REST handlers for listing local models, starting downloads of model and tokenizer files, querying combined download progress, canceling downloads, and deleting model/tokenizer files. |
| `crates/goose-server/src/routes/dictation.rs` | Updates dictation model downloads to use the new generalized download manager signature and set the dictation config key on download completion. |
| `crates/goose-server/src/openapi.rs` | Registers the new local inference routes and schemas (LocalModelResponse, LocalLlmModel, ModelTier) with utoipa so they appear in the server's OpenAPI output. |
```rust
fn convert_error(e: anyhow::Error) -> ErrorResponse {
    let error_msg = e.to_string();

    if error_msg.contains("not configured") || error_msg.contains("not found") {
        ErrorResponse {
            message: error_msg,
            status: StatusCode::PRECONDITION_FAILED,
        }
    } else if error_msg.contains("already in progress") {
        ErrorResponse {
            message: error_msg,
            status: StatusCode::BAD_REQUEST,
        }
    } else {
        ErrorResponse::internal(error_msg)
    }
}
```
`convert_error` maps errors containing "not found" to HTTP 412 PRECONDITION_FAILED, but the OpenAPI annotations and generated TypeScript types for the local inference download-cancel route document a 404 "Download not found" error, so a missing download currently produces a status that does not match the declared API. Updating this branch to return 404 for missing downloads will align server behavior with the documented contract and client expectations.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
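A minimal sketch of the remapping the review suggests, with `ErrorResponse` reduced to a plain struct and status codes as bare `u16` values for illustration (the real handler takes an `anyhow::Error` and uses axum's `StatusCode`):

```rust
// Illustrative stand-in for the server's ErrorResponse type.
#[derive(Debug)]
struct ErrorResponse {
    message: String,
    status: u16,
}

impl ErrorResponse {
    fn internal(message: String) -> Self {
        ErrorResponse { message, status: 500 }
    }
}

// Map error messages to statuses, checking "not found" first so a missing
// download yields 404 as documented, instead of falling into the 412 branch.
fn convert_error(error_msg: String) -> ErrorResponse {
    if error_msg.contains("not found") {
        ErrorResponse { message: error_msg, status: 404 }
    } else if error_msg.contains("not configured") {
        ErrorResponse { message: error_msg, status: 412 }
    } else if error_msg.contains("already in progress") {
        ErrorResponse { message: error_msg, status: 400 }
    } else {
        ErrorResponse::internal(error_msg)
    }
}
```

The key change is only the branch order and the 404 status; the other mappings stay as the original code had them.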
```rust
// Download tokenizer file (set config and provider when this completes)
// We'll set GOOSE_PROVIDER to "local" after the tokenizer download completes
// This is handled in the download_manager callback
```
The comment here says that GOOSE_PROVIDER will be set to "local" when the tokenizer download completes, but the download_model call only passes LOCAL_LLM_MODEL_CONFIG_KEY as the config key, so GOOSE_PROVIDER/GOOSE_MODEL are never updated in this path; either the comment should be corrected or the code should also set those config values if that behavior is required.
Suggested change:

```diff
-// Download tokenizer file (set config and provider when this completes)
-// We'll set GOOSE_PROVIDER to "local" after the tokenizer download completes
-// This is handled in the download_manager callback
+// Download tokenizer file and update the local LLM model config when this completes.
+// This is handled in the download_manager callback using LOCAL_LLM_MODEL_CONFIG_KEY.
```
```rust
// Special override
if metadata.name == "local" {
    return true;
```
The hardcoded check for provider name "local" bypasses the normal configuration check. While this may be intentional for local models that don't require auth, it should verify that at least one model is downloaded before returning true. Otherwise, the UI may show the provider as configured when no models are available for use.
Suggested change:

```diff
-// Special override
-if metadata.name == "local" {
-    return true;
+// Special override for local provider: only consider it configured if at least one model is available.
+if metadata.name == "local" {
+    if let Ok(loaded_provider) = load_provider(metadata.name.as_str()) {
+        if !loaded_provider.models.is_empty() {
+            return true;
+        }
+    }
```
```rust
) -> Option<String> {
    if let Ok(provider_guard) = self.provider.try_lock() {
        if let Some(provider) = provider_guard.as_ref() {
            if provider.get_model_config().context_limit() < 9 * 1024 * 1024 {
```
Context limit check has incorrect calculation. Line 1485 compares context_limit() against 9 * 1024 * 1024 (9 million tokens), but context_limit() returns token count, not bytes. This threshold is unrealistically high - even the largest models have ~200K token limits. This should be 9 * 1024 (9K tokens) instead.
Suggested change:

```diff
-if provider.get_model_config().context_limit() < 9 * 1024 * 1024 {
+if provider.get_model_config().context_limit() < 9 * 1024 {
```
```rust
eprintln!(
    "DEBUG: Loading platform extension '{}': {}",
    name, provider_state
```
Debug logging using eprintln! should be removed or converted to tracing::debug!. These debug statements were likely added during development and should not be left in production code.
Suggested change:

```diff
-eprintln!(
-    "DEBUG: Loading platform extension '{}': {}",
-    name, provider_state
+tracing::debug!(
+    "Loading platform extension '{}': {}",
+    name,
+    provider_state
```
```rust
let context_limit = context.get_context_limit();
eprintln!(
    "DEBUG: TodoClient::new - context_limit from provider: {:?}",
    context_limit
);
```
Debug logging using eprintln! should be removed or converted to tracing::debug!. These debug statements were likely added during development and should not be left in production code.
```rust
eprintln!(
    "DEBUG: AppsManagerClient::new - model: {:?}, context_limit: {:?}",
    model_name, context_limit
);

match context.require_min_context(10_000, EXTENSION_NAME) {
    Ok(_) => eprintln!("DEBUG: AppsManagerClient context check PASSED"),
    Err(e) => {
        eprintln!("DEBUG: AppsManagerClient context check FAILED: {}", e);
        return Err(e.to_string());
    }
}
```
Debug logging using eprintln! should be removed or converted to tracing::debug!. These debug statements were likely added during development and should not be left in production code.
```rust
crate::prompt_template::render_template("tiny_model_system.md", &context)
    .unwrap_or_else(|e| {
        // Fallback if template fails to load
        eprintln!("WARNING: Failed to load tiny_model_system.md: {:?}", e);
```
Error logging using eprintln! should use tracing::error! or tracing::warn! instead for consistency with the rest of the codebase.
Suggested change:

```diff
-eprintln!("WARNING: Failed to load tiny_model_system.md: {:?}", e);
+tracing::warn!("Failed to load tiny_model_system.md: {:?}", e);
```
```tsx
const pollDownloadProgress = (modelId: string) => {
  const interval = setInterval(async () => {
    try {
      const response = await getLocalModelDownloadProgress({ path: { model_id: modelId } });
      if (response.data) {
        const progress = response.data;
        setDownloads((prev) => new Map(prev).set(modelId, progress));

        if (progress.status === 'completed') {
          clearInterval(interval);
          await loadModels(); // Refresh model list
          // Auto-select the model that was just downloaded
          await selectModel(modelId);
        } else if (progress.status === 'failed') {
          clearInterval(interval);
          await loadModels();
        }
      } else {
        clearInterval(interval);
      }
    } catch {
      clearInterval(interval);
    }
  }, 500);
};
```
Memory leak: interval is not stored and cannot be cleared on component unmount. Store the interval ID in a ref or state and clear it in useEffect cleanup to prevent the polling from continuing after the component unmounts.
And only allow selection of models that have been downloaded
* origin/main: (54 commits)
  chore: strip posthog for sessions/models/daily only (#7079)
  tidy: clean up old benchmark and add gym (#7081)
  fix: use command.process_group(0) for CLI providers, not just MCP (#7083)
  added build notify (#6891)
  test(mcp): add image tool test and consolidate MCP test fixtures (#7019)
  fix: remove Option from model listing return types, propagate errors (#7074)
  fix: lazy provider creation for goose acp (#7026) (#7066)
  Smoke tests: split compaction test and use debug build (#6984)
  fix(deps): trim bat to resolve RUSTSEC-2024-0320 (#7061)
  feat: expose AGENT_SESSION_ID env var to extension child processes (#7072)
  fix: add XML tool call parsing fallback for Qwen3-coder via Ollama (#6882)
  Remove clippy too_many_lines lint and decompose long functions (#7064)
  refactor: move disable_session_naming into AgentConfig (#7062)
  Add global config switch to disable automatic session naming (#7052)
  docs: add blog post - 8 Things You Didn't Know About Code Mode (#7059)
  fix: ensure animated elements are visible when prefers-reduced-motion is enabled (#7047)
  Show recommended model on failture (#7040)
  feat(ui): add session content search via API (#7050)
  docs: fix img url (#7053)
  Desktop UI for deleting custom providers (#7042)
  ...
The shell tool returns two Content::text items: one for the assistant and one for the user. For short outputs both contain identical text. The callback in code_execution_extension was joining all text content without filtering by audience, causing the output to appear twice. Filter to only include content targeted at the assistant (or with no audience restriction) before joining text results.
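The fix described above can be sketched as follows, with `Content`, `Role`, and the audience field simplified into stand-in types (the actual types come from the MCP content model in the goose crates):

```rust
// Simplified stand-ins for the MCP content types used by the shell tool.
#[derive(Clone, Copy, PartialEq)]
enum Role {
    Assistant,
    User,
}

struct Content {
    text: String,
    // None means no audience restriction (visible to everyone).
    audience: Option<Vec<Role>>,
}

/// Join only the text items targeted at the assistant, or with no audience
/// restriction. Items addressed solely to the user are skipped, so the
/// duplicated user-facing copy of a short shell output is not joined twice.
fn join_for_assistant(items: &[Content]) -> String {
    items
        .iter()
        .filter(|c| {
            c.audience
                .as_ref()
                .map(|a| a.contains(&Role::Assistant))
                .unwrap_or(true)
        })
        .map(|c| c.text.as_str())
        .collect::<Vec<_>>()
        .join("\n")
}
```

With this filter, the assistant-targeted and unrestricted items come through once, and the user-only duplicate is dropped.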
* origin/main: (30 commits)
  docs: GCP Vertex AI org policy filtering & update OnboardingProviderSetup component (#7125)
  feat: replace subagent and skills with unified summon extension (#6964)
  feat: add AGENT=goose environment variable for cross-tool compatibility (#7017)
  fix: strip empty extensions array when deeplink also (#7096)
  [docs] update authors.yaml file (#7114)
  Implement manpage generation for goose-cli (#6980)
  docs: tool output optimization (#7109)
  Fix duplicated output in Code Mode by filtering content by audience (#7117)
  Enable tom (Top Of Mind) platform extension by default (#7111)
  chore: added notification for canary build failure (#7106)
  fix: fix windows bundle random failure and optimise canary build (#7105)
  feat(acp): add model selection support for session/new and session/set_model (#7112)
  fix: isolate claude-code sessions via stream-json session_id (#7108)
  ci: enable agentic provider live tests (claude-code, codex, gemini-cli) (#7088)
  docs: codex subscription support (#7104)
  chore: add a new scenario (#7107)
  fix: Goose Desktop missing Calendar and Reminders entitlements (#7100)
  Fix 'Edit In Place' and 'Fork Session' features (#6970)
  Fix: Only send command content to command injection classifier (excluding part of tool call dict) (#7082)
  Docs: require auth optional for custom providers (#7098)
  ...
Save uses tempfile + rename for atomic writes and fs2 advisory file locks for cross-process safety, preventing corruption from concurrent writes or crashes mid-write.
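A minimal sketch of the tempfile-plus-rename half of this pattern using only the standard library; the fs2 advisory locking is omitted, and `save_atomically` is an illustrative name rather than the actual function:

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

fn save_atomically(path: &Path, contents: &[u8]) -> std::io::Result<()> {
    // Write to a sibling temp file in the same directory, so the final rename
    // stays on one filesystem (rename is only atomic within a filesystem).
    let tmp = path.with_extension("tmp");
    {
        let mut f = fs::File::create(&tmp)?;
        f.write_all(contents)?;
        // Flush to disk before the rename makes the new contents visible.
        f.sync_all()?;
    }
    // On POSIX, rename atomically replaces the destination: readers observe
    // either the old file or the new one, never a half-written file.
    fs::rename(&tmp, path)
}
```

A crash before the rename leaves the original file untouched; only the temp file is orphaned. The advisory lock in the real code additionally serializes writers across processes, which the rename alone does not.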
Avoids potential deadlock when spawn_blocking thread pool is saturated and another task holds the lock waiting for a slot.
…al inference Was awkwardly nested under dictation despite being used by both dictation and local inference. cleanup_partial_downloads already handles missing directories gracefully via if-let on read_dir.
jamadeo left a comment:
Very nice! I'm relying a bit on the copilot review here as well but this looks great to me.
```rust
) -> Option<String> {
    if let Ok(provider_guard) = self.provider.try_lock() {
        if let Some(provider) = provider_guard.as_ref() {
            if provider.get_model_config().context_limit() < 9 * 1024 * 1024 {
```
Is the intention here to omit this content for models with small context windows?
crates/goose/src/providers/local_inference/inference_emulated_tools.rs
Replace config_key/config_value params with a generic on_complete callback, so the download manager doesn't need to know about config.
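A hedged sketch of that refactor; `Download`, `OnComplete`, and `finish_download` are illustrative names, not the actual download-manager API:

```rust
// A boxed one-shot callback: the caller decides what happens on success.
type OnComplete = Box<dyn FnOnce(&str) + Send>;

struct Download {
    model_id: String,
    on_complete: Option<OnComplete>,
}

fn finish_download(mut d: Download) -> String {
    // ... file download would happen here ...

    // On success, invoke the caller-supplied callback. The local-inference
    // route can set its model config key here, dictation can set its own key,
    // and the manager itself never needs to know about config at all.
    if let Some(cb) = d.on_complete.take() {
        cb(&d.model_id);
    }
    d.model_id
}
```

Compared with threading `config_key`/`config_value` parameters through, the callback keeps the download manager decoupled from the config layer and lets each caller run arbitrary completion logic.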
Nothing reads this environment variable.
- Fix undefined 'featured' variable, use response.data instead
- Remove onFallbackRequest prop not in AppRendererProps
- Clean up unused imports and state
- Change MIN_CONTEXT_FOR_MOIM from 9MB (unreachable) to 32K tokens
- Remove embedding models from expected model list (filtered by tool_call)
Were these changes to the MCP App renderer intentional? This PR removed @alexhancock's work that was merged in #7039.
PR #6933 (local inference provider) was based on a stale branch that didn't include #7039's sampling changes to McpAppRenderer.tsx. When it merged, it silently reverted all client-side sampling plumbing: - RequestHandlerExtra and JSONRPCRequest imports - SamplingCreateMessageParams/Response type imports - apiHost/secretKey state + initialization useEffect - handleFallbackRequest callback (sampling/createMessage handler) - onFallbackRequest prop on AppRenderer The server-side route (sampling.rs), types.ts, and chat.html all survived — only the McpAppRenderer.tsx wiring was lost. This restores the exact code that #7039 added and #6933 removed.
* main: (46 commits)
  chore(deps): bump hono from 4.11.9 to 4.12.0 in /ui/desktop (#7369)
  Include 3rd-party license copy for JavaScript/CSS minified files (#7352)
  docs for reasoning env var (#7367)
  docs: update skills detail page to reference Goose Summon extension (#7350)
  fix(apps): restore MCP app sampling support reverted by #6933 (#7366)
  feat: TUI client of goose-acp (#7362)
  docs: agent variable (#7365)
  docs: pass env vars to shell (#7361)
  docs: update sandbox topic (#7336)
  feat: add local inference provider with llama.cpp backend and HuggingFace model management (#6933)
  Docs: claude code uses stream-json (#7358)
  Improve link confirmation modal (#7333)
  fix(ci): deflake smoke tests for Google models (#7344)
  feat: add Cerebras provider support (#7339)
  fix: skip whitespace-only text blocks in Anthropic message (#7343)
  fix(goose-acp): heap allocations (#7322)
  Remove trailing space from links (#7156)
  fix: detect low balance and prompt for top up (#7166)
  feat(apps): add support for MCP apps to sample (#7039)
  Typescript SDK for ACP extension methods (#7319)
  ...
* origin/main: (21 commits)
  feat(ui): show token counts directly for "free" providers (#7383)
  Update creator note (#7384)
  Remove display_name from local model API and use model ID everywhere (#7382)
  fix(summon): stop MOIM from telling models to sleep while waiting for tasks (#7377)
  Completely pointless ascii art (#7329)
  feat: add Neighborhood extension to the Extensions Library (#7328)
  feat: computer controller overhaul, adding peekaboo (#7342)
  Add blog post: Gastown Explained: How to Use Goosetown for Parallel Agentic Engineering (#7372)
  docs: type-to-search goose configure lists (#7371)
  docs: search conversation history (#7370)
  fix: stderr noise (#7346)
  chore(deps): bump hono from 4.11.9 to 4.12.0 in /ui/desktop (#7369)
  Include 3rd-party license copy for JavaScript/CSS minified files (#7352)
  docs for reasoning env var (#7367)
  docs: update skills detail page to reference Goose Summon extension (#7350)
  fix(apps): restore MCP app sampling support reverted by #6933 (#7366)
  feat: TUI client of goose-acp (#7362)
  docs: agent variable (#7365)
  docs: pass env vars to shell (#7361)
  docs: update sandbox topic (#7336)
  ...

  # Conflicts:
  #	Cargo.lock
… HuggingFace model management (block#6933)" This reverts commit ddd35f6.
…kend and HuggingFace model management (block#6933)"" This reverts commit b9bd830.
Summary
Adds a local inference provider that enables running language models directly on-device using llama.cpp, with full integration across CLI, server, and desktop UI.
Core Changes
- **Local inference provider** (`crates/goose/src/providers/local_inference/`): New provider with modular architecture split into submodules:
  - `inference_native_tools`: Inference path for models with native tool-calling support
  - `inference_emulated_tools`: Inference path for models without native tool support, using text-based `$ command` and `` ```execute `` block detection
  - `inference_engine`: Shared inference primitives: context management, sampling, and token generation
  - `tool_parsing`: Parsing tool calls from model output (JSON and XML formats)
  - `hf_models`: HuggingFace Hub model search and GGUF file discovery
  - `local_model_registry`: Local model download and lifecycle management
- **Server routes** (`crates/goose-server/src/routes/local_inference.rs`): REST endpoints for model download, listing, deletion, and status
- **CLI integration** (`crates/goose-cli/`): New `local-models` subcommand for managing downloaded models, plus provider selection support
- **Desktop UI** (`ui/desktop/src/components/`):
  - `LocalModelSetup.tsx`: First-run setup flow for downloading models
  - `LocalInferenceSettings.tsx`: Settings panel for model management
  - `HuggingFaceModelSearch.tsx`: Search and download models from HuggingFace
  - `ModelSettingsPanel.tsx`: Per-model configuration

Additional Changes
Known Limitations (deferred to future iterations)
- **Memory management is best-effort:** `estimate_max_context_for_memory` uses 50% of available memory for KV cache, which is a rough heuristic. `available_inference_memory_bytes` picks the max of any single accelerator device, which may not be correct for multi-GPU setups or unified memory architectures. On unified-memory Macs, free memory fluctuates with system load, so a model that fits during estimation may OOM during inference. There is currently no fallback or recovery for OOM; it will crash the process.
- **Model unloading strategy is naive:** When loading a new model, all other models are unloaded. This is fine for single-model usage, but the code structure (a `HashMap` of model slots) suggests multi-model was considered. A smarter eviction strategy (e.g. LRU) could be added if multi-model support is needed.
- **Featured models list is hardcoded:** The `FEATURED_MODELS` constant in `local_model_registry.rs` is a static list of HuggingFace model specs. There is no mechanism to update it without a code change, so it will go stale as newer/better models are released. A future iteration could fetch this list from a remote config or allow user customization.
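As a rough illustration of the 50%-of-memory KV-cache heuristic mentioned above (the function name mirrors `estimate_max_context_for_memory`, but this exact signature and the per-token byte figure used in the example are assumptions for the sketch, not the actual implementation):

```rust
/// Estimate how many context tokens fit when half of available memory is
/// reserved for the KV cache. This is the rough heuristic described in the
/// limitations section, not a guarantee: on unified-memory systems the
/// "available" figure can shrink between estimation and inference.
fn estimate_max_context_for_memory(
    available_bytes: u64,
    kv_bytes_per_token: u64,
) -> u64 {
    let kv_budget = available_bytes / 2; // 50% heuristic
    kv_budget / kv_bytes_per_token
}
```

For example, with 16 GiB reported available and an assumed 512 KiB of KV cache per token, the budget is 8 GiB, which works out to 16,384 tokens of context.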