feat: CLI-via-Goosed unified agent architecture with multi-agent routing#7238

Draft
bioinfornatics wants to merge 381 commits into block:main from bioinfornatics:feature/cli-via-goosed

@bioinfornatics

CLI-via-Goosed: Unified Agent Architecture

Summary

This PR introduces a unified architecture where the CLI communicates with agents through goosed (the server binary), aligning desktop and CLI on a single communication path. It also adds multi-agent orchestration with an intent router, ACP/A2A protocol compatibility, and comprehensive UI improvements.

Key Changes

🏗️ Architecture: CLI-via-Goosed

  • CLI now communicates through goosed server instead of directly instantiating agents
  • GoosedClient manages server lifecycle (spawn, health check, graceful shutdown)
  • Process discovery & reuse via PID state file (~/.config/goose/goosed.state)
  • goose service install|uninstall|status|logs for managed daemon lifecycle (systemd/launchd)

🤖 Multi-Agent System

  • GooseAgent: 7 behavioral modes (assistant, specialist, recipe_maker, app_maker, app_iterator, judge, planner)
  • CodingAgent: 8 SDLC modes (pm, architect, backend, frontend, qa, security, sre, devsecops)
  • IntentRouter: Keyword-based routing with fuzzy prefix matching and configurable confidence thresholds
  • OrchestratorAgent: LLM-based meta-coordinator with fallback to IntentRouter
  • Internal modes (judge, planner, recipe_maker) filtered from public discovery
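The keyword routing described for IntentRouter can be sketched as follows. This is a minimal illustration, not the PR's implementation: the `Slot` and `Route` types, the scoring rule (fraction of a slot's keywords matched by prefix), and the fallback to the Goose Agent `assistant` mode are all assumptions made for the example.

```rust
/// Hypothetical routable (agent, mode) slot with the keywords that select it.
struct Slot {
    agent: &'static str,
    mode: &'static str,
    keywords: &'static [&'static str],
}

/// Hypothetical result of a routing attempt.
#[derive(Debug, PartialEq)]
struct Route {
    agent: &'static str,
    mode: &'static str,
    confidence: f32,
}

/// Assumed scoring rule: fraction of the slot's keywords that are a prefix
/// of some word in the message ("deploy" matches "deployment").
fn score(slot: &Slot, words: &[String]) -> f32 {
    if slot.keywords.is_empty() {
        return 0.0;
    }
    let hits = slot
        .keywords
        .iter()
        .filter(|kw| words.iter().any(|w| w.starts_with(**kw)))
        .count();
    hits as f32 / slot.keywords.len() as f32
}

/// Pick the best-scoring slot at or above `threshold`; otherwise fall back
/// to a default route (assumed here to be Goose Agent / assistant).
fn route(slots: &[Slot], message: &str, threshold: f32) -> Route {
    let words: Vec<String> = message
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect();
    let mut best: Option<Route> = None;
    for slot in slots {
        let confidence = score(slot, &words);
        let better = best.as_ref().map_or(true, |b| confidence > b.confidence);
        if confidence >= threshold && better {
            best = Some(Route { agent: slot.agent, mode: slot.mode, confidence });
        }
    }
    best.unwrap_or(Route { agent: "Goose Agent", mode: "assistant", confidence: 0.0 })
}
```

With a threshold of 0.5, a message has to hit at least half of a slot's keywords before it leaves the default route, which is one way a configurable confidence threshold can trade recall for precision.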

📡 Protocol Compatibility

  • ACP (Agent Communication Protocol): Full run lifecycle (create → stream → complete/cancel), elicitation, await flows
  • A2A (Agent-to-Agent): Dynamic agent card generation from IntentRouter slots
  • RunStore: Single-mutex design with LRU eviction (MAX_COMPLETED_RUNS=1000), TOCTOU-safe resume
  • ACP-IDE WebSocket: Session mode state with available/current mode tracking, notification forwarding
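A rough sketch of the single-mutex RunStore idea is shown below. It is an assumption-laden simplification: the method names and the two-state `RunStatus` are hypothetical, and eviction here is oldest-completed-first rather than a full LRU. The point it illustrates is that the status check and the transition happen under one lock acquisition, so no other thread can interleave between them.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Mutex;

#[derive(Clone, Copy, Debug, PartialEq)]
enum RunStatus {
    InProgress,
    Completed,
}

#[derive(Default)]
struct Inner {
    runs: HashMap<String, RunStatus>,
    /// Completed run ids, oldest first; drives eviction.
    completed: VecDeque<String>,
}

/// All state sits behind one mutex, so every operation sees a consistent view.
struct RunStore {
    inner: Mutex<Inner>,
    max_completed: usize,
}

impl RunStore {
    fn new(max_completed: usize) -> Self {
        Self { inner: Mutex::new(Inner::default()), max_completed }
    }

    fn create(&self, id: &str) {
        let mut inner = self.inner.lock().unwrap();
        inner.runs.insert(id.to_string(), RunStatus::InProgress);
    }

    /// Check-then-transition inside a single critical section.
    /// Returns false if the run is unknown or already completed.
    fn complete(&self, id: &str) -> bool {
        let mut inner = self.inner.lock().unwrap();
        match inner.runs.get_mut(id) {
            Some(status) if *status == RunStatus::InProgress => {
                *status = RunStatus::Completed;
            }
            _ => return false,
        }
        inner.completed.push_back(id.to_string());
        // Evict the oldest completed runs once the cap is exceeded.
        while inner.completed.len() > self.max_completed {
            if let Some(old) = inner.completed.pop_front() {
                inner.runs.remove(&old);
            }
        }
        true
    }

    fn status(&self, id: &str) -> Option<RunStatus> {
        self.inner.lock().unwrap().runs.get(id).copied()
    }
}
```

In the PR the cap is MAX_COMPLETED_RUNS=1000; the sketch takes it as a constructor argument so a small value can be used when experimenting.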

📊 Analytics & Observability

  • Routing analytics endpoints: POST /analytics/routing/inspect, POST /analytics/routing/eval, GET /analytics/routing/catalog
  • Routing evaluation framework: YAML-based test sets (29 cases), per-agent/mode accuracy metrics, confusion matrix
  • OpenTelemetry spans: orchestrator.route, orchestrator.llm_classify, intent_router.route
  • AgentEvent::PlanCreated: New event variant for orchestration plan tracking

🖥️ UI Improvements

  • WorkBlockIndicator: Collapsible tool-call chains with auto-open streaming, live-update panel
  • Progressive message rendering: Two-tier final answer detection, suppress transient tool call flash
  • Agent management: Dedup agents by ID, mode switching
  • ReasoningDetailPanel: Enhanced for work blocks with streaming support
  • Refactored hooks: useChatStream split into streamReducer.ts + streamDecoder.ts (860→576 lines)

🔒 Security & Reliability

  • Concurrency limit (10) on all /runs endpoints via ServiceBuilder
  • Structured ErrorResponse on all 11 bare StatusCode returns in runs.rs
  • ErrorResponse::bad_request() and conflict() constructors added
  • AcpIdeSessions LRU eviction (MAX_IDE_SESSIONS=100) with idle timeout

Quality Gates

Gate Status
cargo build --all-targets
cargo fmt --check
cargo clippy --all-targets -- -D warnings
cargo test -p goose --lib (789 tests)
cargo test -p goose-server (40 tests)
npx tsc --noEmit
npx vitest run (325/326, 1 pre-existing)
npx eslint
Merge conflict check ✅ Clean

New Test Coverage

  • 14 RunStore lifecycle tests: create/get, status transitions, await/elicitation, cancellation, events, output, errors, pagination, eviction
  • 23 WorkBlock non-regression tests: streaming, completed, tool chains, final answer detection, dual indicator prevention
  • 7 routing evaluation tests: YAML parsing, accuracy thresholds, metrics computation, report generation
  • 6 SSE parser tests: event boundary handling, multi-event buffers, partial data
  • 6 IntentRouter tests: keyword routing, fallback, disabled agents

Routing Evaluation Baseline

Overall: 41.4% (keyword router — LLM router pending)
Goose Agent:  100% (5/5)
Coding Agent:  33% (8/24)
Best modes:  architect 100%, qa 100%
Worst modes: frontend 0%, devsecops 0%, backend 20%
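For readers unfamiliar with how such a baseline is computed, here is a small sketch of overall accuracy, per-label accuracy, and a confusion matrix over (expected, predicted) pairs. The function names and the `Case` alias are hypothetical and the data in the usage note is made up; this is not the PR's evaluation framework, which reads its cases from YAML.

```rust
use std::collections::BTreeMap;

/// (expected label, predicted label) for one evaluation case.
type Case<'a> = (&'a str, &'a str);

/// Fraction of cases where the prediction equals the expectation.
fn overall(cases: &[Case<'_>]) -> f64 {
    if cases.is_empty() {
        return 0.0;
    }
    let correct = cases.iter().filter(|(e, p)| e == p).count();
    correct as f64 / cases.len() as f64
}

/// (correct, total) per expected label.
fn per_label_accuracy<'a>(cases: &[Case<'a>]) -> BTreeMap<&'a str, (usize, usize)> {
    let mut out = BTreeMap::new();
    for &(expected, predicted) in cases {
        let entry = out.entry(expected).or_insert((0usize, 0usize));
        entry.1 += 1;
        if expected == predicted {
            entry.0 += 1;
        }
    }
    out
}

/// Confusion matrix keyed by (expected, predicted) with occurrence counts.
fn confusion<'a>(cases: &[Case<'a>]) -> BTreeMap<(&'a str, &'a str), usize> {
    let mut out = BTreeMap::new();
    for &case in cases {
        *out.entry(case).or_insert(0) += 1;
    }
    out
}
```

For example, with the four illustrative cases `("qa","qa")`, `("qa","backend")`, `("backend","backend")`, `("frontend","backend")`, overall accuracy is 2/4, the `qa` label scores 1 of 2, and the off-diagonal confusion entries show that misrouted requests all land on `backend`.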

Files Changed

  • Rust: ~30 new/modified files across goose, goose-server, goose-cli, goose-mcp
  • TypeScript/React: ~15 new/modified files in ui/desktop
  • Tests: 50+ new tests (Rust + Vitest)
  • Docs: Architecture review, analytics backlog, protocol analysis

Follow-up Work

  • BL-2: Analytics UI dashboard (3-tab React page)
  • BL-3: Live user feedback (👍/👎 on routing)
  • Agent extraction: QA, PM, Security as standalone agents
  • LLM-based router (replace keyword matching)
  • Full OTel dashboard integration

@michaelneale
Collaborator

thanks @bioinfornatics I like that general idea - looks like a lot of work to tidy up conflicts but would like to see what it looks like if you could show it here.

@bioinfornatics force-pushed the feature/cli-via-goosed branch 6 times, most recently from e8dc59e to 4b68302 on February 17, 2026 at 12:59
@DOsinga
Collaborator

DOsinga commented Feb 17, 2026

this is a massive change! I like the ideas, but I think we should discuss some of them separately (where do we want to go with clients?). Also, it seems to introduce 5 big ideas; shouldn't we split those up?

…ution

Implements the execution dispatch layer that bridges the orchestrator's
routing decisions to actual execution backends:

- Dispatcher trait: async dispatch_one/dispatch_all interface
- InProcessDispatcher: local provider-based single-turn execution
- A2ADispatcher: remote agent execution via A2A HTTP client
- CompositeDispatcher: routes to correct backend based on DelegationStrategy
- Concurrent fan-out for compound requests via futures::join_all
- DispatchEvent broadcast for observability (started/progress/completed/failed)
- Full test coverage for serialization, strategy validation, event variants

Re-exports dispatch types through agents::orchestration facade module.

Closes goose4-79bo
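The dispatch_one/dispatch_all shape described in this commit can be illustrated with the sketch below. It is a synchronous stand-in under stated assumptions: the PR's trait is async and fans out with futures::join_all, whereas this example uses scoped threads from the standard library so it stays dependency-free. `SubTask`, `EchoDispatcher`, and the `Result<String, String>` return type are hypothetical.

```rust
use std::thread;

/// Hypothetical unit of work produced by the orchestrator.
#[derive(Debug, Clone, PartialEq)]
struct SubTask {
    agent: String,
    prompt: String,
}

/// Synchronous stand-in for the async Dispatcher trait.
trait Dispatcher: Sync {
    fn dispatch_one(&self, task: &SubTask) -> Result<String, String>;

    /// Fan out all tasks concurrently; results keep the input order.
    fn dispatch_all(&self, tasks: &[SubTask]) -> Vec<Result<String, String>> {
        thread::scope(|scope| {
            let handles: Vec<_> = tasks
                .iter()
                .map(|task| scope.spawn(move || self.dispatch_one(task)))
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().unwrap_or_else(|_| Err("worker panicked".to_string())))
                .collect()
        })
    }
}

/// Toy backend: echoes the prompt, rejects empty prompts.
struct EchoDispatcher;

impl Dispatcher for EchoDispatcher {
    fn dispatch_one(&self, task: &SubTask) -> Result<String, String> {
        if task.prompt.is_empty() {
            return Err(format!("{}: empty prompt", task.agent));
        }
        Ok(format!("[{}] {}", task.agent, task.prompt))
    }
}
```

A composite dispatcher in this style would implement `dispatch_one` by inspecting the task's delegation strategy and forwarding to an in-process or remote backend; one failed sub-task yields an `Err` in its slot without cancelling the others.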
1. Fix Catalog/Workflows menu item swap — removed duplicate '/recipes' from
   Catalog zone, moved Apps from Workflows to Catalog zone
2. Apps no longer nested under Workflows — now properly in Catalog zone
   alongside Agents and Extensions
3. React DevTools: add fallback to load from local Chromium installation
   when electron-devtools-installer fails (known issue with Electron 40+)

Sidebar zones after fix:
  - Workflows: Recipes, Scheduler
  - Catalog: Agents, Extensions, Apps
  - (removed duplicate Workflows entry from Catalog)

Add route: '/recipes' to Workflows zone so clicking the zone label
navigates to the recipes overview (same UX as Catalog -> /catalogs).
Both multi-item zones now behave identically:
  - Label click -> navigates to overview page
  - Chevron click -> toggles collapse/expand of sub-items

Adds a new GET /agents/catalog route that merges three agent sources
into a single view for the Platform zone UI:

1. Builtin agents - from OrchestratorAgent slots with modes/capabilities
2. External ACP agents - from the ACP client manager
3. A2A instances - from the AgentPool with live status

New types: CatalogAgent, CatalogAgentKind, CatalogAgentStatus,
CatalogAgentMode, AgentCatalogResponse

Closes goose4-9ib6
…t dropdown

- Enhanced SessionList with always-on project grouping (≥2 projects or ≥5 sessions)
- Added localStorage-backed project preferences (useProjectPreferences hook):
  - Pinnable projects (pinned always sort to top, visible pin icon)
  - Collapse state persistence across restarts
  - Recent project directories tracking (up to 10)
- Added 'Open Project' dropdown button (FolderPlus icon) next to Chat:
  - 'Browse...' option opens OS directory picker via IPC
  - Recent project dirs listed below for quick access
  - Creates new session in selected directory
- Home icon for 'General' project group (sessions without working_dir)
- 10-project cap with search input when overflow
- groupSessionsByProject now supports pin-aware sorting

- Add CatalogAgent, CatalogAgentKind, CatalogAgentStatus,
  CatalogAgentMode, AgentCatalogResponse to schemas
- Add agent_catalog endpoint to paths
- Regenerate frontend SDK types via just generate-openapi
…rs show parent context

- Temp directories (/tmp/.tmp*, /tmp/tmp*) now grouped under 'General' instead of individual groups
- Home directory sessions (no project context) grouped under 'General'
- Nested subdirectories show parent context: 'goose4 › crates/goose' instead of just 'goose'
- Verified with 278 real sessions: goose4 (101), General (96), goose4 › crates/goose (81)
… dir detection

- General group now sorts first (was last)
- General group always present even with 0 sessions
- Temp dirs (/tmp/.tmp*, /tmp/tmp*) correctly grouped under General
- Home dir detection uses regex patterns instead of unavailable appConfig keys
- Nested subdirs show parent context: 'goose4 › crates/goose'
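The grouping rules from the last few commits (temp and home directories fold into 'General', General sorts first, pinned projects come next) can be sketched as below. The PR implements this in TypeScript (groupSessionsByProject); this Rust rendering only illustrates the logic, and the function names, the tuple-based session shape, and the use of the last path component as the label are assumptions. The parent-context labels such as 'goose4 › crates/goose' are not modeled here.

```rust
use std::collections::BTreeMap;

const GENERAL: &str = "General";

/// Map a session's working directory to a project group label.
fn group_label(working_dir: Option<&str>, home: &str) -> String {
    let Some(dir) = working_dir else {
        return GENERAL.to_string();
    };
    let dir = dir.trim_end_matches('/');
    let is_temp = dir.starts_with("/tmp/.tmp") || dir.starts_with("/tmp/tmp");
    if dir.is_empty() || dir == home.trim_end_matches('/') || is_temp {
        return GENERAL.to_string();
    }
    // Assumed labeling: last path component of the working directory.
    dir.rsplit('/').next().unwrap_or(dir).to_string()
}

/// Group session ids by label: General first, then pinned, then alphabetical.
fn group_sessions(
    sessions: &[(&str, Option<&str>)],
    home: &str,
    pinned: &[&str],
) -> Vec<(String, Vec<String>)> {
    let mut groups: BTreeMap<String, Vec<String>> = BTreeMap::new();
    // General is always present, even with zero sessions.
    groups.entry(GENERAL.to_string()).or_default();
    for &(id, dir) in sessions {
        groups.entry(group_label(dir, home)).or_default().push(id.to_string());
    }
    let mut out: Vec<_> = groups.into_iter().collect();
    out.sort_by_key(|(label, _)| {
        let rank = if label.as_str() == GENERAL {
            0
        } else if pinned.contains(&label.as_str()) {
            1
        } else {
            2
        };
        (rank, label.clone())
    });
    out
}
```

Ranking by a `(rank, label)` key keeps the ordering rule in one place, so adding another tier (for example, recently opened projects) only changes the rank computation.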
When OrchestratorAgent detects a compound request (is_compound=true with
multiple sub-tasks), the reply route now:
1. Iterates over each sub-task in the plan
2. Reconfigures the agent for each sub-task's routing (apply_routing)
3. Executes each sub-task via agent.reply() sequentially
4. Aggregates results using aggregate_results()
5. Streams the combined result back as a single assistant message

This closes the last gap between the orchestrator's routing intelligence
and actual multi-agent execution. Single-task requests continue to use
the existing direct reply path unchanged.

Phase 1: sequential fan-out. Phase 2 will use AgentPool for parallel
execution of independent sub-tasks.
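The Phase 1 sequential fan-out listed above might look roughly like this. Only the names apply_routing, reply, and aggregate_results come from the commit message; the `Plan`, `SubTask`, and `Agent` shapes, the synchronous signatures, and the blank-line join used for aggregation are assumptions for illustration.

```rust
/// Hypothetical sub-task with its routing target.
struct SubTask {
    agent: String,
    mode: String,
    prompt: String,
}

/// Hypothetical orchestration plan.
struct Plan {
    is_compound: bool,
    sub_tasks: Vec<SubTask>,
}

/// Minimal agent stand-in: remembers its current routing and echoes prompts.
#[derive(Default)]
struct Agent {
    agent: String,
    mode: String,
}

impl Agent {
    /// Reconfigure the agent for the sub-task's (agent, mode) pair.
    fn apply_routing(&mut self, task: &SubTask) {
        self.agent = task.agent.clone();
        self.mode = task.mode.clone();
    }

    fn reply(&self, prompt: &str) -> String {
        format!("{}/{}: {}", self.agent, self.mode, prompt)
    }
}

/// Assumed aggregation: join sub-task outputs with blank lines.
fn aggregate_results(results: &[String]) -> String {
    results.join("\n\n")
}

/// Compound plans run each sub-task in order; anything else takes the
/// single-reply path.
fn run_plan(agent: &mut Agent, plan: &Plan) -> String {
    if !plan.is_compound || plan.sub_tasks.len() < 2 {
        return match plan.sub_tasks.first() {
            Some(task) => {
                agent.apply_routing(task);
                agent.reply(&task.prompt)
            }
            None => String::new(),
        };
    }
    let mut results = Vec::with_capacity(plan.sub_tasks.len());
    for task in &plan.sub_tasks {
        agent.apply_routing(task);
        results.push(agent.reply(&task.prompt));
    }
    aggregate_results(&results)
}
```

Because the same agent is reconfigured between sub-tasks, the loop is inherently sequential; a parallel Phase 2 would need one agent instance per sub-task, which is presumably where the AgentPool comes in.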
…ete via API

- Add Trash2 icon and onDeleteSession handler to SessionItem
- Wire deleteSession API call with confirm dialog in AppSidebar
- Navigate home if active session is deleted
- Dispatch SESSION_DELETED event for sidebar state sync
- Pass onDeleteSession through SessionList to flat and grouped views

- Add X button on project group headers (hover to reveal)
- Click confirms then deletes all sessions in that project
- Uses existing DELETE /sessions/{id} API for each session
- Navigates home if active session was in the closed project
- Dispatches SESSION_DELETED events for state sync
- handleCloseProject wired through SessionList → AppSidebar
…er agent

- Add SlotDelegation enum (InProcess, ExternalAcp, RemoteA2A{url})
- Add delegation_strategies field to AgentSlotRegistry
- Add get_delegation, set_delegation, register_a2a_agent, register_acp_agent, unregister_agent methods
- Wire register_acp_agent into connect_agent handler
- Wire unregister_agent into disconnect_agent handler
- Add 4 new tests covering delegation defaults, A2A registration, ACP registration, unregistration
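To make the delegation registry concrete, here is a minimal sketch using the type and method names from the list above. The signatures, the string-keyed map, and the choice of InProcess as the default for unregistered agents are assumptions; the actual AgentSlotRegistry holds more state than this.

```rust
use std::collections::HashMap;

/// How a slot's work is executed.
#[derive(Debug, Clone, PartialEq)]
enum SlotDelegation {
    InProcess,
    ExternalAcp,
    RemoteA2A { url: String },
}

/// Reduced registry: only the delegation_strategies field is modeled.
#[derive(Default)]
struct AgentSlotRegistry {
    delegation_strategies: HashMap<String, SlotDelegation>,
}

impl AgentSlotRegistry {
    /// Agents without an explicit entry are assumed to run in-process.
    fn get_delegation(&self, name: &str) -> SlotDelegation {
        self.delegation_strategies
            .get(name)
            .cloned()
            .unwrap_or(SlotDelegation::InProcess)
    }

    fn set_delegation(&mut self, name: &str, delegation: SlotDelegation) {
        self.delegation_strategies.insert(name.to_string(), delegation);
    }

    fn register_a2a_agent(&mut self, name: &str, url: &str) {
        self.set_delegation(name, SlotDelegation::RemoteA2A { url: url.to_string() });
    }

    fn register_acp_agent(&mut self, name: &str) {
        self.set_delegation(name, SlotDelegation::ExternalAcp);
    }

    /// Removing the entry returns the agent to the default strategy.
    fn unregister_agent(&mut self, name: &str) {
        self.delegation_strategies.remove(name);
    }
}
```

A caller would typically `match registry.get_delegation(name)` and send `RemoteA2A { url }` slots over HTTP while handling the other two variants locally.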
- React Flow v12 (@xyflow/react) based visual pipeline editor
- 7 node types: Trigger, Agent, Tool, Condition, Transform, Human, A2A
- Drag-and-drop NodePalette with color-coded node kinds
- PropertiesPanel with type-specific config forms per node kind
- Undo/redo history (50-step cap)
- YAML/JSON export via clipboard
- Pipeline serialization (flowToPipeline, pipelineToFlow)
- Route at /pipelines, sidebar entry under Workflows zone
- Full TypeScript types for pipeline format (apiVersion goose/v1)
- Semantic design tokens throughout, zero hardcoded colors
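The capped undo/redo history mentioned above is a standard pattern; a sketch follows. The editor itself is TypeScript/React, so this Rust version only illustrates the technique, and the `History` type with its `push`/`undo`/`redo` methods is hypothetical.

```rust
use std::collections::VecDeque;

/// Undo/redo stack with a bounded number of undo steps.
struct History<T> {
    past: VecDeque<T>,
    present: T,
    future: Vec<T>,
    cap: usize,
}

impl<T> History<T> {
    fn new(initial: T, cap: usize) -> Self {
        Self { past: VecDeque::new(), present: initial, future: Vec::new(), cap }
    }

    /// Record a new state; drops the oldest undo step beyond the cap and
    /// discards any redo states.
    fn push(&mut self, next: T) {
        let prev = std::mem::replace(&mut self.present, next);
        self.past.push_back(prev);
        if self.past.len() > self.cap {
            self.past.pop_front();
        }
        self.future.clear();
    }

    /// Step back one state; returns false when there is nothing to undo.
    fn undo(&mut self) -> bool {
        match self.past.pop_back() {
            Some(prev) => {
                let current = std::mem::replace(&mut self.present, prev);
                self.future.push(current);
                true
            }
            None => false,
        }
    }

    /// Step forward one state; returns false when there is nothing to redo.
    fn redo(&mut self) -> bool {
        match self.future.pop() {
            Some(next) => {
                let current = std::mem::replace(&mut self.present, next);
                self.past.push_back(current);
                true
            }
            None => false,
        }
    }
}
```

With `cap` set to 50, the 51st edit silently drops the oldest snapshot, which bounds memory at the cost of not being able to undo all the way back to the initial graph.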
…oard, /active-agents, /health)

- Add observatory.rs route module with 3 endpoints:
  - GET /observatory/dashboard — unified system health + active agents + performance snapshot
  - GET /observatory/active-agents — real-time agent status from pool + registry
  - GET /observatory/health — system health check with agent counts + status
- Register observatory paths and types in OpenAPI schema
- Generate frontend SDK with ObservatoryDashboard, SystemHealth, HealthStatus,
  ActiveAgent, ActiveAgentKind, ActiveAgentStatus, PerformanceSnapshot, TopTool types
- Comment out broken pipeline utoipa references (frontend agent WIP)

- Add all_agent_names() to AgentSlotRegistry for enumerating all known agents
- Replace hardcoded ['Goose Agent', 'Developer Agent'] sync in reply.rs
  with dynamic loop over all registered agents (builtin + external + A2A)
- Comment out pipeline module/routes/openapi refs (frontend agent WIP,
  missing utoipa annotations — will be re-enabled when complete)
- Ensures per-agent extension scoping works for dynamically registered agents

Backend (Rust):
- crates/goose/src/pipeline.rs: Pipeline model, YAML/JSON serialization,
  validation with cycle detection, file-based CRUD storage
- crates/goose-server/src/routes/pipeline.rs: REST API handlers for
  list/get/save/update/delete/validate pipelines with utoipa OpenAPI annotations
- Registered pipeline module in lib.rs, routes/mod.rs, openapi.rs

Frontend (TypeScript):
- PipelineManager.tsx: Full pipeline list view with load/save/delete, wired
  to real API endpoints (listPipelines, savePipeline, getPipeline, deletePipeline)
- Updated App.tsx route to use PipelineManager instead of standalone DagEditor
- OpenAPI regenerated with all pipeline types and endpoints

Pipeline storage: ~/.config/goose/pipelines/*.yaml

Backend (Rust):
- crates/goose/src/pipeline.rs: Pipeline model with YAML/JSON serialization,
  validation with cycle detection, file-based CRUD in ~/.config/goose/pipelines/
- crates/goose-server/src/routes/pipeline.rs: REST API handlers for
  list/get/save/update/delete/validate with utoipa OpenAPI annotations
- Registered pipeline module in lib.rs, routes enabled in mod.rs + openapi.rs

Frontend (TypeScript):
- PipelineManager.tsx: Pipeline list view with create/load/save/delete,
  wired to real API (listPipelines, savePipeline, getPipeline, deletePipeline)
- Updated App.tsx route to use PipelineManager wrapper around DagEditor
- OpenAPI regenerated with all pipeline types and 6 endpoints
- Updated workflows/index.ts exports
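Validation with cycle detection, as mentioned for pipeline.rs, is commonly done with Kahn's topological sort; the sketch below shows that approach. Whether the PR uses Kahn's algorithm or a depth-first search is not stated, and the `has_cycle` function with its string-id node and edge representation is hypothetical.

```rust
use std::collections::{HashMap, VecDeque};

/// Returns true when the directed graph contains at least one cycle.
/// Nodes that appear only in `edges` are added implicitly.
fn has_cycle(nodes: &[&str], edges: &[(&str, &str)]) -> bool {
    let mut indegree: HashMap<&str, usize> = nodes.iter().map(|n| (*n, 0)).collect();
    let mut outgoing: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(from, to) in edges {
        *indegree.entry(to).or_insert(0) += 1;
        indegree.entry(from).or_insert(0);
        outgoing.entry(from).or_default().push(to);
    }

    // Start from nodes that have no incoming edges.
    let mut ready: VecDeque<&str> = indegree
        .iter()
        .filter(|(_, degree)| **degree == 0)
        .map(|(node, _)| *node)
        .collect();

    let mut visited = 0;
    while let Some(node) = ready.pop_front() {
        visited += 1;
        for next in outgoing.get(node).into_iter().flatten() {
            let degree = indegree
                .get_mut(next)
                .expect("every edge endpoint has an indegree entry");
            *degree -= 1;
            if *degree == 0 {
                ready.push_back(*next);
            }
        }
    }

    // Nodes on a cycle never reach indegree 0, so they are never visited.
    visited != indegree.len()
}
```

For a DAG editor this check would run before saving, so that a pipeline such as Trigger → Agent → Tool passes while an edge from Tool back to Agent is rejected.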
Knowledge Graph:
- +29 entities (total: 1,595), +36 relations (total: 2,258)
- New: Pipeline system (backend + frontend), Session management,
  Generative UI integration, Design decisions, Findings

Docs:
- docs/non-obvious-knowledge.mdx: Hard-won insights (OpenAPI pipeline,
  React Flow compat, session grouping gotchas, design tokens, CSP warnings)
- docs/reviews/ui-gap-analysis.mdx: Design vs implementation gap analysis
- docs/reviews/ui-implementation-plan.mdx: 5-phase implementation plan
- docs/reviews/project-sessions-assessment.mdx: Team assessment for
  project-grouped sessions feature

- Check SlotDelegation per sub-task in compound fan-out loop
- RemoteA2A slots dispatch via A2AClient.send_message() over HTTP
- InProcess/ExternalAcp slots continue using agent.reply()
- Extract text from A2A Task artifacts or Message parts
- Log delegation type per sub-task for observability
- Completes the A2A dispatch loop: routing → delegation → execution

- Uncomment pub mod pipeline in routes/mod.rs
- Uncomment pipeline::routes merge in configure()
- Uncomment all pipeline path handlers in openapi.rs
- Uncomment all pipeline schema types in openapi.rs
- Add NodePosition + Viewport to OpenAPI schemas
- Add instances.ts restore step to Justfile generate-openapi
- These were repeatedly commented out by concurrent backend work

- Create WorkflowsOverview with 3 category cards: Recipes, Pipelines, Schedules
- Each card loads real data from API (listSavedRecipes, listPipelines, listSchedules)
- Shows status indicators (active/paused/draft), item counts, search, refresh
- Update sidebar: Workflows zone route /recipes → /workflows
- Add /workflows route in App.tsx
- Pattern matches CatalogsOverview for consistent UX
3 participants