Conversation
There was a problem hiding this comment.
Pull request overview
This PR implements automatic tool call/response summarization to manage context size, along with infrastructure improvements for message identification. The summarization runs asynchronously to avoid adding latency to the agent loop.
Key Changes
- Database schema migration (v7) adds
message_idcolumn with index for reliable message tracking - New background task automatically summarizes old tool call/response pairs when threshold is exceeded (configurable via
GOOSE_TOOL_CALL_CUTOFF) - All messages now receive a
message_ideither explicitly viawith_generated_id()or implicitly during database insertion
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/goose/src/session/session_manager.rs | Adds schema v7 migration for message_id column, implements update_message_metadata for toggling visibility, ensures message_id is set during add_message |
| crates/goose/src/conversation/message.rs | Adds with_generated_id() helper method for creating messages with UUIDs |
| crates/goose/src/context_mgmt/mod.rs | Implements tool call summarization logic: tool_id_to_summarize, summarize_tool_call, and maybe_summarize_tool_pair with comprehensive test coverage |
| crates/goose/src/agents/agent.rs | Integrates async summarization task, adds tool_call_cut_off config, marks summarized messages as agent_invisible, ensures elicitation messages have IDs |
| crates/goose-cli/src/commands/term.rs | Ensures term log messages have generated IDs before database insertion |
| crates/goose/src/providers/api_client.rs | Removes debug logging of LLM request payloads |
| Your task is to summarize a tool call & response pair to save tokens | ||
|
|
||
| reply with a single message that described what happened. Typically a toolcall | ||
| is asks for something using a bunch of parameters and then the result is also some |
There was a problem hiding this comment.
Grammar error: "a toolcall is asks" should be "a toolcall asks".
| is asks for something using a bunch of parameters and then the result is also some | |
| asks for something using a bunch of parameters and then the result is also some |
crates/goose/src/agents/agent.rs
Outdated
| if let Some(id) = &msg.id { | ||
| SessionManager::update_message_metadata(&session_config.id, id, |metadata| { | ||
| metadata.with_agent_invisible() | ||
| }).await?; |
There was a problem hiding this comment.
If msg.id is None, the metadata update will be silently skipped, leaving the database in an inconsistent state where the in-memory conversation has agent_invisible metadata but the database doesn't. This could cause issues after session reload. Consider adding an else branch that logs a warning or ensures all messages have IDs before this point.
| }).await?; | |
| }).await?; | |
| } else { | |
| warn!("Message without id encountered when updating metadata; database update skipped. This may cause inconsistency between in-memory and persisted state."); |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| } | ||
|
|
||
| for content in &msg.content { | ||
| if let MessageContent::ToolRequest(req) = content { |
There was a problem hiding this comment.
Should we filter on the response size? Is it worth bothering trying to summarize if its very small.
There was a problem hiding this comment.
I tried to measure; an empty tool call/response is 300 tokens, a summary of one sentence (an ls was executed and three files, one of them readme, was found) is about 100, so even that does save something
| } | ||
| } | ||
| conversation = Conversation::new_unvalidated(updated_messages); | ||
| messages_to_add.push(summary_msg); |
There was a problem hiding this comment.
Is this summary getting added to the end? (rather than where the original tool call was?)
There was a problem hiding this comment.
well, sort of. the summary message gets the same timestamp as the toolcall, so it should sort the right way
There was a problem hiding this comment.
ah got it, missed the timestamp sorting.
…oose into summarize-old-tool-calls
Resolved merge conflicts: - session_manager.rs: Kept schema version 7 (includes message_id migration) - agent.rs: Adopted main's execute_command implementation - agent.rs: Adopted main's with_tool_request_with_metadata for better metadata handling Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Fixed drain_elicitation_messages to use manager instance - Fixed prepare_reply_context to use proper function signatures - Fixed apply_migration to use static pool parameter - Fixed SessionManager::instance() call (no longer async) - Added session_id parameter to summarize_tool_call chain - Added task field to CallToolRequestParam in tests - All clippy checks passing Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
| pub async fn update_message_metadata<F>(id: &str, message_id: &str, f: F) -> Result<()> | ||
| where | ||
| F: FnOnce( | ||
| crate::conversation::message::MessageMetadata, | ||
| ) -> crate::conversation::message::MessageMetadata, | ||
| { | ||
| Self::instance() | ||
| .storage |
There was a problem hiding this comment.
SessionManager::update_message_metadata is an associated function that always uses Self::instance() (global DB), which bypasses any non-global SessionManager (e.g., tests or alternate data dirs) and can update the wrong storage; make it a &self method and call it via the session_manager already in scope.
| pub async fn update_message_metadata<F>(id: &str, message_id: &str, f: F) -> Result<()> | |
| where | |
| F: FnOnce( | |
| crate::conversation::message::MessageMetadata, | |
| ) -> crate::conversation::message::MessageMetadata, | |
| { | |
| Self::instance() | |
| .storage | |
| pub async fn update_message_metadata<F>( | |
| &self, | |
| id: &str, | |
| message_id: &str, | |
| f: F, | |
| ) -> Result<()> | |
| where | |
| F: FnOnce( | |
| crate::conversation::message::MessageMetadata, | |
| ) -> crate::conversation::message::MessageMetadata, | |
| { | |
| self.storage |
| use utoipa::ToSchema; | ||
|
|
||
| pub const CURRENT_SCHEMA_VERSION: i32 = 6; | ||
| pub const CURRENT_SCHEMA_VERSION: i32 = 7; |
There was a problem hiding this comment.
CURRENT_SCHEMA_VERSION is bumped to 7, but create_schema() still creates messages without a message_id column, so fresh DBs will be marked v7 yet get_conversation/add_message will query/insert message_id and fail; update CREATE TABLE messages (and initial indexes) to include message_id from the start.
| pub const CURRENT_SCHEMA_VERSION: i32 = 7; | |
| pub const CURRENT_SCHEMA_VERSION: i32 = 6; |
| sqlx::query( | ||
| r#" | ||
| INSERT INTO messages (session_id, role, content_json, created_timestamp, metadata_json) | ||
| VALUES (?, ?, ?, ?, ?) | ||
| INSERT INTO messages (message_id, session_id, role, content_json, created_timestamp, metadata_json) | ||
| VALUES (?, ?, ?, ?, ?, ?) | ||
| "#, |
There was a problem hiding this comment.
replace_conversation_inner still inserts rows without message_id, which will leave it NULL after schema v7 and cause messages loaded via get_conversation to have id = None (and make later metadata updates by message_id impossible); update replace_conversation_inner to populate message_id the same way as add_message.
| crate::conversation::message::MessageMetadata, | ||
| ) -> crate::conversation::message::MessageMetadata, | ||
| { | ||
| let mut tx = self.pool.begin().await?; |
There was a problem hiding this comment.
update_message_metadata starts a transaction on self.pool directly, bypassing the pool().await? initialization/migration path; use let pool = self.pool().await? before beginning the transaction to ensure the schema is initialized.
| let mut tx = self.pool.begin().await?; | |
| let pool = self.pool().await?; | |
| let mut tx = pool.begin().await?; |
| let current_metadata_json = sqlx::query_scalar::<_, String>( | ||
| "SELECT metadata_json FROM messages WHERE message_id = ? AND session_id = ?", | ||
| ) | ||
| .bind(message_id) | ||
| .bind(session_id) | ||
| .fetch_one(&mut *tx) | ||
| .await?; | ||
|
|
||
| let current_metadata: crate::conversation::message::MessageMetadata = | ||
| serde_json::from_str(¤t_metadata_json)?; | ||
|
|
There was a problem hiding this comment.
metadata_json is nullable in the schema (and older rows may have NULL), but update_message_metadata fetches it as String, so this will error for NULL rows; fetch Option<String> and treat None as MessageMetadata::default() when applying the update.
| sqlx::query("CREATE INDEX idx_messages_message_id ON messages(message_id)") | ||
| .execute(pool) | ||
| .await?; |
There was a problem hiding this comment.
Migration v7 creates an index on messages(message_id) only, but the new lookup path filters by both session_id and message_id; consider indexing (session_id, message_id) (or making it UNIQUE per-session) to avoid full scans on large histories.
| sqlx::query("CREATE INDEX idx_messages_message_id ON messages(message_id)") | |
| .execute(pool) | |
| .await?; | |
| sqlx::query( | |
| "CREATE INDEX idx_messages_message_id ON messages(session_id, message_id)" | |
| ) | |
| .execute(pool) | |
| .await?; |
| use tracing::log::warn; | ||
| use tracing::{debug, info}; | ||
|
|
There was a problem hiding this comment.
use tracing::log::warn; requires the tracing/log compatibility feature and is inconsistent with other modules using tracing::warn; switch to use tracing::warn; (or use tracing::{debug, info, warn};) to avoid a potential compile failure.
| use tracing::log::warn; | |
| use tracing::{debug, info}; | |
| use tracing::{debug, info, warn}; |
|
@DOsinga still want to get this in? |
| sqlx::query( | ||
| r#" | ||
| INSERT INTO messages (session_id, role, content_json, created_timestamp, metadata_json) | ||
| VALUES (?, ?, ?, ?, ?) | ||
| INSERT INTO messages (message_id, session_id, role, content_json, created_timestamp, metadata_json) | ||
| VALUES (?, ?, ?, ?, ?, ?) | ||
| "#, |
There was a problem hiding this comment.
replace_conversation_inner still inserts into messages without message_id, so any conversation replacements (including legacy import into a fresh v7 schema) will create rows with NULL message_id and get_conversation will return Message { id: None, .. }; update that insert to include/bind a generated message_id the same way add_message does.
| crate::conversation::message::MessageMetadata, | ||
| ) -> crate::conversation::message::MessageMetadata, | ||
| { | ||
| let mut tx = self.pool.begin().await?; |
There was a problem hiding this comment.
update_message_metadata starts a transaction with self.pool.begin() which bypasses the pool().await? initialization/migration guard; use let pool = self.pool().await?; let mut tx = pool.begin().await?; to ensure the schema is ready before issuing queries.
| let mut tx = self.pool.begin().await?; | |
| let pool = self.pool().await?; | |
| let mut tx = pool.begin().await?; |
|
yes I do. been going back and forth on how it exactly should work, let's see what we can do today |
Summary
Implements tool call/response compacting on the fly. does in a task so should not increase latency. also makes sure that all messages actually have a message_id that is written to the db and retrieved