
Summarize old tool calls #6119

Merged
DOsinga merged 16 commits into main from summarize-old-tool-calls
Jan 28, 2026

Conversation

Collaborator

@DOsinga DOsinga commented Dec 15, 2025

Summary

Implements on-the-fly compaction of tool call/response pairs. The compaction runs in a background task, so it should not increase latency. Also makes sure that every message actually has a message_id that is written to the database and read back on retrieval.

@DOsinga DOsinga requested review from Copilot and katzdave and removed request for Copilot December 15, 2025 15:31
Copilot AI review requested due to automatic review settings December 15, 2025 15:33
Contributor

Copilot AI left a comment

Pull request overview

This PR implements automatic tool call/response summarization to manage context size, along with infrastructure improvements for message identification. The summarization runs asynchronously to avoid adding latency to the agent loop.

Key Changes

  • Database schema migration (v7) adds message_id column with index for reliable message tracking
  • New background task automatically summarizes old tool call/response pairs when threshold is exceeded (configurable via GOOSE_TOOL_CALL_CUTOFF)
  • All messages now receive a message_id either explicitly via with_generated_id() or implicitly during database insertion
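The threshold behaviour described in the second bullet can be sketched roughly as follows. This is a simplified stand-in, not the code from the PR: the `Msg` enum, the function name `oldest_tool_id_over_cutoff`, and the plain `usize` cutoff are assumptions made for illustration. The PR's own logic lives in `tool_id_to_summarize` and may count or select pairs differently.

```rust
/// Simplified message model (hypothetical; the real `Message` type is richer).
#[derive(Debug, Clone, PartialEq)]
enum Msg {
    Text(String),
    ToolRequest { tool_id: String },
    ToolResponse { tool_id: String },
}

/// Returns the id of the oldest tool call that already has a response,
/// but only once the number of complete pairs exceeds `cutoff`.
fn oldest_tool_id_over_cutoff(messages: &[Msg], cutoff: usize) -> Option<String> {
    // Ids of tool requests, in conversation order, that have a matching response.
    let complete: Vec<&String> = messages
        .iter()
        .filter_map(|m| match m {
            Msg::ToolRequest { tool_id } => Some(tool_id),
            _ => None,
        })
        .filter(|id| {
            messages
                .iter()
                .any(|m| matches!(m, Msg::ToolResponse { tool_id } if tool_id == *id))
        })
        .collect();

    if complete.len() > cutoff {
        complete.first().map(|id| (*id).clone())
    } else {
        None
    }
}
```

With a cutoff of 1 and two complete pairs, the sketch returns the id of the first pair; a request that has no response yet is never selected.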

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| crates/goose/src/session/session_manager.rs | Adds schema v7 migration for message_id column, implements update_message_metadata for toggling visibility, ensures message_id is set during add_message |
| crates/goose/src/conversation/message.rs | Adds with_generated_id() helper method for creating messages with UUIDs |
| crates/goose/src/context_mgmt/mod.rs | Implements tool call summarization logic: tool_id_to_summarize, summarize_tool_call, and maybe_summarize_tool_pair with comprehensive test coverage |
| crates/goose/src/agents/agent.rs | Integrates async summarization task, adds tool_call_cut_off config, marks summarized messages as agent_invisible, ensures elicitation messages have IDs |
| crates/goose-cli/src/commands/term.rs | Ensures term log messages have generated IDs before database insertion |
| crates/goose/src/providers/api_client.rs | Removes debug logging of LLM request payloads |

Your task is to summarize a tool call & response pair to save tokens

reply with a single message that described what happened. Typically a toolcall
is asks for something using a bunch of parameters and then the result is also some

Copilot AI Dec 15, 2025

Grammar error: "a toolcall is asks" should be "a toolcall asks".

Suggested change
is asks for something using a bunch of parameters and then the result is also some
asks for something using a bunch of parameters and then the result is also some

if let Some(id) = &msg.id {
SessionManager::update_message_metadata(&session_config.id, id, |metadata| {
metadata.with_agent_invisible()
}).await?;

Copilot AI Dec 15, 2025

If msg.id is None, the metadata update will be silently skipped, leaving the database in an inconsistent state where the in-memory conversation has agent_invisible metadata but the database doesn't. This could cause issues after session reload. Consider adding an else branch that logs a warning or ensures all messages have IDs before this point.

Suggested change
}).await?;
}).await?;
} else {
warn!("Message without id encountered when updating metadata; database update skipped. This may cause inconsistency between in-memory and persisted state.");
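
The point of this comment, making the `None` case explicit rather than letting it pass silently, can be illustrated in isolation. The sketch below is not the PR's code: `Msg`, `PersistOutcome`, `mark_invisible`, and the `persist` callback are hypothetical stand-ins, and `eprintln!` takes the place of the `warn!` macro so the example needs no external crates.

```rust
/// Outcome of trying to persist a metadata change (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum PersistOutcome {
    Updated(String),
    SkippedNoId,
}

/// Hypothetical stand-in for a message that may or may not carry an id.
struct Msg {
    id: Option<String>,
}

/// Handles both cases explicitly instead of silently skipping `None`.
/// `persist` stands in for the database call (an assumption, not the PR's API).
fn mark_invisible(msg: &Msg, mut persist: impl FnMut(&str)) -> PersistOutcome {
    match &msg.id {
        Some(id) => {
            persist(id);
            PersistOutcome::Updated(id.clone())
        }
        None => {
            // The real code would log through `warn!`, as in the suggestion above.
            eprintln!("message without id; metadata update not persisted");
            PersistOutcome::SkippedNoId
        }
    }
}
```

Returning an outcome value is one possible design; it lets the caller decide whether a skipped update should be treated as an error.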

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings December 15, 2025 16:18
DOsinga and others added 2 commits December 15, 2025 11:18
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

}

for content in &msg.content {
if let MessageContent::ToolRequest(req) = content {
Collaborator

Should we filter on the response size? Is it worth trying to summarize if it's very small?

Collaborator Author

I tried to measure: an empty tool call/response pair is 300 tokens, while a one-sentence summary ("an ls was executed and three files, one of them a readme, were found") is about 100, so even that saves something.
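
The saving is plain subtraction: 300 - 100 = 200 tokens per pair, or roughly two thirds of the original cost. A minimal helper makes the arithmetic reusable; the function name and parameters are hypothetical, and the sketch ignores the tokens spent on the summarization call itself.

```rust
/// Tokens saved by replacing `pairs` tool call/response pairs with summaries.
/// Figures quoted in the thread: 300 tokens per empty pair, about 100 per summary.
fn tokens_saved(pairs: u64, tokens_per_pair: u64, tokens_per_summary: u64) -> u64 {
    // `saturating_sub` keeps the result at zero when a summary would be longer
    // than the pair it replaces.
    pairs * tokens_per_pair.saturating_sub(tokens_per_summary)
}
```

For example, `tokens_saved(10, 300, 100)` evaluates to 2000.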

}
}
conversation = Conversation::new_unvalidated(updated_messages);
messages_to_add.push(summary_msg);
Collaborator

Is this summary getting added to the end? (rather than where the original tool call was?)

Collaborator Author

Well, sort of. The summary message gets the same timestamp as the tool call, so it should sort into the right position.

Collaborator

ah got it, missed the timestamp sorting.
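
The ordering argument in this thread can be checked with a small sketch. It assumes that messages are ordered by timestamp when the conversation is loaded and that the summarized pair is hidden from the agent; the `Msg` struct, its fields, and `summarize_in_place` are hypothetical simplifications, not the PR's types.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Msg {
    created: i64, // timestamp (hypothetical unit: seconds)
    text: &'static str,
    agent_visible: bool,
}

/// Appends a summary that reuses the tool call's timestamp, hides the original
/// pair, and returns what the agent would see after sorting by timestamp.
fn summarize_in_place(
    mut msgs: Vec<Msg>,
    call_idx: usize,
    resp_idx: usize,
    summary: &'static str,
) -> Vec<Msg> {
    let created = msgs[call_idx].created;
    msgs[call_idx].agent_visible = false;
    msgs[resp_idx].agent_visible = false;
    // The summary is pushed to the end of the list...
    msgs.push(Msg { created, text: summary, agent_visible: true });
    // ...but a stable sort on the timestamp moves it next to the tool call.
    msgs.sort_by_key(|m| m.created);
    msgs.into_iter().filter(|m| m.agent_visible).collect()
}
```

Because `sort_by_key` is stable, the summary lands directly after the hidden tool call, so the agent-visible view keeps the original conversational order.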

Douwe Osinga and others added 4 commits December 15, 2025 20:56
Resolved merge conflicts:
- session_manager.rs: Kept schema version 7 (includes message_id migration)
- agent.rs: Adopted main's execute_command implementation
- agent.rs: Adopted main's with_tool_request_with_metadata for better metadata handling

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Fixed drain_elicitation_messages to use manager instance
- Fixed prepare_reply_context to use proper function signatures
- Fixed apply_migration to use static pool parameter
- Fixed SessionManager::instance() call (no longer async)
- Added session_id parameter to summarize_tool_call chain
- Added task field to CallToolRequestParam in tests
- All clippy checks passing

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 23, 2026 18:35
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

Comment on lines +367 to +374
pub async fn update_message_metadata<F>(id: &str, message_id: &str, f: F) -> Result<()>
where
F: FnOnce(
crate::conversation::message::MessageMetadata,
) -> crate::conversation::message::MessageMetadata,
{
Self::instance()
.storage

Copilot AI Jan 23, 2026

SessionManager::update_message_metadata is an associated function that always uses Self::instance() (global DB), which bypasses any non-global SessionManager (e.g., tests or alternate data dirs) and can update the wrong storage; make it a &self method and call it via the session_manager already in scope.

Suggested change
pub async fn update_message_metadata<F>(id: &str, message_id: &str, f: F) -> Result<()>
where
F: FnOnce(
crate::conversation::message::MessageMetadata,
) -> crate::conversation::message::MessageMetadata,
{
Self::instance()
.storage
pub async fn update_message_metadata<F>(
&self,
id: &str,
message_id: &str,
f: F,
) -> Result<()>
where
F: FnOnce(
crate::conversation::message::MessageMetadata,
) -> crate::conversation::message::MessageMetadata,
{
self.storage
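
Why the `&self` form matters can be shown with an in-memory stand-in. Everything below is an assumption made for illustration: the `HashMap`-backed `SessionManager`, the single-field `MessageMetadata`, the synchronous signature, and the `String` error type are not the PR's definitions. Only the closure-based update shape and `with_agent_invisible` mirror the snippets quoted in this review.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Debug, Clone, Default, PartialEq)]
struct MessageMetadata {
    agent_invisible: bool,
}

impl MessageMetadata {
    fn with_agent_invisible(mut self) -> Self {
        self.agent_invisible = true;
        self
    }
}

/// Hypothetical in-memory stand-in for the session storage.
#[derive(Default)]
struct SessionManager {
    storage: Mutex<HashMap<String, MessageMetadata>>,
}

impl SessionManager {
    /// `&self` method: the caller decides which manager, and so which storage, is updated.
    fn update_message_metadata<F>(&self, message_id: &str, f: F) -> Result<(), String>
    where
        F: FnOnce(MessageMetadata) -> MessageMetadata,
    {
        let mut storage = self.storage.lock().map_err(|e| e.to_string())?;
        let current = storage
            .get(message_id)
            .cloned()
            .ok_or_else(|| format!("unknown message id: {message_id}"))?;
        storage.insert(message_id.to_string(), f(current));
        Ok(())
    }
}
```

Two managers built this way stay independent: updating a message through a test instance leaves any other instance untouched, which is the isolation the comment is asking for.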

use utoipa::ToSchema;

pub const CURRENT_SCHEMA_VERSION: i32 = 6;
pub const CURRENT_SCHEMA_VERSION: i32 = 7;

Copilot AI Jan 23, 2026

CURRENT_SCHEMA_VERSION is bumped to 7, but create_schema() still creates messages without a message_id column, so fresh DBs will be marked v7 yet get_conversation/add_message will query/insert message_id and fail; update CREATE TABLE messages (and initial indexes) to include message_id from the start.

Suggested change
pub const CURRENT_SCHEMA_VERSION: i32 = 7;
pub const CURRENT_SCHEMA_VERSION: i32 = 6;

Comment on lines 1118 to 1122
sqlx::query(
r#"
INSERT INTO messages (session_id, role, content_json, created_timestamp, metadata_json)
VALUES (?, ?, ?, ?, ?)
INSERT INTO messages (message_id, session_id, role, content_json, created_timestamp, metadata_json)
VALUES (?, ?, ?, ?, ?, ?)
"#,

Copilot AI Jan 23, 2026

replace_conversation_inner still inserts rows without message_id, which will leave it NULL after schema v7 and cause messages loaded via get_conversation to have id = None (and make later metadata updates by message_id impossible); update replace_conversation_inner to populate message_id the same way as add_message.
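
One way to guarantee that no row is written with a NULL message_id is to fill in missing ids before the insert loop runs. The sketch below is an assumption-laden illustration: `Message`, `ensure_ids`, and `generate_message_id` are hypothetical, and a process-wide counter replaces the UUIDs that `with_generated_id()` produces so the example needs no external crates.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

static NEXT_ID: AtomicU64 = AtomicU64::new(1);

/// Stand-in id generator. The PR uses UUIDs; a counter keeps this sketch
/// free of external dependencies.
fn generate_message_id() -> String {
    format!("msg_{}", NEXT_ID.fetch_add(1, Ordering::Relaxed))
}

/// Hypothetical, reduced message type.
struct Message {
    id: Option<String>,
    text: String,
}

/// Ensures every message has an id before rows are written, keeping existing ids.
fn ensure_ids(messages: &mut [Message]) {
    for msg in messages.iter_mut() {
        if msg.id.is_none() {
            msg.id = Some(generate_message_id());
        }
    }
}
```

Existing ids are preserved, so messages that were already persisted keep the identifier that later metadata updates rely on.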

crate::conversation::message::MessageMetadata,
) -> crate::conversation::message::MessageMetadata,
{
let mut tx = self.pool.begin().await?;

Copilot AI Jan 23, 2026

update_message_metadata starts a transaction on self.pool directly, bypassing the pool().await? initialization/migration path; use let pool = self.pool().await? before beginning the transaction to ensure the schema is initialized.

Suggested change
let mut tx = self.pool.begin().await?;
let pool = self.pool().await?;
let mut tx = pool.begin().await?;
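
The pattern behind this suggestion is a lazily initialized resource that is only reachable through an accessor. The thread does not show how the real `pool()` is implemented (it is async and backed by sqlx), so the following is a synchronous sketch under stated assumptions: `Pool`, `Storage`, and the migration counter are hypothetical, and `std::sync::OnceLock` stands in for whatever guard the real code uses.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for a connection pool.
struct Pool {
    schema_version: i32,
}

struct Storage {
    pool: OnceLock<Pool>,
    migrations_run: AtomicUsize,
}

impl Storage {
    fn new() -> Self {
        Storage {
            pool: OnceLock::new(),
            migrations_run: AtomicUsize::new(0),
        }
    }

    /// Accessor that initializes the schema before handing out the pool.
    fn pool(&self) -> &Pool {
        self.pool.get_or_init(|| {
            // Stand-in for running migrations up to the current schema version.
            self.migrations_run.fetch_add(1, Ordering::SeqCst);
            Pool { schema_version: 7 }
        })
    }

    /// Goes through `pool()` rather than reading the field directly.
    fn update_message_metadata(&self) -> i32 {
        let pool = self.pool();
        pool.schema_version
    }
}
```

Every caller that goes through `pool()` sees an initialized schema, and the initialization closure runs at most once no matter how many calls are made.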

Comment on lines +1404 to +1414
let current_metadata_json = sqlx::query_scalar::<_, String>(
"SELECT metadata_json FROM messages WHERE message_id = ? AND session_id = ?",
)
.bind(message_id)
.bind(session_id)
.fetch_one(&mut *tx)
.await?;

let current_metadata: crate::conversation::message::MessageMetadata =
serde_json::from_str(&current_metadata_json)?;


Copilot AI Jan 23, 2026

metadata_json is nullable in the schema (and older rows may have NULL), but update_message_metadata fetches it as String, so this will error for NULL rows; fetch Option<String> and treat None as MessageMetadata::default() when applying the update.
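
The NULL handling proposed here boils down to mapping `None` to the default metadata before any parsing happens. In the sketch below, `load_metadata` and the reduced `MessageMetadata` struct are hypothetical, and the `parse` parameter stands in for JSON deserialization so that the example compiles without serde.

```rust
#[derive(Debug, Clone, Default, PartialEq)]
struct MessageMetadata {
    agent_invisible: bool,
}

/// Treats a NULL column (`None`) as default metadata instead of an error.
/// `parse` stands in for the JSON deserialization step.
fn load_metadata<E>(
    raw: Option<&str>,
    parse: impl FnOnce(&str) -> Result<MessageMetadata, E>,
) -> Result<MessageMetadata, E> {
    match raw {
        Some(json) => parse(json),
        None => Ok(MessageMetadata::default()),
    }
}
```

Malformed JSON in a non-NULL column still surfaces as an error; only the absent value falls back to the default.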

Comment on lines +873 to +875
sqlx::query("CREATE INDEX idx_messages_message_id ON messages(message_id)")
.execute(pool)
.await?;

Copilot AI Jan 23, 2026

Migration v7 creates an index on messages(message_id) only, but the new lookup path filters by both session_id and message_id; consider indexing (session_id, message_id) (or making it UNIQUE per-session) to avoid full scans on large histories.

Suggested change
sqlx::query("CREATE INDEX idx_messages_message_id ON messages(message_id)")
.execute(pool)
.await?;
sqlx::query(
"CREATE INDEX idx_messages_message_id ON messages(session_id, message_id)"
)
.execute(pool)
.await?;

Comment on lines +14 to 16
use tracing::log::warn;
use tracing::{debug, info};


Copilot AI Jan 23, 2026

use tracing::log::warn; requires the tracing/log compatibility feature and is inconsistent with other modules using tracing::warn; switch to use tracing::warn; (or use tracing::{debug, info, warn};) to avoid a potential compile failure.

Suggested change
use tracing::log::warn;
use tracing::{debug, info};
use tracing::{debug, info, warn};

Collaborator

zanesq commented Jan 26, 2026

@DOsinga still want to get this in?

Douwe Osinga added 2 commits January 27, 2026 17:32
Copilot AI review requested due to automatic review settings January 28, 2026 13:42
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment on lines 1122 to 1126
sqlx::query(
r#"
INSERT INTO messages (session_id, role, content_json, created_timestamp, metadata_json)
VALUES (?, ?, ?, ?, ?)
INSERT INTO messages (message_id, session_id, role, content_json, created_timestamp, metadata_json)
VALUES (?, ?, ?, ?, ?, ?)
"#,

Copilot AI Jan 28, 2026

replace_conversation_inner still inserts into messages without message_id, so any conversation replacements (including legacy import into a fresh v7 schema) will create rows with NULL message_id and get_conversation will return Message { id: None, .. }; update that insert to include/bind a generated message_id the same way add_message does.

crate::conversation::message::MessageMetadata,
) -> crate::conversation::message::MessageMetadata,
{
let mut tx = self.pool.begin().await?;

Copilot AI Jan 28, 2026

update_message_metadata starts a transaction with self.pool.begin() which bypasses the pool().await? initialization/migration guard; use let pool = self.pool().await?; let mut tx = pool.begin().await?; to ensure the schema is ready before issuing queries.

Suggested change
let mut tx = self.pool.begin().await?;
let pool = self.pool().await?;
let mut tx = pool.begin().await?;

Collaborator Author

DOsinga commented Jan 28, 2026

Yes, I do. I've been going back and forth on exactly how it should work; let's see what we can do today.

@DOsinga DOsinga merged commit 9e31cfd into main Jan 28, 2026
22 of 24 checks passed
@DOsinga DOsinga deleted the summarize-old-tool-calls branch January 28, 2026 16:35

3 participants