Summarize old tool calls by DOsinga · Pull Request #6119 · block/goose

DOsinga · 2025-12-15T15:31:07Z

Summary

Implements tool call/response compacting on the fly. The work runs in a background task, so it should not increase latency. Also makes sure that every message has a message_id that is written to the db and retrieved.

Copilot

Pull request overview

This PR implements automatic tool call/response summarization to manage context size, along with infrastructure improvements for message identification. The summarization runs asynchronously to avoid adding latency to the agent loop.

Key Changes

Database schema migration (v7) adds message_id column with index for reliable message tracking
New background task automatically summarizes old tool call/response pairs when threshold is exceeded (configurable via GOOSE_TOOL_CALL_CUTOFF)
All messages now receive a message_id either explicitly via with_generated_id() or implicitly during database insertion
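
As a rough illustration of the cutoff behavior listed above, the sketch below picks the oldest still-visible tool request once the number of visible tool requests exceeds a threshold. `SketchMsg`, its fields, and `oldest_tool_id_over_cutoff` are hypothetical stand-ins and not the PR's actual `tool_id_to_summarize`; the real selection logic may differ.

```rust
/// Hypothetical, simplified message: only the fields this sketch needs.
#[derive(Debug, Clone)]
struct SketchMsg {
    /// Id of the tool request carried by this message, if any.
    tool_request_id: Option<String>,
    /// True once the message has been hidden from the agent (already summarized).
    agent_invisible: bool,
}

/// Returns the id of the oldest still-visible tool request when the number of
/// visible tool requests exceeds `cutoff`; otherwise returns `None`.
fn oldest_tool_id_over_cutoff(messages: &[SketchMsg], cutoff: usize) -> Option<String> {
    // Collect visible tool request ids in conversation order (oldest first).
    let visible: Vec<&String> = messages
        .iter()
        .filter(|m| !m.agent_invisible)
        .filter_map(|m| m.tool_request_id.as_ref())
        .collect();

    if visible.len() > cutoff {
        visible.first().map(|id| (*id).clone())
    } else {
        None
    }
}
```

Under this assumption, each pass summarizes at most one pair, so the visible count drifts back toward the cutoff over successive turns rather than being compacted all at once.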

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
crates/goose/src/session/session_manager.rs	Adds schema v7 migration for message_id column, implements update_message_metadata for toggling visibility, ensures message_id is set during add_message
crates/goose/src/conversation/message.rs	Adds with_generated_id() helper method for creating messages with UUIDs
crates/goose/src/context_mgmt/mod.rs	Implements tool call summarization logic: tool_id_to_summarize, summarize_tool_call, and maybe_summarize_tool_pair with comprehensive test coverage
crates/goose/src/agents/agent.rs	Integrates async summarization task, adds tool_call_cut_off config, marks summarized messages as agent_invisible, ensures elicitation messages have IDs
crates/goose-cli/src/commands/term.rs	Ensures term log messages have generated IDs before database insertion
crates/goose/src/providers/api_client.rs	Removes debug logging of LLM request payloads

crates/goose/src/context_mgmt/mod.rs

Copilot · 2025-12-15T15:36:38Z

crates/goose/src/context_mgmt/mod.rs

+                Your task is to summarize a tool call & response pair to save tokens
+
+                reply with a single message that described what happened. Typically a toolcall
+                is asks for something using a bunch of parameters and then the result is also some


Grammar error: "a toolcall is asks" should be "a toolcall asks".

Suggested change

-                is asks for something using a bunch of parameters and then the result is also some
+                asks for something using a bunch of parameters and then the result is also some

Copilot · 2025-12-15T15:36:38Z

crates/goose/src/agents/agent.rs

+                            if let Some(id) = &msg.id {
+                                SessionManager::update_message_metadata(&session_config.id, id, |metadata| {
+                                    metadata.with_agent_invisible()
+                                }).await?;


If msg.id is None, the metadata update will be silently skipped, leaving the database in an inconsistent state where the in-memory conversation has agent_invisible metadata but the database doesn't. This could cause issues after session reload. Consider adding an else branch that logs a warning or ensures all messages have IDs before this point.

Suggested change

-                                }).await?;
+                                }).await?;
+                            } else {
+                                warn!("Message without id encountered when updating metadata; database update skipped. This may cause inconsistency between in-memory and persisted state.");


Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

katzdave · 2025-12-15T16:51:56Z

crates/goose/src/context_mgmt/mod.rs

+        }
+
+        for content in &msg.content {
+            if let MessageContent::ToolRequest(req) = content {


Should we filter on the response size? Is it worth bothering trying to summarize if it's very small?

I tried to measure: an empty tool call/response is 300 tokens, and a one-sentence summary (an ls was executed and three files, one of them readme, were found) is about 100, so even that does save something.
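
The rough numbers quoted above can be worked through as a small calculation. This is only a back-of-the-envelope sketch: `tokens_saved` is a hypothetical helper, and it ignores the tokens spent on the summarization request itself.

```rust
/// Tokens saved by replacing `pairs` tool call/response pairs with summaries.
/// Saturating subtraction means a summary larger than the original saves 0.
fn tokens_saved(pairs: u32, tokens_per_pair: u32, tokens_per_summary: u32) -> u32 {
    pairs * tokens_per_pair.saturating_sub(tokens_per_summary)
}
```

With the figures from the comment, `tokens_saved(1, 300, 100)` gives 200 tokens per summarized pair, even for an otherwise empty call.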

katzdave · 2025-12-15T17:02:20Z

crates/goose/src/agents/agent.rs

+                            }
+                        }
+                        conversation = Conversation::new_unvalidated(updated_messages);
+                        messages_to_add.push(summary_msg);


Is this summary getting added to the end? (rather than where the original tool call was?)

well, sort of. the summary message gets the same timestamp as the toolcall, so it should sort the right way

ah got it, missed the timestamp sorting.
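
A minimal sketch of the ordering argument in this thread: if the summary reuses the tool call's timestamp, a stable sort by timestamp places it next to the original call even though it was appended last. `TimedMsg` and `add_and_sort` are hypothetical names; how the PR actually orders messages on load is not shown here.

```rust
/// Hypothetical, simplified message with a creation timestamp.
#[derive(Debug, Clone, PartialEq)]
struct TimedMsg {
    created: i64,
    text: String,
}

/// Appends `summary` to `messages`, then restores chronological order.
/// `sort_by_key` is a stable sort, so messages with equal timestamps keep
/// their relative insertion order.
fn add_and_sort(mut messages: Vec<TimedMsg>, summary: TimedMsg) -> Vec<TimedMsg> {
    messages.push(summary);
    messages.sort_by_key(|m| m.created);
    messages
}
```

Because the sort is stable, the summary lands directly after the tool call it shares a timestamp with, not at the end of the conversation.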

…oose into summarize-old-tool-calls

Resolved merge conflicts:
- session_manager.rs: Kept schema version 7 (includes message_id migration)
- agent.rs: Adopted main's execute_command implementation
- agent.rs: Adopted main's with_tool_request_with_metadata for better metadata handling

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Fixed drain_elicitation_messages to use manager instance
- Fixed prepare_reply_context to use proper function signatures
- Fixed apply_migration to use static pool parameter
- Fixed SessionManager::instance() call (no longer async)
- Added session_id parameter to summarize_tool_call chain
- Added task field to CallToolRequestParam in tests
- All clippy checks passing

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

Copilot · 2026-01-23T18:42:14Z

crates/goose/src/session/session_manager.rs

+    pub async fn update_message_metadata<F>(id: &str, message_id: &str, f: F) -> Result<()>
+    where
+        F: FnOnce(
+            crate::conversation::message::MessageMetadata,
+        ) -> crate::conversation::message::MessageMetadata,
+    {
+        Self::instance()
+            .storage


SessionManager::update_message_metadata is an associated function that always uses Self::instance() (global DB), which bypasses any non-global SessionManager (e.g., tests or alternate data dirs) and can update the wrong storage; make it a &self method and call it via the session_manager already in scope.

Suggested change

-    pub async fn update_message_metadata<F>(id: &str, message_id: &str, f: F) -> Result<()>
-    where
-        F: FnOnce(
-            crate::conversation::message::MessageMetadata,
-        ) -> crate::conversation::message::MessageMetadata,
-    {
-        Self::instance()
-            .storage
+    pub async fn update_message_metadata<F>(
+        &self,
+        id: &str,
+        message_id: &str,
+        f: F,
+    ) -> Result<()>
+    where
+        F: FnOnce(
+            crate::conversation::message::MessageMetadata,
+        ) -> crate::conversation::message::MessageMetadata,
+    {
+        self.storage

Copilot · 2026-01-23T18:42:15Z

crates/goose/src/session/session_manager.rs

 use utoipa::ToSchema;

-pub const CURRENT_SCHEMA_VERSION: i32 = 6;
+pub const CURRENT_SCHEMA_VERSION: i32 = 7;


CURRENT_SCHEMA_VERSION is bumped to 7, but create_schema() still creates messages without a message_id column, so fresh DBs will be marked v7 yet get_conversation/add_message will query/insert message_id and fail; update CREATE TABLE messages (and initial indexes) to include message_id from the start.

Suggested change

-pub const CURRENT_SCHEMA_VERSION: i32 = 7;
+pub const CURRENT_SCHEMA_VERSION: i32 = 6;

Copilot · 2026-01-23T18:42:15Z

crates/goose/src/session/session_manager.rs

        sqlx::query(
            r#"
-            INSERT INTO messages (session_id, role, content_json, created_timestamp, metadata_json)
-            VALUES (?, ?, ?, ?, ?)
+            INSERT INTO messages (message_id, session_id, role, content_json, created_timestamp, metadata_json)
+            VALUES (?, ?, ?, ?, ?, ?)
        "#,


replace_conversation_inner still inserts rows without message_id, which will leave it NULL after schema v7 and cause messages loaded via get_conversation to have id = None (and make later metadata updates by message_id impossible); update replace_conversation_inner to populate message_id the same way as add_message.
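
One way to read this suggestion is as a pre-insert pass that fills in any missing ids. The sketch below assumes a simplified `StoredMsg` type and takes the id generator as a parameter; both are hypothetical, and the PR itself generates UUIDs (via `with_generated_id()` or during insertion) in place of the injected closure used here.

```rust
/// Hypothetical, simplified message as it would be written to storage.
#[derive(Debug, Clone, PartialEq)]
struct StoredMsg {
    id: Option<String>,
    text: String,
}

/// Gives every message without an id a freshly generated one, leaving
/// existing ids untouched. `next_id` is injected so the sketch stays
/// dependency-free; a real implementation would likely generate a UUID.
fn ensure_ids(messages: Vec<StoredMsg>, mut next_id: impl FnMut() -> String) -> Vec<StoredMsg> {
    messages
        .into_iter()
        .map(|mut m| {
            if m.id.is_none() {
                m.id = Some(next_id());
            }
            m
        })
        .collect()
}
```

Running such a pass before the bulk insert would keep rows written during a conversation replacement addressable by message_id, the same as rows written by add_message.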

Copilot · 2026-01-23T18:42:15Z

crates/goose/src/session/session_manager.rs

+            crate::conversation::message::MessageMetadata,
+        ) -> crate::conversation::message::MessageMetadata,
+    {
+        let mut tx = self.pool.begin().await?;


update_message_metadata starts a transaction on self.pool directly, bypassing the pool().await? initialization/migration path; use let pool = self.pool().await? before beginning the transaction to ensure the schema is initialized.

Suggested change

-        let mut tx = self.pool.begin().await?;
+        let pool = self.pool().await?;
+        let mut tx = pool.begin().await?;

Copilot · 2026-01-23T18:42:16Z

crates/goose/src/session/session_manager.rs

+        let current_metadata_json = sqlx::query_scalar::<_, String>(
+            "SELECT metadata_json FROM messages WHERE message_id = ? AND session_id = ?",
+        )
+        .bind(message_id)
+        .bind(session_id)
+        .fetch_one(&mut *tx)
+        .await?;
+
+        let current_metadata: crate::conversation::message::MessageMetadata =
+            serde_json::from_str(&current_metadata_json)?;
+


metadata_json is nullable in the schema (and older rows may have NULL), but update_message_metadata fetches it as String, so this will error for NULL rows; fetch Option<String> and treat None as MessageMetadata::default() when applying the update.
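
The NULL handling suggested here could look roughly like the following. `Metadata` and the toy `parse_metadata` are hypothetical stand-ins so the sketch compiles without dependencies; the PR deserializes `MessageMetadata` with `serde_json`, and the column would be fetched as `Option<String>`.

```rust
/// Hypothetical stand-in for `MessageMetadata`; only one flag is modeled.
#[derive(Debug, Default, Clone, PartialEq)]
struct Metadata {
    agent_invisible: bool,
}

/// Toy parser used only for this sketch; it recognizes three fixed inputs.
fn parse_metadata(raw: &str) -> Result<Metadata, String> {
    match raw.trim() {
        "{\"agent_invisible\":true}" => Ok(Metadata { agent_invisible: true }),
        "{\"agent_invisible\":false}" | "{}" => Ok(Metadata::default()),
        other => Err(format!("unrecognized metadata: {}", other)),
    }
}

/// Treats a NULL column (`None`) as default metadata, not as an error,
/// while still surfacing parse failures for non-NULL values.
fn load_metadata(raw: Option<String>) -> Result<Metadata, String> {
    match raw {
        Some(json) => parse_metadata(&json),
        None => Ok(Metadata::default()),
    }
}
```

The point of the `match` is that only a missing value falls back to the default; malformed JSON in a non-NULL column still returns an error.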

Copilot · 2026-01-23T18:42:16Z

crates/goose/src/session/session_manager.rs

+                sqlx::query("CREATE INDEX idx_messages_message_id ON messages(message_id)")
+                    .execute(pool)
+                    .await?;


Migration v7 creates an index on messages(message_id) only, but the new lookup path filters by both session_id and message_id; consider indexing (session_id, message_id) (or making it UNIQUE per-session) to avoid full scans on large histories.

Suggested change

-                sqlx::query("CREATE INDEX idx_messages_message_id ON messages(message_id)")
-                    .execute(pool)
-                    .await?;
+                sqlx::query(
+                    "CREATE INDEX idx_messages_message_id ON messages(session_id, message_id)"
+                )
+                .execute(pool)
+                .await?;

Copilot · 2026-01-23T18:42:16Z

crates/goose/src/context_mgmt/mod.rs

+use tracing::log::warn;
 use tracing::{debug, info};



use tracing::log::warn; requires the tracing/log compatibility feature and is inconsistent with other modules using tracing::warn; switch to use tracing::warn; (or use tracing::{debug, info, warn};) to avoid a potential compile failure.

Suggested change

-use tracing::log::warn;
-use tracing::{debug, info};
+use tracing::{debug, info, warn};

zanesq · 2026-01-26T19:07:15Z

@DOsinga still want to get this in?

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Copilot · 2026-01-28T13:51:40Z

crates/goose/src/session/session_manager.rs

        sqlx::query(
            r#"
-            INSERT INTO messages (session_id, role, content_json, created_timestamp, metadata_json)
-            VALUES (?, ?, ?, ?, ?)
+            INSERT INTO messages (message_id, session_id, role, content_json, created_timestamp, metadata_json)
+            VALUES (?, ?, ?, ?, ?, ?)
        "#,


replace_conversation_inner still inserts into messages without message_id, so any conversation replacements (including legacy import into a fresh v7 schema) will create rows with NULL message_id and get_conversation will return Message { id: None, .. }; update that insert to include/bind a generated message_id the same way add_message does.

Copilot · 2026-01-28T13:51:40Z

crates/goose/src/session/session_manager.rs

+            crate::conversation::message::MessageMetadata,
+        ) -> crate::conversation::message::MessageMetadata,
+    {
+        let mut tx = self.pool.begin().await?;


update_message_metadata starts a transaction with self.pool.begin() which bypasses the pool().await? initialization/migration guard; use let pool = self.pool().await?; let mut tx = pool.begin().await?; to ensure the schema is ready before issuing queries.

Suggested change

-        let mut tx = self.pool.begin().await?;
+        let pool = self.pool().await?;
+        let mut tx = pool.begin().await?;

DOsinga · 2026-01-28T14:31:35Z

yes I do. been going back and forth on how it exactly should work, let's see what we can do today

Douwe Osinga added 2 commits December 15, 2025 08:03

Summarize old tool calls (1)

fd79442

Require message ids

1fed4e0

DOsinga requested review from Copilot and katzdave and removed request for Copilot December 15, 2025 15:31

Copilot started reviewing on behalf of DOsinga December 15, 2025 15:31

Cutof to 10

e783398

Copilot AI review requested due to automatic review settings December 15, 2025 15:33

Copilot AI reviewed Dec 15, 2025

Lint

0a98a06

Copilot started reviewing on behalf of DOsinga December 15, 2025 16:08

Update crates/goose/src/context_mgmt/mod.rs

8a22705

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings December 15, 2025 16:18

DOsinga and others added 2 commits December 15, 2025 11:18

Update crates/goose/src/context_mgmt/mod.rs

57437bc

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update crates/goose/src/context_mgmt/mod.rs

fce2c41

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI reviewed Dec 15, 2025

Copilot started reviewing on behalf of DOsinga December 15, 2025 16:50

katzdave reviewed Dec 15, 2025

katzdave approved these changes Dec 15, 2025

Douwe Osinga and others added 4 commits December 15, 2025 20:56

Simplify

0d1b13c

Merge branch 'summarize-old-tool-calls' of https://github.com/block/g…

a59eb72

…oose into summarize-old-tool-calls

Copilot AI review requested due to automatic review settings January 23, 2026 18:35

Merge remote-tracking branch 'origin/main' into summarize-old-tool-calls

6d02959

Copilot started reviewing on behalf of DOsinga January 23, 2026 18:36

Remove a comment

57a49f7

Copilot AI reviewed Jan 23, 2026

Merge remote-tracking branch 'origin/main' into summarize-old-tool-calls

7531f4e

Douwe Osinga added 2 commits January 27, 2026 17:32

Format

9f1466d

rust

3764f97

Copilot AI review requested due to automatic review settings January 28, 2026 13:42

Copilot started reviewing on behalf of DOsinga January 28, 2026 13:43

Copilot AI reviewed Jan 28, 2026

DOsinga merged commit 9e31cfd into main Jan 28, 2026
22 of 24 checks passed

DOsinga deleted the summarize-old-tool-calls branch January 28, 2026 16:35

This was referenced Jan 29, 2026

chore(release): release version 1.22.0 (minor) #6812

Closed

chore(release): release version 1.22.0 (minor) #6813

Closed
