perf(core): optimize chat recording and add universal tool truncation#18668

Closed

mattKorwel wants to merge 3 commits into main from fix/memory-optimizations

Conversation

@mattKorwel
Collaborator

Summary

This PR introduces critical memory optimizations to the core chat recording and tool execution logic to prevent OOM crashes on large repositories.

Details

  • Universal Tool Output Truncation: Extended truncation logic in ToolExecutor to all tools (not just shell). Large outputs are offloaded to disk and replaced with a placeholder link in the history.
  • Optimized ChatRecordingService:
    • Implemented in-memory caching of the conversation record to avoid frequent disk I/O and JSON parsing.
    • Reduced redundant JSON stringification during message updates by checking for changes against a cached string.
    • Added safety checks for partial conversation records during initialization.

Related Issues

Fixes the primary cause of out-of-memory error crashes reported during large-scale repository assessment.

How to Validate

  1. Run the CLI on a large repository.
  2. Trigger tools that return large outputs (e.g., ls on a root directory with many files).
  3. Verify that the session file in ~/.gemini/tmp/.../chats/ remains a reasonable size.
  4. Verify that truncated outputs point to valid files in ~/.gemini/tmp/.../tool-outputs/.
  5. Run npm test -w @google/gemini-cli-core -- src/services/chatRecordingService.test.ts to ensure core recording logic is intact.

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • MacOS
      • npm run

@mattKorwel mattKorwel requested a review from a team as a code owner February 9, 2026 20:02
@gemini-code-assist
Contributor

Summary of Changes

Hello @mattKorwel, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces significant performance and memory optimizations to the core chat recording and tool execution logic. The primary goal is to prevent out-of-memory (OOM) crashes, especially when dealing with large repositories or extensive tool outputs, by implementing universal tool output truncation and enhancing the efficiency of conversation record management.

Highlights

  • Universal Tool Output Truncation: The tool output truncation logic in ToolExecutor has been extended to apply to all tools, not just the shell tool. This ensures that large outputs from any tool are offloaded to disk and replaced with a placeholder link in the conversation history, preventing memory issues.
  • Optimized ChatRecordingService: The ChatRecordingService now includes in-memory caching of the conversation record, significantly reducing frequent disk I/O and JSON parsing operations. This improves performance and reduces memory pressure.
  • Reduced Redundant JSON Stringification: The ChatRecordingService has been optimized to prevent redundant JSON stringification during message updates. It now checks if the conversation content has actually changed against a cached string before re-stringifying and writing to disk, further saving CPU cycles and memory.
Changelog
  • packages/core/src/scheduler/tool-executor.ts
    • Removed the import for SHELL_TOOL_NAME as the truncation logic is no longer specific to the shell tool.
    • Modified the condition for truncating tool output to apply universally to any string content, rather than being restricted to only SHELL_TOOL_NAME.
  • packages/core/src/services/chatRecordingService.ts
    • Added a new private property conversation: ConversationRecord | null to store the conversation record in memory, enabling caching.
    • Updated the initialization logic to populate the conversation cache when resuming an existing session or creating a new one.
    • Modified readConversation to first check the in-memory cache (this.conversation) before attempting to read and parse the conversation record from disk.
    • Ensured that the placeholder empty conversation created when a file doesn't exist is also stored in the in-memory cache.
    • Improved writeConversation to prevent redundant JSON.stringify calls by comparing the current content with cachedLastConvData before updating the lastUpdated timestamp and writing to disk.
Activity
  • No human activity (comments, reviews, etc.) has been recorded on this pull request yet.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces valuable performance optimizations by implementing universal tool output truncation and in-memory caching for the chat recording service. These changes should significantly reduce disk I/O and prevent OOM errors with large repositories.

However, I've identified a critical performance issue in the implementation of writeConversation. The logic still performs a double JSON.stringify on every change, which negates some of the performance gains, especially for large conversation objects. I've provided a suggestion to fix this by performing the serialization only once.

Comment on lines 462 to 469

```ts
const currentContent = JSON.stringify(conversation, null, 2);
if (this.cachedLastConvData !== currentContent) {
  // Only update the timestamp and re-stringify if something actually changed.
  conversation.lastUpdated = new Date().toISOString();
  const finalContent = JSON.stringify(conversation, null, 2);
  this.cachedLastConvData = finalContent;
  fs.writeFileSync(this.conversationFile, finalContent);
}
```

critical

This implementation performs two JSON.stringify operations for every change to the conversation, which can be a significant performance bottleneck for large conversation histories, potentially leading to the OOM issues this PR aims to fix. The first stringify creates currentContent, and if changes are detected, a second stringify creates finalContent after updating the timestamp.

A more performant approach would be to stringify the object once and then replace the timestamp in the resulting string, avoiding the expensive second serialization.

```ts
const currentContent = JSON.stringify(conversation, null, 2);
if (this.cachedLastConvData !== currentContent) {
  // To avoid a second full stringification for a large conversation object,
  // we can replace just the timestamp in the already stringified content.
  // This is significantly more performant.
  const newTimestamp = new Date().toISOString();
  const finalContent = currentContent.replace(
    `"lastUpdated": "${conversation.lastUpdated}"`,
    `"lastUpdated": "${newTimestamp}"`
  );

  // Update the in-memory object's timestamp to stay in sync.
  conversation.lastUpdated = newTimestamp;

  this.cachedLastConvData = finalContent;
  fs.writeFileSync(this.conversationFile, finalContent);
}
```
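
One caveat worth noting about this string-replacement technique: it only works if the serialized `"lastUpdated"` field appears exactly as `JSON.stringify(..., null, 2)` emits it and the old timestamp value is unique in the document. A quick sanity check, using a made-up record shape:

```typescript
// Verify that replacing the timestamp inside the pretty-printed JSON
// yields the same bytes as re-stringifying the updated object.
const conversation = {
  sessionId: 's1',
  messages: ['hello'],
  lastUpdated: '2026-02-09T20:00:00.000Z',
};

const currentContent = JSON.stringify(conversation, null, 2);
const newTimestamp = '2026-02-09T20:05:00.000Z';
const replaced = currentContent.replace(
  `"lastUpdated": "${conversation.lastUpdated}"`,
  `"lastUpdated": "${newTimestamp}"`,
);
conversation.lastUpdated = newTimestamp;
const restringified = JSON.stringify(conversation, null, 2);
// replaced and restringified should be byte-identical when the field is unique.
```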

@github-actions

github-actions bot commented Feb 9, 2026

Size Change: +547 B (0%)

Total Size: 23.9 MB

| Filename | Size | Change |
| --- | --- | --- |
| ./bundle/gemini.js | 23.9 MB | +547 B (0%) |
| ./bundle/sandbox-macos-permissive-closed.sb | 1.03 kB | 0 B |
| ./bundle/sandbox-macos-permissive-open.sb | 890 B | 0 B |
| ./bundle/sandbox-macos-permissive-proxied.sb | 1.31 kB | 0 B |
| ./bundle/sandbox-macos-restrictive-closed.sb | 3.29 kB | 0 B |
| ./bundle/sandbox-macos-restrictive-open.sb | 3.36 kB | 0 B |
| ./bundle/sandbox-macos-restrictive-proxied.sb | 3.56 kB | 0 B |


@gemini-cli gemini-cli bot added the status/need-issue Pull requests that need to have an associated issue. label Feb 9, 2026
Collaborator

@NTaylorMullen NTaylorMullen left a comment


I'd recommend breaking this into two PRs: one for the memory optimizations and one for the functionality change.

```diff
  this.conversationFile = path.join(chatsDir, filename);

- this.writeConversation({
+ const initialConversation: ConversationRecord = {
```

Does this mean that writeConversation now mutates the conversation passed in? If so, we may want to change that behavior: side effects like that from a write function typically don't bode well.
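
If the maintainers do want to remove that side effect, one option (a sketch, not the actual service code) is to derive a new record instead of mutating the argument:

```typescript
interface ConversationRecord {
  lastUpdated: string;
  messages: unknown[];
}

// Return a copy with a fresh timestamp; the caller's object is untouched.
function withFreshTimestamp(
  conversation: ConversationRecord,
): ConversationRecord {
  return { ...conversation, lastUpdated: new Date().toISOString() };
}
```

The write path would then serialize and cache the returned copy, so the record passed in stays immutable from the caller's point of view.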

```diff
  const toolName = call.request.name;
  const callId = call.request.callId;

- if (typeof content === 'string' && toolName === SHELL_TOOL_NAME) {
```

This is a bigger change that I'd recommend writing a test + behavioral eval around.
