perf(core): optimize chat recording and add universal tool truncation (#18668)

mattKorwel wants to merge 3 commits into `main`.
Conversation
Summary of Changes

Hello @mattKorwel, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces significant performance and memory optimizations to the core chat recording and tool execution logic. The primary goal is to prevent out-of-memory (OOM) crashes, especially when dealing with large repositories or extensive tool outputs, by implementing universal tool output truncation and enhancing the efficiency of conversation record management.
Code Review
This pull request introduces valuable performance optimizations by implementing universal tool output truncation and in-memory caching for the chat recording service. These changes should significantly reduce disk I/O and prevent OOM errors with large repositories.
However, I've identified a critical performance issue in the implementation of writeConversation. The logic still performs a double JSON.stringify on every change, which negates some of the performance gains, especially for large conversation objects. I've provided a suggestion to fix this by performing the serialization only once.
```diff
  const currentContent = JSON.stringify(conversation, null, 2);
  if (this.cachedLastConvData !== currentContent) {
    // Only update the timestamp and re-stringify if something actually changed.
    conversation.lastUpdated = new Date().toISOString();
-   const newContent = JSON.stringify(conversation, null, 2);
-   this.cachedLastConvData = newContent;
-   fs.writeFileSync(this.conversationFile, newContent);
+   const finalContent = JSON.stringify(conversation, null, 2);
+   this.cachedLastConvData = finalContent;
+   fs.writeFileSync(this.conversationFile, finalContent);
  }
```
This implementation performs two JSON.stringify operations for every change to the conversation, which can be a significant performance bottleneck for large conversation histories, potentially leading to the OOM issues this PR aims to fix. The first stringify creates currentContent, and if changes are detected, a second stringify creates finalContent after updating the timestamp.
A more performant approach would be to stringify the object once and then replace the timestamp in the resulting string, avoiding the expensive second serialization.
```ts
const currentContent = JSON.stringify(conversation, null, 2);
if (this.cachedLastConvData !== currentContent) {
  // To avoid a second full stringification for a large conversation object,
  // we can replace just the timestamp in the already stringified content.
  // This is significantly more performant.
  const newTimestamp = new Date().toISOString();
  const finalContent = currentContent.replace(
    `"lastUpdated": "${conversation.lastUpdated}"`,
    `"lastUpdated": "${newTimestamp}"`,
  );
  // Update the in-memory object's timestamp to stay in sync.
  conversation.lastUpdated = newTimestamp;
  this.cachedLastConvData = finalContent;
  fs.writeFileSync(this.conversationFile, finalContent);
}
```
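Taken out of context, the replace trick in the suggestion above can be sanity-checked in isolation. The sketch below is illustrative: the `ConversationRecord` shape and the `refreshTimestamp` helper are stand-ins, not the PR's actual types or API.

```typescript
interface ConversationRecord {
  lastUpdated: string;
  messages: string[];
}

// Re-serialize-free timestamp update: patch the already-stringified JSON
// instead of calling JSON.stringify a second time on a large object.
function refreshTimestamp(
  conversation: ConversationRecord,
  currentContent: string,
): string {
  const newTimestamp = new Date().toISOString();
  const finalContent = currentContent.replace(
    `"lastUpdated": "${conversation.lastUpdated}"`,
    `"lastUpdated": "${newTimestamp}"`,
  );
  // Keep the in-memory object in sync with what will be written to disk.
  conversation.lastUpdated = newTimestamp;
  return finalContent;
}

const conversation: ConversationRecord = {
  lastUpdated: '2024-01-01T00:00:00.000Z',
  messages: ['hello'],
};
const serialized = JSON.stringify(conversation, null, 2);
const updated = refreshTimestamp(conversation, serialized);
console.log(JSON.parse(updated).lastUpdated === conversation.lastUpdated); // true
```

One caveat worth flagging: a plain string replace assumes the exact `"lastUpdated": "..."` substring occurs only once in the serialized content; if a message body happened to contain the same text, the wrong occurrence would be patched.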
Size Change: +547 B (0%). Total Size: 23.9 MB.
NTaylorMullen left a comment:
I'd recommend breaking this into two PRs: one for the memory work, one for the functionality change.
```diff
  this.conversationFile = path.join(chatsDir, filename);

- this.writeConversation({
+ const initialConversation: ConversationRecord = {
```
Does this mean that writeConversation now mutates the conversation passed in? If so, we may want to change that behavior; side effects from functions like that typically don't bode well.
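If the mutation is the concern, one option is a copy-on-write variant that stamps a fresh record instead of the caller's object. This is a minimal sketch; `stampConversation` and the record shape are hypothetical names, not the PR's actual API.

```typescript
interface ConversationRecord {
  lastUpdated: string;
  messages: string[];
}

// Side-effect-free alternative: return a freshly stamped shallow copy
// rather than mutating the record the caller passed in.
function stampConversation(conversation: ConversationRecord): ConversationRecord {
  return { ...conversation, lastUpdated: new Date().toISOString() };
}

const original: ConversationRecord = {
  lastUpdated: '2024-01-01T00:00:00.000Z',
  messages: [],
};
const stamped = stampConversation(original);
console.log(original.lastUpdated); // caller's object is unchanged
```

The write path would then serialize `stamped`, leaving the caller free to treat its own record as immutable.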
```ts
const toolName = call.request.name;
const callId = call.request.callId;

if (typeof content === 'string' && toolName === SHELL_TOOL_NAME) {
```
This is a bigger change that I'd recommend writing a test + behavioral eval around.
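As a starting point, the truncation behavior itself is easy to pin down with a unit test. The `truncateOutput` helper below is a hypothetical stand-in for whatever the executor actually calls; the placeholder string is also illustrative, not the PR's real on-disk link format.

```typescript
// Hypothetical truncation helper: outputs over the limit are cut and a
// placeholder note (standing in for the on-disk link) is appended.
function truncateOutput(
  content: string,
  limit: number,
): { content: string; truncated: boolean } {
  if (content.length <= limit) {
    return { content, truncated: false };
  }
  return {
    content:
      content.slice(0, limit) + '\n[output truncated; full content saved to disk]',
    truncated: true,
  };
}

// Small outputs pass through untouched; large ones are cut down.
const small = truncateOutput('ok', 1_000);
const big = truncateOutput('x'.repeat(500_000), 1_000);
console.log(small.truncated, big.truncated); // false true
```

A behavioral eval would then cover the rest: that all tools (not just shell) hit this path, and that the placeholder in history actually points at the offloaded file.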
Summary
This PR introduces critical memory optimizations to the core chat recording and tool execution logic to prevent OOM crashes on large repositories.
Details
Truncation is now applied in `ToolExecutor` to all tools (not just shell). Large outputs are offloaded to disk and replaced with a placeholder link in the history.

Related Issues
Fixes the primary cause of out-of-memory error crashes reported during large-scale repository assessment.
How to Validate
- Run a command that produces a very large output (e.g., `ls` on a root directory with many files).
- Verify that `~/.gemini/tmp/.../chats/` remains a reasonable size.
- Verify that large outputs are written to `~/.gemini/tmp/.../tool-outputs/`.
- Run `npm test -w @google/gemini-cli-core -- src/services/chatRecordingService.test.ts` to ensure core recording logic is intact.

Pre-Merge Checklist