Improve system message token accounting in compaction and context validation #8955
Conversation
✅ Review Complete

This is a solid PR that addresses an important issue with context length accounting. The changes systematically improve token accounting by including system messages and adding a safety buffer. Here's my review:

✅ Strengths
Reviewed the PR changes. No documentation updates needed. Reasoning: the improvements to token accounting and system message handling are implementation details that enhance correctness without affecting how users interact with the CLI.
Commits:
- Add required systemMessage parameter to all processStreamingResponse calls in the test file to match the updated interface signature. (Co-authored-by: nate <nate@continue.dev>)
- Reorder imports to comply with ESLint import/order rules: move vitest imports after core imports, add an empty line between import groups, and move the services import before local imports. (Co-authored-by: nate <nate@continue.dev>)
- ESLint import/order rule requires a blank line between different import groups (parent vs sibling). (Co-authored-by: nate <nate@continue.dev>)
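For reference, the blank-line requirement the last commit cites looks like this in practice (the file paths are illustrative, not the repo's actual modules):

```ts
// eslint-plugin-import's `import/order` treats parent ("../") and sibling
// ("./") imports as separate groups; a blank line must separate them.
import { services } from "../services/index.js";

import { handleAutoCompaction } from "./streamChatResponse.compactionHelpers.js";
```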
4 issues found across 10 files
Prompt for AI agents (all 4 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts">
<violation number="1" location="extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts:109">
P1: The post-tool “force compaction” path never runs when overflow is caused solely by the large system message, because handleAutoCompaction short-circuits before compaction whenever `shouldAutoCompact` is false for the raw history. This makes the new overflow handling throw even though compaction could have resolved it.</violation>
<violation number="2" location="extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts:119">
P1: Successful post-tool compaction still throws whenever `services.chatHistory` is unavailable, because the code conflates “no service” with “compaction failed” instead of falling back to the returned compacted history.</violation>
</file>
<file name="extensions/cli/src/stream/streamChatResponse.ts">
<violation number="1" location="extensions/cli/src/stream/streamChatResponse.ts:470">
P2: `handlePostToolValidation` throws whenever the ChatHistory service is unavailable, so the new call causes forced post-tool compaction to fail in headless contexts that previously worked.</violation>
</file>
<file name="extensions/cli/src/compaction.ts">
<violation number="1" location="extensions/cli/src/compaction.ts:68">
P2: `systemMessageTokens` is already counted inside `historyForCompaction`, so subtracting it again artificially shrinks the available token budget and causes unnecessary pruning of user history.</violation>
</file>
```ts
  systemMessage,
});

if (wasCompacted && chatHistorySvc) {
```
P1: Successful post-tool compaction still throws whenever services.chatHistory is unavailable, because the code conflates “no service” with “compaction failed” instead of falling back to the returned compacted history.
File context (extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts, line 119):

```diff
@@ -0,0 +1,197 @@
+    systemMessage,
+  });
+
+  if (wasCompacted && chatHistorySvc) {
+    chatHistorySvc.setHistory(compactedHistory);
+    chatHistory = chatHistorySvc.getHistory();
```
✅ Addressed in 14dd72e
```ts
});

// Force compaction (compaction now accounts for system message during pruning)
const { wasCompacted, chatHistory: compactedHistory } =
```
P1: The post-tool “force compaction” path never runs when overflow is caused solely by the large system message, because handleAutoCompaction short-circuits before compaction whenever shouldAutoCompact is false for the raw history. This makes the new overflow handling throw even though compaction could have resolved it.
File context (extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts, line 109):

```diff
@@ -0,0 +1,197 @@
+  });
+
+  // Force compaction (compaction now accounts for system message during pruning)
+  const { wasCompacted, chatHistory: compactedHistory } =
+    await handleAutoCompaction(chatHistory, model, llmApi, {
+      isHeadless,
```
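One way to address this finding, sketched with assumed names and shapes rather than the repo's actual API: give the threshold check a force escape hatch and include system-message tokens in the total, so a large system prompt alone can trigger compaction.

```ts
// Sketch only: every name here (CompactionCheckOptions, shouldCompact,
// the `force` flag) is an assumption for illustration, not the repo's API.
interface CompactionCheckOptions {
  force?: boolean; // hypothetical flag for the post-tool overflow path
}

function shouldCompact(
  historyTokens: number,
  systemMessageTokens: number,
  contextLimit: number,
  options: CompactionCheckOptions = {},
): boolean {
  // Count the system message too, so overflow caused solely by a large
  // system prompt is still visible to the threshold check.
  const totalInput = historyTokens + systemMessageTokens;
  const threshold = contextLimit * 0.8; // the PR's 80% auto-compact threshold
  return options.force === true || totalInput > threshold;
}

// A 200k-token system prompt alone overflows a 128k context, so the
// check fires even with an empty history.
console.log(shouldCompact(0, 200_000, 128_000)); // true
```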
```ts
  }
}
// After tool execution, validate that we haven't exceeded context limit
chatHistory = await handlePostToolValidation(toolCalls, chatHistory, {
```
P2: handlePostToolValidation throws whenever the ChatHistory service is unavailable, so the new call causes forced post-tool compaction to fail in headless contexts that previously worked.
File context (extensions/cli/src/stream/streamChatResponse.ts, line 470):

```diff
@@ -447,33 +466,29 @@ export async function streamChatResponse(
-      }
-    }
+    // After tool execution, validate that we haven't exceeded context limit
+    chatHistory = await handlePostToolValidation(toolCalls, chatHistory, {
+      model,
+      llmApi,
```
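A minimal sketch of the fallback the reviewers ask for, with the service interface assumed for illustration: use the ChatHistory service when it is registered, otherwise keep the locally compacted history instead of throwing.

```ts
// Sketch only: the HistoryService shape is an assumption, not the repo's API.
interface HistoryService<T> {
  setHistory(history: T[]): void;
  getHistory(): T[];
}

function applyCompactedHistory<T>(
  compacted: T[],
  service?: HistoryService<T>,
): T[] {
  if (service) {
    // Service available: keep it as the source of truth.
    service.setHistory(compacted);
    return service.getHistory();
  }
  // Headless fallback: no service registered, use the in-memory result.
  return [...compacted];
}
```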
extensions/cli/src/compaction.ts (outdated)

```ts
// Account for system message AND safety buffer
const SAFETY_BUFFER = 100;
const availableForInput =
  contextLimit - reservedForOutput - systemMessageTokens - SAFETY_BUFFER;
```
P2: systemMessageTokens is already counted inside historyForCompaction, so subtracting it again artificially shrinks the available token budget and causes unnecessary pruning of user history.
File context (extensions/cli/src/compaction.ts, line 68):

```diff
@@ -56,7 +61,11 @@ export async function compactChatHistory(
+  // Account for system message AND safety buffer
+  const SAFETY_BUFFER = 100;
+  const availableForInput =
+    contextLimit - reservedForOutput - systemMessageTokens - SAFETY_BUFFER;
   // Check if we need to prune to fit within context
```
✅ Addressed in 14dd72e
4 issues found across 11 files
Prompt for AI agents (all 4 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="extensions/cli/src/stream/streamChatResponse.systemMessage.test.ts">
<violation number="1" location="extensions/cli/src/stream/streamChatResponse.systemMessage.test.ts:197">
P2: This test asserts that `processStreamingResponse` rejects even though the total tokens exactly equal the context limit with the safety buffer, so validation should succeed. Change the test to expect a successful response (and mock `chatCompletionStream`) instead of `.rejects.toThrow()`.</violation>
</file>
<file name="extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts">
<violation number="1" location="extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts:119">
P1: Compaction after tool calls immediately throws whenever the chat history service is unavailable, even if compaction succeeded; use the returned compactedHistory as a fallback instead of treating it as a failure.</violation>
</file>
<file name="extensions/cli/src/compaction.ts">
<violation number="1" location="extensions/cli/src/compaction.ts:68">
P2: System message tokens are already counted inside `historyForCompaction`, so subtracting `systemMessageTokens` again shrinks the available input budget by an entire system prompt and causes compaction to over‑prune. Remove the extra subtraction so only the safety buffer is reserved.</violation>
</file>
<file name="extensions/cli/src/util/tokenizer.ts">
<violation number="1" location="extensions/cli/src/util/tokenizer.ts:80">
P2: `toolState.output` names are never sent to the model, so counting `encode(item.name)` overestimates input tokens and causes unnecessary compaction/validation failures.</violation>
</file>
```ts
      abortController,
      systemMessage,
    }),
  ).rejects.toThrow(); // Will fail because we can't prune enough
```
P2: This test asserts that processStreamingResponse rejects even though the total tokens exactly equal the context limit with the safety buffer, so validation should succeed. Change the test to expect a successful response (and mock chatCompletionStream) instead of .rejects.toThrow().
File context (extensions/cli/src/stream/streamChatResponse.systemMessage.test.ts, line 197):

```diff
@@ -0,0 +1,199 @@
+        abortController,
+        systemMessage,
+      }),
+    ).rejects.toThrow(); // Will fail because we can't prune enough
+  });
+});
```
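A sketch of the expectation the reviewer describes, with the test file's fixtures stubbed as placeholders; apart from the vitest API, every name here is assumed:

```ts
import { expect, it, vi } from "vitest";

// Placeholders standing in for the real test file's setup.
declare const processStreamingResponse: (
  args: Record<string, unknown>,
) => Promise<unknown>;
declare const chatHistory: unknown[];
declare const model: unknown;
declare const systemMessage: string;

it("resolves when input tokens exactly fit the context budget", async () => {
  const llmApi = {
    // Assumed mock: yields one chunk and completes, so the success path
    // has a stream to consume.
    chatCompletionStream: vi.fn(async function* () {
      yield { choices: [{ delta: { content: "ok" } }] };
    }),
  };
  await expect(
    processStreamingResponse({
      chatHistory,
      model,
      llmApi,
      abortController: new AbortController(),
      systemMessage,
    }),
  ).resolves.toBeDefined();
});
```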
```ts
if (wasCompacted && chatHistorySvc) {
  chatHistorySvc.setHistory(compactedHistory);
  chatHistory = chatHistorySvc.getHistory();

  // Verify compaction brought us under the limit
  const postCompactionValidation = validateContextLength(
    [postToolSystemItem, ...compactedHistory],
    model,
    SAFETY_BUFFER,
  );

  if (!postCompactionValidation.isValid) {
    logger.error(
      "Compaction failed to bring context under limit, stopping execution",
      {
        inputTokens: postCompactionValidation.inputTokens,
        contextLimit: postCompactionValidation.contextLimit,
      },
    );
    throw new Error(
      `Context limit exceeded even after compaction: ${postCompactionValidation.error}`,
    );
  }

  logger.info("Successfully compacted after tool overflow", {
    inputTokens: postCompactionValidation.inputTokens,
    contextLimit: postCompactionValidation.contextLimit,
  });
} else {
  // Compaction failed, cannot continue
  logger.error("Failed to compact history after tool execution overflow");
  throw new Error(
```
P1: Compaction after tool calls immediately throws whenever the chat history service is unavailable, even if compaction succeeded; use the returned compactedHistory as a fallback instead of treating it as a failure.
File context (extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts, line 119):

```diff
@@ -0,0 +1,197 @@
+    systemMessage,
+  });
+
+  if (wasCompacted && chatHistorySvc) {
+    chatHistorySvc.setHistory(compactedHistory);
+    chatHistory = chatHistorySvc.getHistory();
```
Suggested change:

```ts
if (wasCompacted) {
  if (chatHistorySvc) {
    chatHistorySvc.setHistory(compactedHistory);
    chatHistory = chatHistorySvc.getHistory();
  } else {
    chatHistory = [...compactedHistory];
  }
  // Verify compaction brought us under the limit
  const postCompactionValidation = validateContextLength(
    [postToolSystemItem, ...chatHistory],
    model,
    SAFETY_BUFFER,
  );
  if (!postCompactionValidation.isValid) {
    logger.error(
      "Compaction failed to bring context under limit, stopping execution",
      {
        inputTokens: postCompactionValidation.inputTokens,
        contextLimit: postCompactionValidation.contextLimit,
      },
    );
    throw new Error(
      `Context limit exceeded even after compaction: ${postCompactionValidation.error}`,
    );
  }
  logger.info("Successfully compacted after tool overflow", {
    inputTokens: postCompactionValidation.inputTokens,
    contextLimit: postCompactionValidation.contextLimit,
  });
} else {
  // Compaction failed, cannot continue
  logger.error("Failed to compact history after tool execution overflow");
  throw new Error(
    "Context limit exceeded and compaction failed. Unable to continue.",
  );
}
```
✅ Addressed in 14dd72e
extensions/cli/src/compaction.ts (outdated)

```ts
// Account for system message AND safety buffer
const SAFETY_BUFFER = 100;
const availableForInput =
  contextLimit - reservedForOutput - systemMessageTokens - SAFETY_BUFFER;
```
P2: System message tokens are already counted inside historyForCompaction, so subtracting systemMessageTokens again shrinks the available input budget by an entire system prompt and causes compaction to over‑prune. Remove the extra subtraction so only the safety buffer is reserved.
File context (extensions/cli/src/compaction.ts, line 68):

```diff
@@ -56,7 +61,11 @@ export async function compactChatHistory(
+  // Account for system message AND safety buffer
+  const SAFETY_BUFFER = 100;
+  const availableForInput =
+    contextLimit - reservedForOutput - systemMessageTokens - SAFETY_BUFFER;
   // Check if we need to prune to fit within context
```
Suggested change:

```ts
  contextLimit - reservedForOutput - SAFETY_BUFFER;
```
✅ Addressed in 14dd72e
extensions/cli/src/util/tokenizer.ts (outdated)

```ts
if (item.content) {
  tokenCount += encode(item.content).length;
}
if (item.name) {
```
P2: toolState.output names are never sent to the model, so counting encode(item.name) overestimates input tokens and causes unnecessary compaction/validation failures.
File context (extensions/cli/src/util/tokenizer.ts, line 80):

```diff
@@ -22,6 +22,69 @@ export function getModelContextLimit(model: ModelConfig): number {
+      if (item.content) {
+        tokenCount += encode(item.content).length;
+      }
+      if (item.name) {
+        tokenCount += encode(item.name).length;
+      }
```
✅ Addressed in 14dd72e
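The fix shipped in 14dd72e removes the name from the count. A sketch of the adjusted counting, assuming a gpt-tokenizer-style encode function and a simplified output-item shape:

```ts
import { encode } from "gpt-tokenizer"; // assumed tokenizer dependency

interface ToolOutputItem {
  content?: string;
  name?: string; // display label only; never serialized into the prompt
}

function countToolOutputTokens(items: ToolOutputItem[]): number {
  let tokenCount = 0;
  for (const item of items) {
    if (item.content) {
      tokenCount += encode(item.content).length;
    }
    // item.name is intentionally not counted: it is not sent to the model.
  }
  return tokenCount;
}
```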
3 issues found across 11 files
Prompt for AI agents (all 3 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts">
<violation number="1" location="extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts:119">
P1: Forced post-tool compaction is treated as a failure whenever `services.chatHistory` is unavailable, causing this path to throw even though compaction succeeded and a valid in-memory history exists. Provide a fallback that updates `chatHistory` locally when the service cannot be used instead of throwing.</violation>
</file>
<file name="extensions/cli/src/compaction.ts">
<violation number="1" location="extensions/cli/src/compaction.ts:67">
P2: System message tokens are removed twice, causing over-aggressive compaction and loss of useful history.</violation>
</file>
<file name="extensions/cli/src/stream/streamChatResponse.ts">
<violation number="1" location="extensions/cli/src/stream/streamChatResponse.ts:480">
P1: `handleNormalAutoCompaction` drops the compacted history whenever the chat-history service is unavailable, so the 80% safety compaction no longer runs in those environments.</violation>
</file>
```ts
if (wasCompacted && chatHistorySvc) {
  chatHistorySvc.setHistory(compactedHistory);
  chatHistory = chatHistorySvc.getHistory();

  // Verify compaction brought us under the limit
  const postCompactionValidation = validateContextLength(
    [postToolSystemItem, ...compactedHistory],
    model,
    SAFETY_BUFFER,
  );
```
P1: Forced post-tool compaction is treated as a failure whenever services.chatHistory is unavailable, causing this path to throw even though compaction succeeded and a valid in-memory history exists. Provide a fallback that updates chatHistory locally when the service cannot be used instead of throwing.
File context (extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts, line 119):

```diff
@@ -0,0 +1,197 @@
+    systemMessage,
+  });
+
+  if (wasCompacted && chatHistorySvc) {
+    chatHistorySvc.setHistory(compactedHistory);
+    chatHistory = chatHistorySvc.getHistory();
```
Suggested change:

```ts
if (wasCompacted) {
  if (
    typeof chatHistorySvc?.isReady === "function" &&
    chatHistorySvc.isReady()
  ) {
    chatHistorySvc.setHistory(compactedHistory);
    chatHistory = chatHistorySvc.getHistory();
  } else {
    chatHistory = [...compactedHistory];
  }
  // Verify compaction brought us under the limit
  const postCompactionValidation = validateContextLength(
    [postToolSystemItem, ...chatHistory],
    model,
    SAFETY_BUFFER,
  );
```
✅ Addressed in 14dd72e
extensions/cli/src/compaction.ts (outdated)

```ts
const availableForInput =
  contextLimit - reservedForOutput - systemMessageTokens - SAFETY_BUFFER;
```
P2: System message tokens are removed twice, causing over-aggressive compaction and loss of useful history.
File context (extensions/cli/src/compaction.ts, line 67):

```diff
@@ -56,7 +61,11 @@ export async function compactChatHistory(
+
+  // Account for system message AND safety buffer
+  const SAFETY_BUFFER = 100;
+  const availableForInput =
+    contextLimit - reservedForOutput - systemMessageTokens - SAFETY_BUFFER;
```
Suggested change:

```ts
const hasSystemMessage = historyToUse.some(
  (item) => item.message.role === "system",
);
const reservedForSystem = hasSystemMessage ? 0 : systemMessageTokens;
const availableForInput =
  contextLimit - reservedForOutput - reservedForSystem - SAFETY_BUFFER;
```
✅ Addressed in 14dd72e
```ts
});

// Normal auto-compaction check at 80% threshold
chatHistory = await handleNormalAutoCompaction(
```
P1: handleNormalAutoCompaction drops the compacted history whenever the chat-history service is unavailable, so the 80% safety compaction no longer runs in those environments.
File context (extensions/cli/src/stream/streamChatResponse.ts, line 480):

```diff
@@ -447,33 +466,29 @@ export async function streamChatResponse(
+    });
+
+    // Normal auto-compaction check at 80% threshold
+    chatHistory = await handleNormalAutoCompaction(
+      chatHistory,
+      shouldContinue,
```
1 issue found across 11 files
Prompt for AI agents (all 1 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts">
<violation number="1" location="extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts:119">
P2: Post-tool compaction now throws whenever chatHistorySvc isn’t available even though handleAutoCompaction already returned compactedHistory; this makes tool-overflow recovery fail in headless/early cases where the service isn’t ready. Treat the service as optional: update it when available, fall back to the locally compacted history, and only throw when compaction itself failed.</violation>
</file>
```ts
if (wasCompacted && chatHistorySvc) {
  chatHistorySvc.setHistory(compactedHistory);
  chatHistory = chatHistorySvc.getHistory();

  // Verify compaction brought us under the limit
  const postCompactionValidation = validateContextLength(
    [postToolSystemItem, ...compactedHistory],
    model,
    SAFETY_BUFFER,
  );

  if (!postCompactionValidation.isValid) {
    logger.error(
      "Compaction failed to bring context under limit, stopping execution",
      {
        inputTokens: postCompactionValidation.inputTokens,
        contextLimit: postCompactionValidation.contextLimit,
      },
    );
    throw new Error(
      `Context limit exceeded even after compaction: ${postCompactionValidation.error}`,
    );
  }

  logger.info("Successfully compacted after tool overflow", {
    inputTokens: postCompactionValidation.inputTokens,
    contextLimit: postCompactionValidation.contextLimit,
  });
} else {
  // Compaction failed, cannot continue
  logger.error("Failed to compact history after tool execution overflow");
  throw new Error(
    "Context limit exceeded and compaction failed. Unable to continue.",
  );
}
```
P2: Post-tool compaction now throws whenever chatHistorySvc isn’t available even though handleAutoCompaction already returned compactedHistory; this makes tool-overflow recovery fail in headless/early cases where the service isn’t ready. Treat the service as optional: update it when available, fall back to the locally compacted history, and only throw when compaction itself failed.
File context (extensions/cli/src/stream/streamChatResponse.compactionHelpers.ts, line 119):

```diff
@@ -0,0 +1,197 @@
+    systemMessage,
+  });
+
+  if (wasCompacted && chatHistorySvc) {
+    chatHistorySvc.setHistory(compactedHistory);
+    chatHistory = chatHistorySvc.getHistory();
```
Suggested change:

```ts
if (wasCompacted) {
  if (chatHistorySvc) {
    chatHistorySvc.setHistory(compactedHistory);
    chatHistory = chatHistorySvc.getHistory();
  } else {
    chatHistory = [...compactedHistory];
  }
  // Verify compaction brought us under the limit
  const postCompactionValidation = validateContextLength(
    [postToolSystemItem, ...chatHistory],
    model,
    SAFETY_BUFFER,
  );
  if (!postCompactionValidation.isValid) {
    logger.error(
      "Compaction failed to bring context under limit, stopping execution",
      {
        inputTokens: postCompactionValidation.inputTokens,
        contextLimit: postCompactionValidation.contextLimit,
      },
    );
    throw new Error(
      `Context limit exceeded even after compaction: ${postCompactionValidation.error}`,
    );
  }
  logger.info("Successfully compacted after tool overflow", {
    inputTokens: postCompactionValidation.inputTokens,
    contextLimit: postCompactionValidation.contextLimit,
  });
} else {
  // Compaction failed, cannot continue
  logger.error("Failed to compact history after tool execution overflow");
  throw new Error(
    "Context limit exceeded and compaction failed. Unable to continue.",
  );
}
```
✅ Addressed in 14dd72e
Address code review feedback:
1. Add a fallback to use the compacted history directly when ChatHistoryService is unavailable (fixes the P1 issue in handlePostToolValidation and handleNormalAutoCompaction).
2. Prevent double-counting of system message tokens by checking whether the system message is already in history before subtracting its tokens.
3. Remove the tool output name from the token count, since it is not sent to the model.

This ensures compaction works correctly in headless contexts and provides more accurate token estimation. Co-authored-by: nate <nate@continue.dev>
Thanks @cubic-dev-ai for the detailed review! I've addressed the issues. All fixes are in commit 14dd72e.
🎉 This PR is included in version 1.33.1 🎉 The release is available on: Your semantic-release bot 📦🚀
🎉 This PR is included in version 1.8.0 🎉 The release is available on: Your semantic-release bot 📦🚀
🎉 This PR is included in version 1.37.0 🎉 The release is available on: Your semantic-release bot 📦🚀
🎉 This PR is included in version 1.7.0 🎉 The release is available on: Your semantic-release bot 📦🚀
Summary by cubic
Improved token accounting by including the system message in compaction and context validation, with a safety buffer, to prevent context overflows, especially after tool calls. Also streamlined auto-compaction with helper utilities and updated APIs for clearer options.
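As a rough worked example of that accounting (constants are illustrative, not the repo's defaults): the input budget reserves output space, the fixed safety buffer, and the system message only when it is not already counted in the history, per the review fix above.

```ts
// Example numbers only; the formula mirrors the reviewed compaction.ts code.
const contextLimit = 128_000; // model context window (illustrative)
const reservedForOutput = 4_096; // tokens kept free for the response (illustrative)
const systemMessageTokens = 1_500; // measured system prompt size (illustrative)
const SAFETY_BUFFER = 100; // fixed margin added by this PR

// Subtract the system message only when the history does not already
// contain it; this is the double-counting fix from the review.
const historyIncludesSystemMessage = false;
const reservedForSystem = historyIncludesSystemMessage
  ? 0
  : systemMessageTokens;

const availableForInput =
  contextLimit - reservedForOutput - reservedForSystem - SAFETY_BUFFER;
console.log(availableForInput); // 122304
```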
New Features
Migration
Written for commit 14dd72e. Summary will update automatically on new commits.