Fix pre-turn compaction context handling and remote compact edge cases#11234
Fix pre-turn compaction context handling and remote compact edge cases#11234charley-oai wants to merge 80 commits intomainfrom
Conversation
|
@codex review this |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3dc9adeb03
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
c6ba898 to
f596b18
Compare
|
@codex review this |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b23ccb549b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| @@ -356,24 +356,25 @@ async fn compact_resume_and_fork_preserve_model_history_view() { | |||
| "model": expected_model, | |||
There was a problem hiding this comment.
- Old expected order was roughly:
- permissions -> user_instructions -> environment_context -> last_real_user -> summary -> new_user
- New expected order is:
- last_real_user -> summary -> permissions -> user_instructions -> environment_context -> new_user
So the tests now expect:
- The compaction summary to stay attached to the compacted user history (right after the last real pre-
compaction user message). - Turn context reseeding to be placed above the next user message, not hoisted to the very front.
| let rollout_item = RolloutItem::Compacted(CompactedItem { | ||
| message: summary_text.clone(), | ||
| replacement_history: None, | ||
| replacement_history: Some(sess.clone_history().await.raw_items().to_vec()), |
There was a problem hiding this comment.
aligning more with remote compaction behavior
75f3b0d to
9e4233e
Compare
|
@codex review this |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f382ce9073
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| responses::ev_assistant_message("m1", "MANUAL_COMPACT_SUMMARY"), | ||
| responses::ev_completed_with_tokens("r1", 200), | ||
| ]); | ||
| responses::mount_sse_sequence(&server, vec![sse]).await; |
There was a problem hiding this comment.
This test now exercises the empty-history /compact path, which is intentionally a no-op at the
model/API layer.
- In codex-rs/app-server/tests/suite/v2/compaction.rs:202,
thread_compact_start_triggers_compaction_and_returns_empty_response starts a fresh thread with no user-turn boundary. - With current behavior, compaction emits started/completed thread items but does not send a /v1/responses
request. - The deleted block (mount_sse_sequence) created a strict mock expecting one SSE request; since no request
is sent, Wiremock fails on drop (“expected 1, got 0”), which is exactly the CI failure you pasted.
So it was removed to align the test with the new semantics: empty-start /compact should no-op remotely, not
hit the responses API.
|
@codex review this |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: df193389e4
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review this |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0ce2420ad0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review this |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7eac781d90
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
7eac781 to
b774ea0
Compare
fbd8bc6 to
dc7ec6e
Compare
b774ea0 to
6a3bffe
Compare
|
@codex review this |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6a3bffe31c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
edb5abd to
66f3861
Compare
|
@codex review this |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ac22cd3093
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review this |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b66c6e63fd
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review this |
|
@codex review this |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ca84df34b8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
codex-rs/core/src/compact_remote.rs
Outdated
| new_history | ||
| .iter() | ||
| .rposition(|candidate| match (candidate, incoming_item) { | ||
| ( | ||
| ResponseItem::Message { | ||
| role: candidate_role, | ||
| content: candidate_content, | ||
| .. | ||
| }, | ||
| ResponseItem::Message { | ||
| role: incoming_role, | ||
| content: incoming_content, | ||
| .. | ||
| }, | ||
| ) => candidate_role == incoming_role && candidate_content == incoming_content, | ||
| ( |
There was a problem hiding this comment.
Preserve historical duplicates when stripping incoming echoes
remove_incoming_echoes_from_compacted_history removes the last item matching each incoming item by role/content. If remote compaction output does not actually echo the incoming item, but an older turn has identical text, this deletes legitimate history. Repeated user messages can be silently dropped after compaction, corrupting conversation context.
Useful? React with 👍 / 👎.
| @@ -203,11 +203,6 @@ async fn thread_compact_start_triggers_compaction_and_returns_empty_response() - | |||
| skip_if_no_network!(Ok(())); | |||
|
|
|||
| let server = responses::start_mock_server().await; | |||
There was a problem hiding this comment.
does test need renaming?
| &sess, | ||
| &turn_context, | ||
| auto_compact_limit, | ||
| &incoming_turn_items, |
There was a problem hiding this comment.
The combination of incoming_turn_items and pre_turn_context_items is very confusing. Especiallu considering incoming_turn_items == pre_turn_context_items + input
| sess.record_conversation_items(¤t_context, &update_items) | ||
| let has_user_input = !items.is_empty(); | ||
| if !has_user_input && !pre_turn_context_items.is_empty() { | ||
| // Empty-input UserTurn still needs these model-visible updates persisted now. |
There was a problem hiding this comment.
We were already persisting these items for empty user turn before. Why the extra condition?
| sess.take_startup_regular_task() | ||
| .await | ||
| .unwrap_or_default() | ||
| .with_pre_turn_context_items(pre_turn_context_items) |
There was a problem hiding this comment.
I really don't love how sometimes we directly inject history items and sometimes we carry them separately
| None, | ||
| ) | ||
| .await | ||
| { |
There was a problem hiding this comment.
There is a lot of new error reporting in many places. Should it be part of this PR and should it be done more centally?
| 04:message/user:<COMPACTION_SUMMARY>\nFIRST_MANUAL_SUMMARY | ||
| 00:message/user:first manual turn | ||
| 01:message/user:<COMPACTION_SUMMARY>\nFIRST_MANUAL_SUMMARY | ||
| 02:message/developer:<PERMISSIONS_INSTRUCTIONS> |
There was a problem hiding this comment.
didn't we want these pre summary?
There was a problem hiding this comment.
I guess not for manual compact.
| 01:message/user:<AGENTS_MD> | ||
| 02:message/user:<ENVIRONMENT_CONTEXT:cwd=<CWD>> | ||
| 03:message/user:<SUMMARIZATION_PROMPT> | ||
| Scenario: Manual /compact with no prior user turn is a no-op; follow-up turn carries canonical context and the new user message. |
There was a problem hiding this comment.
Do we special case the logic to skip compaction without user turns anywhere? Do we need to? Feels pretty rare.
| 03:message/user:<AGENTS_MD> | ||
| 04:message/user:<ENVIRONMENT_CONTEXT:cwd=<CWD>> | ||
| 05:message/user:<COMPACTION_SUMMARY>\nREMOTE_LATEST_SUMMARY | ||
| 02:message/user:<COMPACTION_SUMMARY>\nREMOTE_LATEST_SUMMARY |
There was a problem hiding this comment.
I don't think we'll ever have two summaries.
| if !update_items.is_empty() { | ||
| sess.record_conversation_items(¤t_context, &update_items) | ||
| let has_user_input = !items.is_empty(); | ||
| if !has_user_input && !pre_turn_context_items.is_empty() { |
There was a problem hiding this comment.
Is not content use turn a thing?
| incoming_user_tokens_estimate: i64, | ||
| auto_compact_limit: i64, | ||
| ) -> bool { | ||
| if auto_compact_limit == i64::MAX { |
There was a problem hiding this comment.
isn't this an effective noop?
|
Closing in favor of simpler approach |
Note: this PR is not vibe coded and I have carefully self reviewed.
Snapshot/shape test extraction now lives in a separate base PR: #11487.
This PR is stacked on top of
cc/compact-context-testsand contains the compaction/context runtime changes only.Summary
pre_turn_context_items+ latest user input), and if that compaction hits context-window limits we return an explicit oversized incoming-items error to the user.PreTurnCompactionOutcome::CompactedWithoutIncomingItemsas a reserved, test-only strategy for future model behavior; production pre-turn flow currently usesCompactedWithIncomingItems.previous_contextadvances:UserTurnTurnContextReinjection::ReinjectAboveLastRealUser) so unchanged instructions remain present if this is the only compaction pass before submission./compactnow no-ops when there is no user-turn boundary and marks initial context unseeded for the next turn so canonical context is reseeded before future user input.replacement_history, improving compact/resume/fork reconstruction consistency.Reviewer Focus
codex-rs/core/src/codex.rsrun_pre_turn_auto_compaction_if_neededpersist_pre_turn_items_for_compaction_outcomemaybe_run_previous_model_inline_compacthandlers::user_input_or_turn/run_turncodex-rs/core/src/compact.rsrun_compact_task_inner(trim/retry + protected tail semantics)build_compacted_history_with_limitprocess_compacted_historycodex-rs/core/src/compact_remote.rsrun_remote_compact_task_inner_impltrim_function_call_history_to_fit_context_windowcodex-rs/core/src/tasks/compact.rs/compactno-op + reseed semantics when no user-turn boundary existscodex-rs/core/tests/suite/compact.rscodex-rs/core/tests/suite/model_switching.rsCodex author
codex resume 019c40c5-7867-7843-b487-cce45a86a8bb