Skip to content

Merge upstream and adopt Kontext PKCE integration flow#2

Merged
michiosw merged 685 commits intokontext-devfrom
codex/kontext-pkce-integration
Feb 7, 2026
Merged

Merge upstream and adopt Kontext PKCE integration flow#2
michiosw merged 685 commits intokontext-devfrom
codex/kontext-pkce-integration

Conversation

@michiosw
Copy link

@michiosw michiosw commented Feb 7, 2026

Summary

  • merge latest upstream/main into this fork branch
  • switch Codex Kontext integration from legacy token helper calls to KontextDevClient
  • authenticate via PKCE + resource token exchange and attach MCP using static Authorization header
  • open Kontext integration connect page only when browser auth occurred in this session
  • update fork docs/config snippets to the new [kontext-dev] PKCE-based settings
  • pin workspace kontext-dev dependency to the new SDK commit (947bbda92b14d8d818d08c8de0235cd2c66c6842)

Validation

  • CARGO_NET_GIT_FETCH_WITH_CLI=true cargo test -p codex-core --no-run
  • CARGO_NET_GIT_FETCH_WITH_CLI=true cargo check -p codex-cli

Open with Devin

jif-oai and others added 30 commits January 30, 2026 11:13
Session renaming:
- `/rename my_session`
- `/rename` without arg and passing an argument in `customViewPrompt`
- AppExitInfo shows resume hint using the session name if set instead of
uuid, defaults to uuid if not set
- Names are stored in `CODEX_HOME/sessions.jsonl`

Session resuming:
- codex resume <name> lookup for `CODEX_HOME/sessions.jsonl` first entry
matching the name and resumes the session

---------

Co-authored-by: jif-oai <jif@openai.com>
Load requirements from Codex Backend. It only does this for enterprise
customers signed in with ChatGPT.

Todo in follow-up PRs:
* Add to app-server and exec too
* Switch from fail-open to fail-closed on failure
…ai#10208)

Previously, `CodexAuth` was defined as follows:


https://github.com/openai/codex/blob/d550fbf41afc09d7d7b5ac813aea38de07b2a73f/codex-rs/core/src/auth.rs#L39-L46

But if you looked at its constructors, we had creation for
`AuthMode::ApiKey` where `storage` was built using a nonsensical path
(`PathBuf::new()`) and `auth_dot_json` was `None`:


https://github.com/openai/codex/blob/d550fbf41afc09d7d7b5ac813aea38de07b2a73f/codex-rs/core/src/auth.rs#L212-L220

By comparison, when `AuthMode::ChatGPT` was used, `api_key` was always
`None`:


https://github.com/openai/codex/blob/d550fbf41afc09d7d7b5ac813aea38de07b2a73f/codex-rs/core/src/auth.rs#L665-L671

openai#10012 took things further because
it introduced a new `ChatgptAuthTokens` variant to `AuthMode`, which is
important in when invoking `account/login/start` via the app server, but
most logic _internal_ to the app server should just reason about two
`AuthMode` variants: `ApiKey` and `ChatGPT`.

This PR tries to clean things up as follows:

- `LoginAccountParams` and `AuthMode` in `codex-rs/app-server-protocol/`
both continue to have the `ChatgptAuthTokens` variant, though it is used
exclusively for the on-the-wire messaging.
- `codex-rs/core/src/auth.rs` now has its own `AuthMode` enum, which
only has two variants: `ApiKey` and `ChatGPT`.
- `CodexAuth` has been changed from a struct to an enum. It is a
disjoint union where each variant (`ApiKey`, `ChatGpt`, and
`ChatGptAuthTokens`) have only the associated fields that make sense for
that variant.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10208).
* openai#10224
* __->__ openai#10208
## Summary
Let users start opting in to trying out personalities

## Testing
- [x] existing tests pass
## Summary
Let's start getting feedback on this feature 😅 

## Testing
- [x] existing tests pass
`requirements.toml` should be able to specify rules which always run. 

My intention here was that these rules could only ever be restrictive,
which means the decision can be "prompt" or "forbidden" but never
"allow". A requirement of "you must always allow this command" didn't
make sense to me, but happy to be gaveled otherwise.

Rules already applies the most restrictive decision, so we can safely
merge these with rules found in other config folders.
## Summary
- Tightens Plan Mode to encourage exploration-first behavior and more
back-and-forth alignment.
- Adds a required TL;DR checkpoint before drafting the full plan.
- Clarifies client behavior that can cause premature “Implement this
plan?” prompts.

## What changed
- Require at least one targeted non-mutating exploration pass before the
first user question.
- Insert a TL;DR checkpoint between Phase 2 (intent) and Phase 3
(implementation).
- TL;DR checkpoint guidance:
  - Label: “Proposed Plan (TL;DR)”
  - Format: 3–5 bullets using `- `
  - Options: exactly one option, “Approve”
- `isOther: true`, with explicit guidance that “None of the above” is
the edit path in the current UI.
- Require the final plan to include a TL;DR consistent with the approved
checkpoint.

## Why
- In Plan Mode, any normal assistant message at turn completion is
treated as plan content by the client. This can trigger premature
“Implement this plan?” prompts.
- The TL;DR checkpoint aligns on direction before Codex drafts a long,
decision-complete plan.

## Testing
- Manual: built the local CLI and verified the flow now explores first,
presents a TL;DR checkpoint, and only drafts the full plan after
approval.

---------

Co-authored-by: Nick Baumann <@openai.com>
…penai#9786)

## Summary
- Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed
in core, emitting plan deltas plus a plan `ThreadItem`, while stripping
tags from normal assistant output.
- Persist plan items and rebuild them on resume so proposed plans show
in thread history.
- Wire plan items/deltas through app-server protocol v2 and render a
dedicated proposed-plan view in the TUI, including the “Implement this
plan?” prompt only when a plan item is present.

## Changes

### Core (`codex-rs/core`)
- Added a generic, line-based tag parser that buffers each line until it
can disprove a tag prefix; implements auto-close on `finish()` for
unterminated tags. `codex-rs/core/src/tagged_block_parser.rs`
- Refactored proposed plan parsing to wrap the generic parser.
`codex-rs/core/src/proposed_plan_parser.rs`
- In plan mode, stream assistant deltas as:
  - **Normal text** → `AgentMessageContentDelta`
  - **Plan text** → `PlanDelta` + `TurnItem::Plan` start/completion  
  (`codex-rs/core/src/codex.rs`)
- Final plan item content is derived from the completed assistant
message (authoritative), not necessarily the concatenated deltas.
- Strips `<proposed_plan>` blocks from assistant text in plan mode so
tags don’t appear in normal messages.
(`codex-rs/core/src/stream_events_utils.rs`)
- Persist `ItemCompleted` events only for plan items for rollout replay.
(`codex-rs/core/src/rollout/policy.rs`)
- Guard `update_plan` tool in Plan Mode with a clear error message.
(`codex-rs/core/src/tools/handlers/plan.rs`)
- Updated Plan Mode prompt to:  
  - keep `<proposed_plan>` out of non-final reasoning/preambles  
  - require exact tag formatting  
  - allow only one `<proposed_plan>` block per turn  
  (`codex-rs/core/templates/collaboration_mode/plan.md`)

### Protocol / App-server protocol
- Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items.
(`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`)
- Added v2 `ThreadItem::Plan` and `PlanDeltaNotification` with
EXPERIMENTAL markers and note that deltas may not match the final plan
item. (`codex-rs/app-server-protocol/src/protocol/v2.rs`)
- Added plan delta route in app-server protocol common mapping.
(`codex-rs/app-server-protocol/src/protocol/common.rs`)
- Rebuild plan items from persisted `ItemCompleted` events on resume.
(`codex-rs/app-server-protocol/src/protocol/thread_history.rs`)

### App-server
- Forward plan deltas to v2 clients and map core plan items to v2 plan
items. (`codex-rs/app-server/src/bespoke_event_handling.rs`,
`codex-rs/app-server/src/codex_message_processor.rs`)
- Added v2 plan item tests.
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

### TUI
- Added a dedicated proposed plan history cell with special background
and padding, and moved “• Proposed Plan” outside the highlighted block.
(`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`)
- Only show “Implement this plan?” when a plan item exists.
(`codex-rs/tui/src/chatwidget.rs`,
`codex-rs/tui/src/chatwidget/tests.rs`)

<img width="831" height="847" alt="Screenshot 2026-01-29 at 7 06 24 PM"
src="https://github.com/user-attachments/assets/69794c8c-f96b-4d36-92ef-c1f5c3a8f286"
/>

### Docs / Misc
- Updated protocol docs to mention plan deltas.
(`codex-rs/docs/protocol_v1.md`)
- Minor plumbing updates in exec/debug clients to tolerate plan deltas.
(`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`)

## Tests
- Added core integration tests:
  - Plan mode strips plan from agent messages.
  - Missing `</proposed_plan>` closes at end-of-message.  
  (`codex-rs/core/tests/suite/items.rs`)
- Added unit tests for generic tag parser (prefix buffering, non-tag
lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`)
- Existing app-server plan item tests in v2.
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

## Notes / Behavior
- Plan output no longer appears in standard assistant text in Plan Mode;
it streams via `PlanDelta` and completes as a `TurnItem::Plan`.
- The final plan item content is authoritative and may diverge from
streamed deltas (documented as experimental).
- Reasoning summaries are not filtered; prompt instructs the model not
to include `<proposed_plan>` outside the final plan message.

## Codex Author
`codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`
Title
Hide Code mode footer label/cycle hint; add Plan footer-collapse
snapshots

Summary
- Keep Code mode internal naming but suppress the footer mode label +
cycle hint when Code is active.
- Only show the cycle hint when a non‑Code mode indicator is present.
- Add Plan-mode footer collapse snapshot coverage (empty + queued,
across widths) and update existing footer collapse snapshots for the new
Code behavior.

Notes
- The test run currently fails in codex-cloud-requirements on
origin/main due to a stale auth.mode field; no fix is included in this
PR to keep the diff minimal.

Codex author
`codex resume 019c0296-cfd4-7193-9b0a-6949048e4546`
When using ChatGPT in names of types, we should be consistent, so this
renames some types with `ChatGpt` in the name to `Chatgpt`. From
https://rust-lang.github.io/api-guidelines/naming.html:

> In `UpperCamelCase`, acronyms and contractions of compound words count
as one word: use `Uuid` rather than `UUID`, `Usize` rather than `USize`
or `Stdin` rather than `StdIn`. In `snake_case`, acronyms and
contractions are lower-cased: `is_xid_start`.

This PR updates existing uses of `ChatGpt` and changes them to
`Chatgpt`. Though in all cases where it could affect the wire format, I
visually inspected that we don't change anything there. That said, this
_will_ change the codegen because it will affect the spelling of type
names.

For example, this renames `AuthMode::ChatGPT` to `AuthMode::Chatgpt` in
`app-server-protocol`, but the wire format is still `"chatgpt"`.

This PR also updates a number of types in `codex-rs/core/src/auth.rs`.
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
## Summary
- align proposed plan background with popup surface color by reusing
`user_message_bg`
- remove the custom blue-tinted plan background

<img width="1572" height="1568" alt="image"
src="https://github.com/user-attachments/assets/63a5341e-4342-4c07-b6b0-c4350c3b2639"
/>
Summary:
- Fixes issue openai#9932: openai#9932
- Prevents `$CODEX_HOME` (typically `~/.codex`) from being discovered as
a project `.codex` layer by skipping it during project layer traversal.
We compare both normalized absolute paths and best-effort canonicalized
paths to handle symlinks.
- Adds regression tests for home-directory invocation and for the case
where `CODEX_HOME` points to a project `.codex` directory (e.g.,
worktrees/editor integrations).

Testing:
- `cargo build -p codex-cli --bin codex`
- `cargo build -p codex-rmcp-client --bin test_stdio_server`
- `cargo test -p codex-core`
- `cargo test --all-features`
- Manual: ran `target/debug/codex` from `~` and confirmed the
disabled-folder warning and trust prompt no longer appear.
Fixes openai#9559.

When `shell_snapshot` runs, it may execute user startup files (e.g.
`.bashrc`). If those files read from stdin (or if stdin is an
interactive TTY under job control), the snapshot subprocess can block or
receive `SIGTTIN` (as reported over SSH).

This change explicitly sets `stdin` to `Stdio::null()` for the snapshot
subprocess, so it can't read from the terminal.

Regression test added that would hang/timeout without this change.
Tests: `ulimit -n 4096 && cargo test -p codex-core`.

cc @dongdongbh @etraut-openai

---------

Co-authored-by: Skylar Graika <sgraika127@gmail.com>
`/permissions` is the replacement. `/approvals` still available when
typing.
Extend the test for dev version
Instead of a separate walker for each root in a multi-root walk, use a
single walker.
seeing issues with azure after default-enabling web search: openai#10071,
openai#10257.

need to work with azure to fix api-side, for now turning off
default-enable of web_search for azure.

diff is big because i moved logic to reuse
jif-oai and others added 28 commits February 6, 2026 14:41
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
Summary
- add a `required` flag for MCP servers everywhere config/CLI data is
touched so mandatory helpers can be round-tripped
- have `codex exec` and `codex app-server` thread start/resume fail fast
when required MCPs fail to initialize
This is no longer needed because it's on by default
## Summary

This PR fixes a UI/streaming race when nudged or steer-enabled messages
are queued during an active Plan stream.

Previously, `submit_user_message_with_mode` switched collaboration mode
immediately (via `set_collaboration_mask`) even when the message was
queued. If that happened mid-Plan stream, `active_mode_kind` could flip
away from Plan before the turn finished, causing subsequent
`on_plan_delta` updates to be ignored in the UI.

Now, mode switching is deferred until the queued message is actually
submitted.

## What changed

- Added a per-message deferred mode override on `UserMessage`:
  - `collaboration_mode_override: Option<CollaborationModeMask>`
- Updated `submit_user_message_with_mode` to:
  - create a `UserMessage` carrying the mode override
- queue or submit that message without mutating global mode immediately
- Updated `submit_user_message` to:
- apply `collaboration_mode_override` just before constructing/sending
`Op::UserTurn`
- Kept queueing condition scoped to active Plan stream rendering:
- queue only while plan output is actively streaming in TUI
(`plan_stream_controller.is_some()`)

## Why

This preserves Plan mode for the remainder of the in-flight Plan turn,
so streamed plan deltas continue rendering correctly, while still
ensuring the follow-up queued message is sent with the intended
collaboration mode.

## Behavior after this change

- If a nudged/steer submission happens while Plan output is actively
streaming:
  - message is queued
  - UI stays in Plan mode for the running turn
- once dequeued/submitted, mode override is applied and the message is
sent in the intended mode
- If no Plan stream is active:
- submission proceeds immediately and mode override is applied as before

## Tests

Added/updated coverage in `tui/src/chatwidget/tests.rs`:

- `submit_user_message_with_mode_queues_while_plan_stream_is_active`
  - asserts mode remains Plan while queued
- asserts mode switches to Code when queued message is actually
submitted
- `submit_user_message_with_mode_submits_when_plan_stream_is_not_active`
- `steer_enter_queues_while_plan_stream_is_active`
- `steer_enter_submits_when_plan_stream_is_not_active`

Also updated existing `UserMessage { ... }` test fixtures to include the
new field.

## Codex author
`codex fork 019c1047-d5d5-7c92-a357-6009604dc7e8`
Adds app configs to config.toml + tests
…openai#10420)

## Summary
Add explicit, model-visible network policy decision metadata to blocked
proxy responses/errors.

Introduces a standardized prefix line: `CODEX_NETWORK_POLICY_DECISION
{json}`

and wires it through blocked paths for:
- HTTP requests
- HTTPS CONNECT
- SOCKS5 TCP/UDP denials

## Why
The model should see *why* a request was blocked
(reason/source/protocol/host/port) so it can choose the correct next
action.

## Notes
- This PR is intentionally independent of config-layering/network-rule
runtime integration.
- Focus is blocked decision surface only.
…icy (openai#10814)

## Summary

- Add seccomp deny rules for `io_uring` syscalls in the Linux sandbox
network policy.
- Specifically deny:
  - `SYS_io_uring_setup`
  - `SYS_io_uring_enter`
  - `SYS_io_uring_register`
## Problem
The first user turn can pay websocket handshake latency even when a
session has already started. We want to reduce that initial delay while
preserving turn semantics and avoiding any prompt send during startup.

Reviewer feedback also called out duplicated connect/setup paths and
unnecessary preconnect state complexity.

## Mental model
`ModelClient` owns session-scoped transport state. During session
startup, it can opportunistically warm one websocket handshake slot. A
turn-scoped `ModelClientSession` adopts that slot once if available,
restores captured sticky turn-state, and otherwise opens a websocket
through the same shared connect path.

If startup preconnect is still in flight, first turn setup awaits that
task and treats it as the first connection attempt for the turn.

Preconnect is handshake-only. The first `response.create` is still sent
only when a turn starts.

## Non-goals
This change does not make preconnect required for correctness and does
not change prompt/turn payload semantics. It also does not expand
fallback behavior beyond clearing preconnect state when fallback
activates.

## Tradeoffs
The implementation prioritizes simpler ownership and shared connection
code over header-match gating for reuse. The single-slot cache keeps
lifecycle straightforward but only benefits the immediate next turn.

Awaiting in-flight preconnect has the same app-level connect-timeout
semantics as existing websocket connect behavior (no new timeout class
introduced by this PR).

## Architecture
`core/src/client.rs`:
- Added session-level preconnect lifecycle state (`Idle` / `InFlight` /
`Ready`) carrying one warmed websocket plus optional captured
turn-state.
- Added `pre_establish_connection()` startup warmup and `preconnect()`
handshake-only setup.
- Deduped auth/provider resolution into `current_client_setup()` and
websocket handshake wiring into `connect_websocket()` /
`build_websocket_headers()`.
- Updated turn websocket path to adopt preconnect first, await in-flight
preconnect when present, then create a new websocket only when needed.
- Ensured fallback activation clears warmed preconnect state.
- Added documentation for lifecycle, ownership, sticky-routing
invariants, and timeout semantics.

`core/src/codex.rs`:
- Session startup invokes `model_client.pre_establish_connection(...)`.
- Turn metadata resolution uses the shared timeout helper.

`core/src/turn_metadata.rs`:
- Centralized shared timeout helper used by both turn-time metadata
resolution and startup preconnect metadata building.

`core/tests/common/responses.rs` + websocket test suites:
- Added deterministic handshake waiting helper (`wait_for_handshakes`)
with bounded polling.
- Added startup preconnect and in-flight preconnect reuse coverage.
- Fallback expectations now assert exactly two websocket attempts in
covered scenarios (startup preconnect + turn attempt before fallback
sticks).

## Observability
Preconnect remains best-effort and non-fatal. Existing
websocket/fallback telemetry remains in place, and debug logs now make
preconnect-await behavior and preconnect failures easier to reason
about.

## Tests
Validated with:
1. `just fmt`
2. `cargo test -p codex-core websocket_preconnect -- --nocapture`
3. `cargo test -p codex-core websocket_fallback -- --nocapture`
4. `cargo test -p codex-core
websocket_first_turn_waits_for_inflight_preconnect -- --nocapture`
…tory (openai#10574)

## Summary

When replaying compacted history (especially `replacement_history` from
remote compaction), we should not keep stale developer messages from
older session state. This PR trims developer-
role messages from compacted replacement history and reinjects fresh
developer instructions derived from current turn/session state.

This aligns compaction replay behavior with the intended "fresh
instructions after summary" model.

## Problem

Compaction replay had two paths:

- `Compacted { replacement_history: None }`: rebuilt with fresh initial
context
- `Compacted { replacement_history: Some(...) }`: previously used raw
replacement history as-is

The second path could carry stale developer instructions
(permissions/personality/collab-mode guidance) across session changes.

## What Changed

### 1) Added helper to refresh compacted developer instructions

- **File:** `codex-rs/core/src/compact.rs`
- **Function:** `refresh_compacted_developer_instructions(...)`

Behavior:
- remove all `ResponseItem::Message { role: "developer", .. }` from
compacted history
- append fresh developer messages from current
`build_initial_context(...)`

### 2) Applied helper in remote compaction flow

- **File:** `codex-rs/core/src/compact_remote.rs`
- After receiving compact endpoint output, refresh developer
instructions before replacing history and persisting
`replacement_history`.

### 3) Applied helper while reconstructing history from rollout

- **File:** `codex-rs/core/src/codex.rs`
- In `reconstruct_history_from_rollout(...)`, when processing
`Compacted` entries with `replacement_history`, refresh developer
instructions instead of directly replacing with raw history.

## Non-Goals / Follow-up

This PR does **not** address the existing first-turn-after-resume
double-injection behavior.
A follow-up PR will handle resume-time dedup/idempotence separately.

If you want, I can also give you a shorter “squash-merge friendly”
version of the description.

## Codex author
`codex fork 019c25e6-706e-75d1-9198-688ec00a8256`
…enai#10928)

These fields had always been documented as experimental/unstable with
docstrings, but now let's actually use the `experimental` annotation to
be more explicit.

- thread/start.experimentalRawEvents
- thread/resume.history
- thread/resume.path
- thread/fork.path
- turn/start.collaborationMode
- account/login/start.chatgptAuthTokens
- Return compaction errors from local and remote compaction flows.\n-
Stop turns/tasks when auto-compaction fails instead of continuing
execution.
**Test plan**

```
cargo build -p codex-cli && RUST_LOG='codex_api::endpoint::responses_websocket=trace,codex_core::client=debug,codex_core::codex=debug' \
  ./target/debug/codex \
    --enable responses_websockets_v2 \
    --profile byok \
    --full-auto
```
…t + resume (openai#10855)

## What changed

- In `codex-rs/core/src/skills/injection.rs`, we now honor explicit
`UserInput::Skill { name, path }` first, then fall back to text mentions
only when safe.
- In `codex-rs/tui/src/bottom_pane/chat_composer.rs`, mention selection
is now token-bound (selected mention is tied to the specific inserted
`$token`), and we snapshot bindings at submit time so selection is not
lost.
- In `codex-rs/tui/src/chatwidget.rs` and
`codex-rs/tui/src/bottom_pane/mod.rs`, submit/queue paths now consume
the submit-time mention snapshot (instead of rereading cleared composer
state).
- In `codex-rs/tui/src/mention_codec.rs` and
`codex-rs/tui/src/bottom_pane/chat_composer_history.rs`, history now
round-trips mention targets so resume restores the same selected
duplicate.
- In `codex-rs/tui/src/bottom_pane/skill_popup.rs` and
`codex-rs/tui/src/bottom_pane/chat_composer.rs`, duplicate labels are
normalized to `[Repo]` / `[App]`, app rows no longer show `Connected -`,
and description space is a bit wider.

<img width="550" height="163" alt="Screenshot 2026-02-05 at 9 56 56 PM"
src="https://github.com/user-attachments/assets/346a7eb2-a342-4a49-aec8-68dfec0c7d89"
/>
<img width="550" height="163" alt="Screenshot 2026-02-05 at 9 57 09 PM"
src="https://github.com/user-attachments/assets/5e04d9af-cccf-4932-98b3-c37183e445ed"
/>


## Before vs now

- Before: selecting a duplicate could still submit the default/repo
match, and resume could lose which duplicate was originally selected.
- Now: the exact selected target (skill path or app id) is preserved
through submit, queue/restore, and resume.

## Manual test

1. Build and run this branch locally:
   - `cd /Users/daniels/code/codex/codex-rs`
   - `cargo build -p codex-cli --bin codex`
   - `./target/debug/codex`
2. Open mention picker with `$` and pick a duplicate entry (not the
first one).
3. Confirm duplicate UI:
   - repo duplicate rows show `[Repo]`
   - app duplicate rows show `[App]`
   - app description does **not** start with `Connected -`
4. Submit the prompt, then press Up to restore draft and submit again.  
   Expected: it keeps the same selected duplicate target.
5. Use `/resume` to reopen the session and send again.  
Expected: restored mention still resolves to the same duplicate target.
…enai#10938)

This PR makes `Config.apps `experimental-only and fixes a TS schema
post-processing bug that removed needed imports. The bug happened
because import pruning only checked the inner type body after filtering,
not the full alias, so `JsonValue` got dropped from `Config.ts`. We now
prune against the full alias body and added a regression test for this
scenario.
…penai#10947)

TLDR: use new message phase field emitted by preamble-supported models
to determine whether an AgentMessage is mid-turn commentary. if so,
restore the status indicator afterwards to indicate the turn has not
completed.

### Problem
`commit_tick` hides the status indicator while streaming assistant text.
For preamble-capable models, that text can be commentary mid-turn, so
hiding was correct during streaming but restore timing mattered:
- restoring too aggressively caused jitter/flashing
- not restoring caused indicator to stay hidden before subsequent work
(tool calls, web search, etc.)

### Fix
- Add optional `phase` to `AgentMessageItem` and propagate it from
`ResponseItem::Message`
- Keep indicator hidden during streamed commit ticks, restore only when:
  - assistant item completes as `phase=commentary`, and
  - stream queues are idle + task is still running.
- Treat `phase=None` as final-answer behavior (no restore) to keep
existing behavior for non-preamble models

### Tests
Add/update tests for:
- no idle-tick restore without commentary completion
- commentary completion restoring status before tool begin
- snapshot coverage for preamble/status behavior

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
…ai#10965)

Summary:
- Rename config table from network_proxy to network.
- Flatten allowed_domains, denied_domains, allow_unix_sockets, and
allow_local_binding onto NetworkProxySettings.
- Update runtime, state constraints, tests, and README to the new config
shape.
## Summary

Stabilize v2 review integration tests by making them hermetic with
respect to model discovery.

`app-server` review tests were intermittently timing out in CI
(especially on Windows runners) because their test config allowed remote
model refresh. During `thread/start`, the test process could issue live
`/v1/models` requests, introducing external network latency and
nondeterministic timing before review flow assertions.

This change disables remote model fetching in the review test config
helper used by these tests.
… storms (openai#10710)

This PR changes stdio MCP child processes to run in their own process
group
* Add guarded teardown in codex-rmcp-client: send SIGTERM to the group
first, then SIGKILL after a short grace period.
* Add terminate_process_group helper in process_group.rs.
* Add Unix regression test in process_group_cleanup.rs to verify wrapper
+ grandchild are reaped on client drop.

Addresses reported MCP process/thread storm: openai#10581
…penai#10964)

This PR makes it possible to disable live web search via an enterprise
config even if the user is running in `--yolo` mode (though cached web
search will still be available). To do this, create
`/etc/codex/requirements.toml` as follows:

```toml
# "live" is not allowed; "disabled" is allowed even though not listed explicitly.
allowed_web_search_modes = ["cached"]
```

Or set `requirements_toml_base64` MDM as explained on
https://developers.openai.com/codex/security/#locations.

### Why
- Enforce admin/MDM/`requirements.toml` constraints on web-search
behavior, independent of user config and per-turn sandbox defaults.
- Ensure per-turn config resolution and review-mode overrides never
crash when constraints are present.

### What
- Add `allowed_web_search_modes` to requirements parsing and surface it
in app-server v2 `ConfigRequirements` (`allowedWebSearchModes`), with
fixtures updated.
- Define a requirements allowlist type (`WebSearchModeRequirement`) and
normalize semantics:
  - `disabled` is always implicitly allowed (even if not listed).
  - An empty list is treated as `["disabled"]`.
- Make `Config.web_search_mode` a `Constrained<WebSearchMode>` and apply
requirements via `ConstrainedWithSource<WebSearchMode>`.
- Update per-turn resolution (`resolve_web_search_mode_for_turn`) to:
- Prefer `Live → Cached → Disabled` when
`SandboxPolicy::DangerFullAccess` is active (subject to requirements),
unless the user preference is explicitly `Disabled`.
- Otherwise, honor the user’s preferred mode, falling back to an allowed
mode when necessary.
- Update TUI `/debug-config` and app-server mapping to display
normalized `allowed_web_search_modes` (including implicit `disabled`).
- Fix web-search integration tests to assert cached behavior under
`SandboxPolicy::ReadOnly` (since `DangerFullAccess` legitimately prefers
`live` when allowed).
Fixes openai#10869

- Gate TUI rate-limit polling on ChatGPT-auth providers only.
- `prefetch_rate_limits()` now checks `should_prefetch_rate_limits()`.
- New gate requires:
  - `config.model_provider.requires_openai_auth`
  - cached auth is ChatGPT (`CodexAuth::is_chatgpt_auth`)
- Prevents `/wham/usage` polling in API/custom-endpoint profiles.
…10921)

<img width="785" height="185" alt="Screenshot 2026-02-06 at 10 25 13 AM"
src="https://github.com/user-attachments/assets/402a6e79-4626-4df9-b3da-bc2f28e64611"
/>

<img width="784" height="213" alt="Screenshot 2026-02-06 at 10 26 37 AM"
src="https://github.com/user-attachments/assets/cf9614b2-aa1e-4c61-8579-1d2c7e1c7dc1"
/>

"left/right to navigate questions" in request_user_input footer
I did not wait for CI on openai#10980
because it was blocking an alpha release, but apparently it broken the
Windows build.
@michiosw michiosw changed the base branch from main to kontext-dev February 7, 2026 12:16
@michiosw michiosw merged commit ca87185 into kontext-dev Feb 7, 2026
4 of 33 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.