- Hold Space on empty composer to record; release to transcribe
- Block input and show 'Recording' hint while capturing
- Send audio to OpenAI Whisper (whisper-1) via reqwest multipart
- Resolve API key via codex_login auth (no env var read)
- Insert transcription into composer
Add cpal + hound deps for audio capture + WAV encoding.
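For readers unfamiliar with the WAV encoding step, here is a minimal sketch of packing mono 16-bit PCM samples into an in-memory WAV file using only the standard library. The commit above uses hound for this; the function name `encode_wav_mono_i16` and the hand-written header are illustrative assumptions, not the PR's code.

```rust
/// Encode mono 16-bit PCM samples as an in-memory WAV file (RIFF, PCM format tag 1).
/// Hypothetical helper for illustration only; the PR relies on the `hound` crate.
pub fn encode_wav_mono_i16(samples: &[i16], sample_rate: u32) -> Vec<u8> {
    let channels: u16 = 1;
    let bits_per_sample: u16 = 16;
    let block_align: u16 = channels * bits_per_sample / 8;
    let byte_rate: u32 = sample_rate * u32::from(block_align);
    let data_len: u32 = (samples.len() * 2) as u32;

    let mut out = Vec::with_capacity(44 + samples.len() * 2);
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&(36 + data_len).to_le_bytes()); // size of everything after this field
    out.extend_from_slice(b"WAVE");
    out.extend_from_slice(b"fmt ");
    out.extend_from_slice(&16u32.to_le_bytes()); // fmt chunk size for plain PCM
    out.extend_from_slice(&1u16.to_le_bytes()); // format tag 1 = uncompressed PCM
    out.extend_from_slice(&channels.to_le_bytes());
    out.extend_from_slice(&sample_rate.to_le_bytes());
    out.extend_from_slice(&byte_rate.to_le_bytes());
    out.extend_from_slice(&block_align.to_le_bytes());
    out.extend_from_slice(&bits_per_sample.to_le_bytes());
    out.extend_from_slice(b"data");
    out.extend_from_slice(&data_len.to_le_bytes());
    for s in samples {
        out.extend_from_slice(&s.to_le_bytes()); // samples are little-endian
    }
    out
}
```

The resulting buffer is what would be attached as the file part of the multipart upload; the upload itself is not shown here.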
- Insert atomic textarea element when transcription starts
- Keep textarea fully editable; element moves with edits
- Replace element by id when Whisper result returns; fallback insert at cursor
- Add element id support to TextArea (named elements + replace by id)
- Switch to AppEvent::TranscriptionComplete(id, text)
- Add AppEvent::TranscriptionFailed { id, error }
- On error, delete the placeholder element; leave editor state intact
- Fix voice thread to send failure event with correct id
- Keep success path replacing placeholder by id
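The success and failure paths described above can be sketched as follows. This is a simplified model under stated assumptions: `PlaceholderBuffer` and `Segment` are hypothetical stand-ins for the PR's TextArea, the buffer has no cursor (so the "insert at cursor" fallback appends at the end), and only the two event shapes named in the commits are mirrored.

```rust
/// One piece of composer content: plain text or an atomic named element.
#[derive(Debug, Clone, PartialEq)]
pub enum Segment {
    Text(String),
    Element { id: String, label: String },
}

/// Event shapes mirror the commit messages; everything else here is illustrative.
pub enum AppEvent {
    TranscriptionComplete(String, String),
    TranscriptionFailed { id: String, error: String },
}

/// Hypothetical stand-in for the TextArea with named elements.
#[derive(Debug, Default)]
pub struct PlaceholderBuffer {
    pub segments: Vec<Segment>,
}

impl PlaceholderBuffer {
    pub fn insert_placeholder(&mut self, id: &str, label: &str) {
        self.segments.push(Segment::Element { id: id.into(), label: label.into() });
    }

    fn position(&self, wanted: &str) -> Option<usize> {
        self.segments
            .iter()
            .position(|s| matches!(s, Segment::Element { id, .. } if id == wanted))
    }

    pub fn handle(&mut self, event: AppEvent) {
        match event {
            AppEvent::TranscriptionComplete(id, text) => match self.position(&id) {
                // Success: swap the placeholder for plain text, wherever it has moved to.
                Some(i) => self.segments[i] = Segment::Text(text),
                // Fallback: placeholder is gone (this sketch appends; the PR inserts at the cursor).
                None => self.segments.push(Segment::Text(text)),
            },
            AppEvent::TranscriptionFailed { id, .. } => {
                // Failure: delete only the placeholder; other content is untouched.
                if let Some(i) = self.position(&id) {
                    self.segments.remove(i);
                }
            }
        }
    }

    pub fn rendered(&self) -> String {
        self.segments
            .iter()
            .map(|s| match s {
                Segment::Text(t) => t.as_str(),
                Segment::Element { label, .. } => label.as_str(),
            })
            .collect()
    }
}
```

Keying the update on the element id rather than a text offset is what lets the user keep editing around the placeholder while the request is in flight.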
…ng' on release
- Insert named 'recording' element at start of capture
- On stop, change the same element text to 'transcribing' and send audio
- Remove footer 'Recording' hint
- Add TextArea::update_named_element_by_id to preserve element id
- On PageDown release, update existing element text to 'transcribing'
- Final transcription replaces element with plain text; errors delete it
- Route keys while recording; stop on Release or next key
- Use webrtc-vad to detect voiced frames (10ms)
- Aggressive mode + 200ms padding to avoid clipping
- Downmix to mono, resample to supported rates
- Trim leading/trailing silence before upload
- Skip upload and remove placeholder if no speech
- Add webrtc-vad dependency to TUI
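As an illustration of the downmix step only, here is a minimal sketch that averages the channels of each interleaved frame. The function name `downmix_to_mono` is hypothetical, and the voice-activity detection and resampling mentioned above are not shown.

```rust
/// Downmix interleaved 16-bit frames to mono by averaging the channels of each frame.
/// Hypothetical helper; a trailing incomplete frame is dropped.
pub fn downmix_to_mono(interleaved: &[i16], channels: usize) -> Vec<i16> {
    if channels <= 1 {
        // Already mono (or no channel count given): pass the samples through.
        return interleaved.to_vec();
    }
    interleaved
        .chunks_exact(channels)
        .map(|frame| {
            // Sum in i32 so that adding several i16 samples cannot overflow.
            let sum: i32 = frame.iter().map(|&s| i32::from(s)).sum();
            (sum / channels as i32) as i16
        })
        .collect()
}
```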
Fix push-to-talk voice mode where PageDown release didn't trigger transcription because Release events were filtered at the app layer. Now all key events are forwarded, allowing the composer to stop recording on release and send audio for transcription immediately.
- Short-clip handling: remove placeholder without transcribing when <1s
- Hold-to-talk: start immediately on empty textarea; skip space + delay
- Disable VAD trimming; always send full clip
- Add live recording meter with adaptive gain and compression
- Animate via new AppEvent::RecordingMeter and in-place updates
- Use atomic peak from audio callback to avoid blocking audio thread
- Normalize audio (peak with headroom) before WAV upload
- History nav: trigger on Press/Repeat only
- Hide cursor while recording
- Meter UI: 12-char sparkline, scrolling left, no label
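The atomic-peak idea and the scrolling sparkline can be sketched roughly like this. All names (`PeakMeter`, `bar_for`, `push_bar`) are assumptions made for illustration, the linear level mapping is a simplification, and the adaptive gain and compression mentioned above are not modeled.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicU16, Ordering};

/// Peak level shared between the audio callback (writer) and the UI tick (reader).
pub struct PeakMeter {
    peak: AtomicU16,
}

impl PeakMeter {
    pub const fn new() -> Self {
        Self { peak: AtomicU16::new(0) }
    }

    /// Called from the audio callback: a single lock-free update, so it never blocks.
    pub fn record(&self, samples: &[i16]) {
        let max = samples.iter().map(|s| s.unsigned_abs()).max().unwrap_or(0);
        self.peak.fetch_max(max, Ordering::Relaxed);
    }

    /// Called from the UI: returns the peak seen since the last call and resets it.
    pub fn take(&self) -> u16 {
        self.peak.swap(0, Ordering::Relaxed)
    }
}

const BARS: [char; 8] = ['▁', '▂', '▃', '▄', '▅', '▆', '▇', '█'];
const METER_WIDTH: usize = 12;

/// Map a peak in 0..=32768 onto one of eight bar glyphs (plain linear mapping).
pub fn bar_for(peak: u16) -> char {
    let idx = usize::from(peak) * BARS.len() / 32768;
    BARS[idx.min(BARS.len() - 1)]
}

/// Scroll left: drop the oldest bar once the meter is full, then append the newest.
pub fn push_bar(history: &mut VecDeque<char>, bar: char) {
    if history.len() == METER_WIDTH {
        history.pop_front();
    }
    history.push_back(bar);
}
```

Because the callback only performs an atomic `fetch_max`, the UI can poll at its own frame rate without ever holding a lock the audio thread might wait on.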
- Remove unused functions (to_mono_i16, resample_linear_i16, detect_voiced_bounds_webrtc)
- Prune unused imports (std::convert::TryFrom, webrtc-vad types)
- Remove webrtc-vad from tui/Cargo.toml
- Delete unused local in recording meter task
No behavior change; voice still records and transcribes full clip. Ran fmt/fix and tests for codex-tui.
- Remove AppEvent::SpaceHoldTimeout and app/chatwidget/bottom_pane handlers
- Manage 500ms hold via tokio::spawn that flips an atomic flag
- Convert to recording on next input event when flag is observed
Behavior: identical in typical terminals; on non-repeat terminals, starts on next key event after timeout.
…repeats
- Drop id from hold state and conversions
- Spawn tokio task that flips atomic flag and schedules a frame
- Process conversion in a new pre_draw_tick called before rendering
- Pass FrameRequester into ChatComposer; update tests accordingly
No AppEvent used for timeout; behavior now independent of key repeat.
…tick
- Remove key-event path for timeout processing; rely on frame scheduled by timer
- Keep local tokio task + atomic flag approach; fewer code paths
All tests pass.
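The timer-plus-atomic-flag approach from the last three commits can be sketched as below. This is a sketch under assumptions: it uses `std::thread` instead of the tokio task the PR describes, `HoldTimer` is a hypothetical name, and the `request_frame` closure stands in for the FrameRequester.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Tracks whether a key has been held long enough to switch into recording.
pub struct HoldTimer {
    elapsed: Arc<AtomicBool>,
    cancelled: Arc<AtomicBool>,
}

impl HoldTimer {
    /// Start the timer. After `hold`, flip the flag and ask for a redraw so the
    /// pre-draw tick can observe it without waiting for another key event.
    pub fn start<F>(hold: Duration, request_frame: F) -> Self
    where
        F: FnOnce() + Send + 'static,
    {
        let elapsed = Arc::new(AtomicBool::new(false));
        let cancelled = Arc::new(AtomicBool::new(false));
        let (e, c) = (Arc::clone(&elapsed), Arc::clone(&cancelled));
        thread::spawn(move || {
            thread::sleep(hold);
            if !c.load(Ordering::Acquire) {
                e.store(true, Ordering::Release);
                request_frame();
            }
        });
        Self { elapsed, cancelled }
    }

    /// Key released (or another key pressed) before the timeout.
    pub fn cancel(&self) {
        self.cancelled.store(true, Ordering::Release);
    }

    /// Called from the pre-draw tick: true exactly once after the hold elapsed.
    pub fn take_elapsed(&self) -> bool {
        self.elapsed.swap(false, Ordering::AcqRel)
    }
}
```

Scheduling a frame from the timer is what makes the behavior independent of key repeat: the flag is consumed on the next draw rather than on the next input event.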
- Replace static "transcribing" with animated braille spinner frames via RecordingMeter updates
- Spinner auto-stops after max duration or when placeholder is replaced/removed
All TUI tests pass.
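A braille spinner with a maximum duration might look roughly like this. The frame set, the `spinner_frame` name, and the interval parameter are assumptions for illustration; the commit does not state which glyphs or timing the PR uses.

```rust
use std::time::Duration;

/// A common ten-frame braille spinner sequence (assumed, not taken from the PR).
const SPINNER_FRAMES: [char; 10] = ['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏'];

/// Pick the frame to show after `elapsed`, or `None` once `max` has been reached,
/// which tells the caller to stop animating.
pub fn spinner_frame(elapsed: Duration, frame_interval: Duration, max: Duration) -> Option<char> {
    if elapsed >= max || frame_interval.is_zero() {
        return None;
    }
    // Nanosecond precision avoids dividing by zero for sub-millisecond intervals.
    let tick = (elapsed.as_nanos() / frame_interval.as_nanos()) as usize;
    Some(SPINNER_FRAMES[tick % SPINNER_FRAMES.len()])
}
```

Computing the frame from elapsed time, rather than incrementing a counter, keeps the animation speed stable even if some updates are skipped.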
- Insert a named element containing a space on Space press
- On release or cancel, replace the element with a plain space
- On timeout, remove the element and begin recording
Keeps behavior while simplifying state (no index math). All tests pass.
- Add stop_recording_and_start_transcription() and call from handle_key_event
- Keeps behavior; improves readability and testability
All TUI tests pass.
- Add start_recording_with_placeholder() and reuse for empty-text space press and hold-timeout
- Keeps behavior; consolidates meter placeholder + spawn logic
All TUI tests pass.
…lean up on drop
- Maintain stop flags for spinner tasks; stop on replace/remove or when update fails
- Implement Drop for ChatComposer to stop spinners and end capture on teardown
- Make RecordingMeter path schedule a frame only when update applied
This avoids runaway spinner tasks across UI changes (e.g., NewSession). All tests pass.
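The stop-flag cleanup can be illustrated with a small sketch. `SpinnerOwner` is a hypothetical stand-in for ChatComposer, and the background tasks are assumed to poll the returned flag and exit when it becomes true; ending audio capture is not modeled.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Owns one stop flag per running spinner task (illustrative stand-in for ChatComposer).
#[derive(Default)]
pub struct SpinnerOwner {
    spinner_stops: Vec<Arc<AtomicBool>>,
}

impl SpinnerOwner {
    /// Register a spinner task. The task keeps the returned flag and exits once it is set.
    pub fn register_spinner(&mut self) -> Arc<AtomicBool> {
        let stop = Arc::new(AtomicBool::new(false));
        self.spinner_stops.push(Arc::clone(&stop));
        stop
    }
}

impl Drop for SpinnerOwner {
    fn drop(&mut self) {
        // On teardown, signal every outstanding spinner task to stop.
        for stop in &self.spinner_stops {
            stop.store(true, Ordering::Relaxed);
        }
    }
}
```

Tying the signal to `Drop` means a task cannot outlive the composer by accident, whichever code path tears the composer down.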
…ance and 60s cap
- Remove explicit spinner stop flags and stop calls
- Spinner tasks auto-expire after 60s; UI ignores updates once placeholder is gone
- Keep Drop minimal: stop capture and clear placeholder
All TUI tests pass.
…isappearance and 60s cap" This reverts commit 5461929.
- Add ChatComposer helpers (ta_* wrappers) that auto-sync popups after text changes
- Use wrappers for programmatic edits (placeholders, spinner frames, space-hold element)
- Remove scattered manual sync calls accordingly
All TUI tests pass.
…y paths
- Revert to direct TextArea calls
- Ensure sync_command_popup/sync_file_search_popup are called in event handlers and key paths
- Keep on-space-hold timeout and recording flows consistent
All TUI tests pass.
- Centralize sync in handle_key_event end; for early-return branches, perform sync then return
- Remove ad-hoc syncs added inside match branches now covered by centralized sync
All TUI tests pass.
- Add ChatComposer::sync_popups() to unify command/file popup updates
- Call sync_popups after key events; remove scattered explicit sync calls
- BottomPane now triggers sync_popups after events (key, paste, inserts, pre-draw, history, transcription)
- Keeps behavior consistent and simplifies control flow; tests and snapshots pass
Collaborator
Author
@aibrahim-oai for the activation timing, the intention is that if the composer is empty, then pressing space should immediately trigger voice input. If there's text in the composer, there should be a delay. When you're talking about increasing the time to activate, are you talking about the empty-composer state or the non-empty-composer state?
Collaborator
I meant the empty composer. I think it's unexpected. Feels a bit buggy.
# Conflicts:
#   codex-rs/app-server/tests/suite/v2/thread_resume.rs
46a2439 to e4c09b4
# Conflicts:
#   codex-rs/tui/src/bottom_pane/mod.rs
# Conflicts:
#   MODULE.bazel.lock
#   codex-rs/Cargo.lock
#   codex-rs/tui/src/chatwidget.rs
# Conflicts:
#   MODULE.bazel.lock
Adds voice transcription on press-and-hold of spacebar.
Screen.Recording.2025-09-19.at.12.24.02.PM.mov