google-gemini · fayerman-source · Feb 1, 2026 · Feb 4, 2026 · Feb 5, 2026 · Feb 6, 2026
@@ -80,14 +80,28 @@ manually during a session.
 
 ### Planning Workflow
 
+Plan Mode uses an adaptive planning workflow in which research depth, plan
+structure, and consultation level scale with the task's complexity:
+
 1.  **Explore & Analyze:** Analyze requirements and use read-only tools to map
-    the codebase and validate assumptions. For complex tasks, identify at least
-    two viable implementation approaches.
-2.  **Consult:** Present a summary of the identified approaches via [`ask_user`]
-    to obtain a selection. For simple or canonical tasks, this step may be
-    skipped.
-3.  **Draft:** Once an approach is selected, write a detailed implementation
-    plan to the plans directory.
+    affected modules and identify dependencies.
+2.  **Consult:** The depth of consultation is proportional to the task's
+    complexity:
+    - **Simple Tasks:** Proceed directly to drafting.
+    - **Standard Tasks:** Present a summary of viable approaches via
+      [`ask_user`] for selection.
+    - **Complex Tasks:** Present detailed trade-offs for at least two viable
+      approaches via [`ask_user`] and obtain approval before drafting.
+3.  **Draft:** Write a detailed implementation plan to the
+    [plans directory](#custom-plan-directory-and-policies). The plan's structure
+    adapts to the task:
+    - **Simple Tasks:** Focuses on specific **Changes** and **Verification**
+      steps.
+    - **Standard Tasks:** Includes an **Objective**, **Key Files & Context**,
+      **Implementation Steps**, and **Verification & Testing**.
+    - **Complex Tasks:** Includes **Background & Motivation**, **Scope &
+      Impact**, **Proposed Solution**, **Alternatives Considered**, and
+      **Migration & Rollback** strategies.
 4.  **Review & Approval:** Use the [`exit_plan_mode`] tool to present the plan
     and formally request approval.
     - **Approve:** Exit Plan Mode and start implementation.
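
The tiers described in this hunk can be pictured as a lookup from task complexity to consultation depth and plan sections. The TypeScript below is only an illustrative sketch: `Complexity`, `PlanShape`, `PLAN_SHAPES`, and `planSkeleton` are hypothetical names, and the diff does not show how Plan Mode actually classifies a task or writes the plan file.

```typescript
// Hypothetical names throughout; not taken from the CLI's source.
type Complexity = 'simple' | 'standard' | 'complex';

interface PlanShape {
  // How much the user is consulted before drafting.
  consult: 'skip' | 'summary' | 'detailed-trade-offs';
  // Section headings the drafted plan is expected to contain.
  sections: readonly string[];
}

const PLAN_SHAPES: Readonly<Record<Complexity, PlanShape>> = {
  simple: {
    consult: 'skip',
    sections: ['Changes', 'Verification'],
  },
  standard: {
    consult: 'summary',
    sections: [
      'Objective',
      'Key Files & Context',
      'Implementation Steps',
      'Verification & Testing',
    ],
  },
  complex: {
    consult: 'detailed-trade-offs',
    sections: [
      'Background & Motivation',
      'Scope & Impact',
      'Proposed Solution',
      'Alternatives Considered',
      'Migration & Rollback',
    ],
  },
};

// Builds an empty Markdown outline for the given tier.
function planSkeleton(complexity: Complexity): string {
  return PLAN_SHAPES[complexity].sections
    .map((section) => `## ${section}\n`)
    .join('\n');
}
```

Keeping the tiers in one table makes it easy to check that the documentation and any prompt text describing the workflow stay in agreement.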

@@ -74,6 +74,15 @@ they appear in the UI.
 | Loading Phrases                      | `ui.loadingPhrases`                    | What to show while the model is working: tips, witty comments, both, or nothing.                                                                                  | `"tips"` |
 | Screen Reader Mode                   | `ui.accessibility.screenReader`        | Render output in plain-text to be more screen reader accessible                                                                                                   | `false`  |
 
+### Voice
+
+| UI Label              | Setting                  | Description                                                                                                                                                     | Default     |
+| --------------------- | ------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
+| Voice Input           | `voice.enabled`          | Enable voice input. When enabled, press **Alt+R** or **Ctrl+Q** to start/stop recording.                                                                        | `false`     |
+| Transcription Backend | `voice.provider`         | Transcription backend to use: `gemini` (zero-install, uses existing Gemini API auth) or `whisper` (local binary).                                               | `"gemini"`  |
+| Silence Detection     | `voice.silenceThreshold` | RMS energy threshold (0–1000) below which audio is discarded as silence. Lower values capture quieter speech (e.g. whispering). `0` disables silence detection. | `80`        |
+| Whisper Binary Path   | `voice.whisperPath`      | Path to the Whisper executable (e.g. `/usr/local/bin/whisper`). Only used when `voice.provider` is `"whisper"`.                                                 | `undefined` |
+
 ### IDE
 
 | UI Label | Setting       | Description                  | Default |

@@ -339,6 +339,33 @@ their corresponding top-level category object in your `settings.json` file.
   - **Default:** `false`
   - **Requires restart:** Yes
 
+#### `voice`
+
+- **`voice.enabled`** (boolean):
+  - **Description:** Enable voice input. When enabled, press **Alt+R** or
+    **Ctrl+Q** to start/stop recording. Use `/voice enable` or `/voice disable`
+    to toggle.
+  - **Default:** `false`
+
+- **`voice.provider`** (string: `"gemini"` | `"whisper"`):
+  - **Description:** Transcription backend. `gemini` uses the Gemini API with
+    your existing authentication (zero additional setup). `whisper` uses a
+    locally installed Whisper binary for offline or faster transcription.
+  - **Default:** `"gemini"`
+
+- **`voice.silenceThreshold`** (number, 0–1000):
+  - **Description:** RMS energy threshold for silence detection. Audio below
+    this level is discarded without an API call. Lower values capture quieter
+    speech (e.g. whispering). Set to `0` to disable silence detection and always
+    transcribe. Use `/voice sensitivity <value>` to adjust at runtime.
+  - **Default:** `80`
+
+- **`voice.whisperPath`** (string):
+  - **Description:** Path to the Whisper executable. Only used when
+    `voice.provider` is `"whisper"` (e.g. `/usr/local/bin/whisper` or
+    `~/.local/bin/whisper`). Use `/voice set-path <path>` to set at runtime.
+  - **Default:** `undefined`
+
 #### `ide`
 
 - **`ide.enabled`** (boolean):
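
The `voice.silenceThreshold` description above implies a gate of roughly the following shape. This is a sketch under stated assumptions, not the PR's implementation: it assumes signed 16-bit PCM samples and an RMS measured in raw sample units, and `rmsEnergy` and `isSilence` are hypothetical names.

```typescript
// Assumption: audio arrives as signed 16-bit PCM samples.

/** Root-mean-square amplitude of a sample buffer, in raw sample units. */
function rmsEnergy(samples: Int16Array): number {
  if (samples.length === 0) {
    return 0;
  }
  let sumOfSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumOfSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumOfSquares / samples.length);
}

/** True when a clip should be discarded before any transcription call. */
function isSilence(samples: Int16Array, threshold: number = 80): boolean {
  // A threshold of 0 disables silence detection, so nothing is discarded.
  if (threshold <= 0) {
    return false;
  }
  return rmsEnergy(samples) < threshold;
}
```

Under these assumptions, lowering the threshold lets quieter clips through to the transcription backend, which matches the documented behaviour for whispered speech.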

@@ -94,6 +94,12 @@ available combinations.
 | Open the current prompt in an external editor. | `Ctrl + X`                                                                                |
 | Paste from the clipboard.                      | `Ctrl + V`<br />`Cmd + V`<br />`Alt + V`                                                  |
 
+#### Voice Input
+
+| Action                                          | Keys                      |
+| ----------------------------------------------- | ------------------------- |
+| Toggle voice input recording (Alt+R or Ctrl+Q). | `Alt + R`<br />`Ctrl + Q` |
+
 #### App Controls
 
 | Action                                                                                                                                             | Keys             |

@@ -97,6 +97,9 @@ export enum Command {
   CLEAR_SCREEN = 'app.clearScreen',
   RESTART_APP = 'app.restart',
   SUSPEND_APP = 'app.suspend',
+
+  // Voice Input
+  VOICE_INPUT = 'input.voice',
 }
 
 /**
@@ -297,6 +300,12 @@ export const defaultKeyBindings: KeyBindingConfig = {
   [Command.CLEAR_SCREEN]: [{ key: 'l', ctrl: true }],
   [Command.RESTART_APP]: [{ key: 'r' }],
   [Command.SUSPEND_APP]: [{ key: 'z', ctrl: true }],
+
+  // Voice Input
+  [Command.VOICE_INPUT]: [
+    { key: 'r', alt: true }, // Alt+R
+    { key: 'q', ctrl: true }, // Ctrl+Q
+  ],
 };
 
 interface CommandCategory {
@@ -391,6 +400,10 @@ export const commandCategories: readonly CommandCategory[] = [
       Command.PASTE_CLIPBOARD,
     ],
   },
+  {
+    title: 'Voice Input',
+    commands: [Command.VOICE_INPUT],
+  },
   {
     title: 'App Controls',
     commands: [
@@ -525,4 +538,7 @@ export const commandDescriptions: Readonly<Record<Command, string>> = {
   [Command.CLEAR_SCREEN]: 'Clear the terminal screen and redraw the UI.',
   [Command.RESTART_APP]: 'Restart the application.',
   [Command.SUSPEND_APP]: 'Suspend the CLI and move it to the background.',
+
+  // Voice Input
+  [Command.VOICE_INPUT]: 'Toggle voice input recording (Alt+R or Ctrl+Q).',
 };
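
To make the two default bindings concrete, here is a minimal sketch of how a key event might be tested against them. The `KeyBinding` and `KeyEvent` shapes and the `matchesBinding` and `isVoiceToggle` helpers are hypothetical, and the rule that an omitted modifier must be absent is an assumption; the CLI's real matcher is not part of this diff.

```typescript
// Hypothetical shapes; the real binding and key event types are not shown in the diff.
interface KeyBinding {
  key: string;
  ctrl?: boolean;
  alt?: boolean;
}

interface KeyEvent {
  name: string;
  ctrl: boolean;
  alt: boolean;
}

// Mirrors the two defaults added for Command.VOICE_INPUT.
const voiceInputBindings: readonly KeyBinding[] = [
  { key: 'r', alt: true }, // Alt+R
  { key: 'q', ctrl: true }, // Ctrl+Q
];

// Assumption: a modifier omitted from a binding must be absent from the event.
function matchesBinding(binding: KeyBinding, event: KeyEvent): boolean {
  return (
    binding.key === event.name &&
    (binding.ctrl ?? false) === event.ctrl &&
    (binding.alt ?? false) === event.alt
  );
}

// True when the event matches any of the voice input bindings.
function isVoiceToggle(event: KeyEvent): boolean {
  return voiceInputBindings.some((binding) => matchesBinding(binding, event));
}
```

With strict modifier matching, a plain `r` keypress does not toggle recording, so it stays distinct from the existing `Command.RESTART_APP` binding on `r`.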
@@ -766,6 +766,57 @@ const SETTINGS_SCHEMA = {
     },
   },
 
+  voice: {
+    type: 'object',
+    label: 'Voice Input',
+    category: 'General',
+    requiresRestart: false,
+    default: {},
+    description: 'Settings for voice input.',
+    properties: {
+      enabled: {
+        type: 'boolean',
+        label: 'Enable Voice Input',
+        category: 'General',
+        requiresRestart: false,
+        default: false,
+        description: 'Enable voice input support.',
+        showInDialog: true,
+      },
+      provider: {
+        type: 'string',
+        label: 'Transcription Backend',
+        category: 'General',
+        requiresRestart: false,
+        default: undefined as string | undefined,
+        description:
+          'Transcription backend: "gemini" (default, zero-install) or "whisper" (local).',
+        showInDialog: true,
+      },
+      whisperPath: {
+        type: 'string',
+        label: 'Whisper Binary Path',
+        category: 'General',
+        requiresRestart: false,
+        default: undefined as string | undefined,
+        description:
+          'Path to the whisper executable. Only used when provider is "whisper".',
+        showInDialog: true,
+      },
+      silenceThreshold: {
+        type: 'number',
+        label: 'Silence Detection Threshold',
+        category: 'General',
+        requiresRestart: false,
+        default: 80,
+        description:
+          'RMS energy threshold (0–1000) below which audio is discarded as silence. ' +
+          'Lower values allow quieter speech such as whispering. 0 disables silence detection.',
+        showInDialog: true,
+      },
+    },
+  },
+
   ide: {
     type: 'object',
     label: 'IDE',

@@ -247,11 +247,12 @@ export async function runNonInteractive({
           settings,
         );
         // If a slash command is found and returns a prompt, use it.
-        // Otherwise, slashCommandResult falls through to the default prompt
-        // handling.
+        // Otherwise the slash command was already handled, so we are done.
         if (slashCommandResult) {
           // eslint-disable-next-line @typescript-eslint/no-unsafe-type-assertion
           query = slashCommandResult as Part[];
+        } else {
+          return;
         }
       }
 

@@ -90,6 +90,11 @@ export const handleSlashCommand = async (
         switch (result.type) {
           case 'submit_prompt':
             return result.content;
+          case 'message': {
+            const prefix = result.messageType?.toUpperCase() || 'INFO';
+            process.stdout.write(`[${prefix}] ${result.content}\n`);
+            return;
+          }
           case 'confirm_shell_commands':
             // This result indicates a command attempted to confirm shell commands.
             // However note that currently, ShellTool is excluded in non-interactive
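
Taken together, the `message` case here and the early `return` in `runNonInteractive` mean a command can either produce a prompt or print a line and stop. The sketch below restates that flow in isolation. `CommandResult`, `formatMessage`, and `handleResult` are hypothetical, simplified stand-ins (the real result union has more variants and the real prompt content is not a plain string), and the output sink is injected so the behaviour can be checked without touching `process.stdout`.

```typescript
// Hypothetical, simplified result shapes for illustration only.
type CommandResult =
  | { type: 'submit_prompt'; content: string }
  | { type: 'message'; messageType?: 'info' | 'error'; content: string };

// Same prefix rule as the diff: upper-case the message type, default to INFO.
function formatMessage(messageType: string | undefined, content: string): string {
  const prefix = messageType?.toUpperCase() || 'INFO';
  return `[${prefix}] ${content}\n`;
}

// Returns the prompt to send on, or undefined when the command is finished.
function handleResult(
  result: CommandResult,
  write: (text: string) => void,
): string | undefined {
  if (result.type === 'submit_prompt') {
    return result.content;
  }
  write(formatMessage(result.messageType, result.content));
  return undefined;
}
```

A caller that receives `undefined` has nothing to send to the model, which is the situation the new `else { return; }` branch covers.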

@@ -58,6 +58,7 @@ import { shellsCommand } from '../ui/commands/shellsCommand.js';
 import { vimCommand } from '../ui/commands/vimCommand.js';
 import { setupGithubCommand } from '../ui/commands/setupGithubCommand.js';
 import { terminalSetupCommand } from '../ui/commands/terminalSetupCommand.js';
+import { voiceCommand } from '../ui/commands/voiceCommand.js';
 
 /**
  * Loads the core, hard-coded slash commands that are an integral part
@@ -73,7 +74,10 @@ export class BuiltinCommandLoader implements ICommandLoader {
-   * @param _signal An AbortSignal (unused for this synchronous loader).
+   * @param signal An AbortSignal; if already aborted, no commands are loaded.
    * @returns A promise that resolves to an array of `SlashCommand` objects.
    */
-  async loadCommands(_signal: AbortSignal): Promise<SlashCommand[]> {
+  async loadCommands(signal: AbortSignal): Promise<SlashCommand[]> {
+    if (signal.aborted) {
+      return [];
+    }
     const handle = startupProfiler.start('load_builtin_commands');
 
     const isNightlyBuild = await isNightly(process.cwd());
@@ -185,6 +189,7 @@ export class BuiltinCommandLoader implements ICommandLoader {
       vimCommand,
       setupGithubCommand,
       terminalSetupCommand,
+      voiceCommand,
     ];
     handle?.end();
     return allDefinitions.filter((cmd): cmd is SlashCommand => cmd !== null);
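
The loader now combines two small patterns: an early exit when the signal is already aborted, and a type-guard filter that drops `null` entries. A stripped-down sketch follows; `CommandDef` and `loadDefinitions` are hypothetical stand-ins for `SlashCommand` and `loadCommands`, and the profiler and nightly-build check are left out.

```typescript
// Hypothetical stand-in for SlashCommand.
interface CommandDef {
  name: string;
}

async function loadDefinitions(
  signal: AbortSignal,
  definitions: ReadonlyArray<CommandDef | null>,
): Promise<CommandDef[]> {
  // Skip all work when the caller has already cancelled.
  if (signal.aborted) {
    return [];
  }
  // The type predicate removes nulls and narrows the element type.
  return definitions.filter((cmd): cmd is CommandDef => cmd !== null);
}
```

Note that the check only covers a signal that is aborted before loading starts; reacting to an abort that arrives mid-load would need additional checks or an `abort` event listener.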

@@ -34,6 +34,8 @@ import {
 import { type HistoryItemToolGroup, StreamingState } from '../ui/types.js';
 import { ToolActionsProvider } from '../ui/contexts/ToolActionsContext.js';
 import { AskUserActionsProvider } from '../ui/contexts/AskUserActionsContext.js';
+import { VoiceContext } from '../ui/contexts/VoiceContext.js';
+import type { VoiceInputReturn } from '../ui/hooks/useVoiceInput.js';
 import { TerminalProvider } from '../ui/contexts/TerminalContext.js';
 import {
   OverflowProvider,
@@ -554,6 +556,19 @@ export const mockAppState: AppState = {
   startupWarnings: [],
 };
 
+const mockVoiceReturn: VoiceInputReturn = {
+  isEnabled: true,
+  state: {
+    isRecording: false,
+    isTranscribing: false,
+    error: null,
+  },
+  startRecording: vi.fn(async () => {}),
+  stopRecording: vi.fn(async () => {}),
+  cancelRecording: vi.fn(async () => {}),
+  toggleRecording: vi.fn(async () => {}),
+};
+
 const mockUIActions: UIActions = {
   handleThemeSelect: vi.fn(),
   closeThemeDialog: vi.fn(),
@@ -634,6 +649,7 @@ export const renderWithProviders = (
     uiActions,
     persistentState,
     appState = mockAppState,
+    voice = mockVoiceReturn,
   }: {
     shellFocus?: boolean;
     settings?: LoadedSettings;
@@ -648,6 +664,7 @@ export const renderWithProviders = (
       set?: typeof persistentStateMock.set;
     };
     appState?: AppState;
+    voice?: VoiceInputReturn;
   } = {},
 ): RenderInstance & {
   simulateClick: (
@@ -741,32 +758,34 @@ export const renderWithProviders = (
                           config={config}
                           toolCalls={allToolCalls}
                         >
-                          <AskUserActionsProvider
-                            request={null}
-                            onSubmit={vi.fn()}
-                            onCancel={vi.fn()}
-                          >
-                            <KeypressProvider>
-                              <MouseProvider
-                                mouseEventsEnabled={mouseEventsEnabled}
-                              >
-                                <TerminalProvider>
-                                  <ScrollProvider>
-                                    <ContextCapture>
-                                      <Box
-                                        width={terminalWidth}
-                                        flexShrink={0}
-                                        flexGrow={0}
-                                        flexDirection="column"
-                                      >
-                                        {component}
-                                      </Box>
-                                    </ContextCapture>
-                                  </ScrollProvider>
-                                </TerminalProvider>
-                              </MouseProvider>
-                            </KeypressProvider>
-                          </AskUserActionsProvider>
+                          <VoiceContext.Provider value={voice}>
+                            <AskUserActionsProvider
+                              request={null}
+                              onSubmit={vi.fn()}
+                              onCancel={vi.fn()}
+                            >
+                              <KeypressProvider>
+                                <MouseProvider
+                                  mouseEventsEnabled={mouseEventsEnabled}
+                                >
+                                  <TerminalProvider>
+                                    <ScrollProvider>
+                                      <ContextCapture>
+                                        <Box
+                                          width={terminalWidth}
+                                          flexShrink={0}
+                                          flexGrow={0}
+                                          flexDirection="column"
+                                        >
+                                          {component}
+                                        </Box>
+                                      </ContextCapture>
+                                    </ScrollProvider>
+                                  </TerminalProvider>
+                                </MouseProvider>
+                              </KeypressProvider>
+                            </AskUserActionsProvider>
+                          </VoiceContext.Provider>
                         </ToolActionsProvider>
                       </OverflowProvider>
                     </UIActionsContext.Provider>