Conversation
Pull request overview
This PR implements local Whisper transcription for voice dictation, adding offline speech-to-text capabilities using Candle ML framework with GPU acceleration support (CUDA/Metal).
Changes:
- Adds local Whisper model support with voice activity detection (VAD) for automatic speech segmentation
- Implements model download management with progress tracking and cancellation
- Introduces auto-submit feature for voice commands ending with "submit"
Reviewed changes
Copilot reviewed 18 out of 22 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| ui/desktop/src/utils/analytics.ts | Adds 'auto_submit' action to voice dictation analytics tracking |
| ui/desktop/src/hooks/useAudioRecorder.ts | Replaces MediaRecorder with AudioContext/AudioWorklet for real-time VAD and automatic transcription |
| ui/desktop/src/components/settings/dictation/LocalModelManager.tsx | New component for managing local Whisper model downloads and selection |
| ui/desktop/src/components/settings/dictation/DictationSettings.tsx | Integrates LocalModelManager and updates provider labels |
| ui/desktop/src/components/ChatInput.tsx | Adds auto-submit detection, improved status display, and local provider error handling |
| ui/desktop/src/api/types.gen.ts | Adds API types for local provider, model management, and download progress |
| ui/desktop/src/api/sdk.gen.ts | Generates SDK methods for new dictation model endpoints |
| ui/desktop/src/api/index.ts | Exports new API types and methods |
| ui/desktop/openapi.json | Defines OpenAPI schema for model management endpoints |
| crates/goose/src/lib.rs | Exposes new dictation module |
| crates/goose/src/dictation/whisper.rs | Implements Whisper transcription with GGUF model loading and audio processing |
| crates/goose/src/dictation/providers.rs | Unified provider interface with local transcription support |
| crates/goose/src/dictation/mod.rs | Module definition for dictation subsystem |
| crates/goose/src/dictation/download_manager.rs | Manages async model downloads with progress tracking |
| crates/goose/Cargo.toml | Adds Candle ML dependencies with Metal/CUDA feature flags |
| crates/goose-server/src/routes/dictation.rs | Implements REST endpoints for model management and adds Groq provider |
| crates/goose-server/src/openapi.rs | Updates OpenAPI documentation with new schemas |
| crates/goose-server/Cargo.toml | Adds dependencies for model management |
Comments suppressed due to low confidence (1)

ui/desktop/src/hooks/useAudioRecorder.ts:1
- The regex removes content within parentheses, but it's unclear why. Add a comment explaining that this is likely removing Whisper's hallucinated or metadata text commonly output in parentheses, such as '(music playing)' or '(inaudible)'.

```ts
import { useState, useRef, useCallback, useEffect } from 'react';
```
```ts
// Import the worklet module - Vite will handle this correctly
const WORKLET_URL = new URL('../audio-capture-worklet.js', import.meta.url).href;
```
The worklet file 'audio-capture-worklet.js' is referenced but not included in the PR. This will cause a runtime error when attempting to load the AudioWorklet module. The file needs to be created or the path needs to be updated to point to an existing worklet implementation.
Suggested change:

```diff
-// Import the worklet module - Vite will handle this correctly
-const WORKLET_URL = new URL('../audio-capture-worklet.js', import.meta.url).href;
+// Inline AudioWorklet module to avoid relying on an external file.
+const WORKLET_CODE = `
+class AudioCaptureProcessor extends AudioWorkletProcessor {
+  process(inputs) {
+    const input = inputs[0];
+    if (input && input[0]) {
+      // Send the first channel's samples to the main thread.
+      this.port.postMessage(input[0]);
+    }
+    return true;
+  }
+}
+registerProcessor('audio-capture-processor', AudioCaptureProcessor);
+`;
+const WORKLET_URL = URL.createObjectURL(
+  new Blob([WORKLET_CODE], { type: 'application/javascript' }),
+);
```
```rust
    Ok(resampled)
}
```
The function save_wav_16khz is defined but never used in the file. Consider removing it if it's not needed, or add a usage comment if it's intended for future use or debugging purposes.
Suggested change:

```rust
// Helper function kept for debugging and manual inspection of 16kHz audio.
// Not used in normal operation; retained for future troubleshooting.
```
```rust
        suppress_tokens: SUPPRESS_TOKENS.to_vec(),
        max_target_positions: 448,
    },
    _ => panic!("Unknown model: {}", self.id),
```
Using panic! for an unknown model ID creates a poor user experience. Since this is a runtime configuration issue, it should return a Result with a descriptive error instead of panicking. Consider using anyhow::bail! to maintain the function's error handling pattern.
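A minimal sketch of the `Result`-based shape (the `Config` struct here is a stand-in for the real candle Whisper config, not the PR's type, and the enclosing function is assumed to change from returning `Config` to `anyhow::Result<Config>`):

```rust
use anyhow::{bail, Result};

// Illustrative stand-in for the real candle Whisper Config type.
struct Config {
    num_mel_bins: usize,
}

// Returning Result lets callers surface "unknown model" as a normal error
// instead of crashing the process on a runtime configuration issue.
fn whisper_config(model_id: &str) -> Result<Config> {
    match model_id {
        "tiny" => Ok(Config { num_mel_bins: 80 }),
        // ...other known model arms as in the original match...
        _ => bail!("Unknown model: {model_id}"),
    }
}
```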
```ts
if (!filteredText) {
  return;
}
```
The regex pattern for auto-submit detection is complex and should be documented. Add a comment explaining what patterns it matches (e.g., 'submit', 'submit.', 'submit!', etc.) and why trailing punctuation and whitespace are included.
Suggested change:

```ts
// Detect spoken "submit" commands at the end of the transcribed text.
// - \bsubmit      : match the word "submit" at a word boundary (case-insensitive because of /i)
// - [.,!?;'"\s]*$ : allow any trailing punctuation or whitespace up to the end of the string
// This lets users say phrases like "submit", "submit.", "submit!", or "submit " to auto-send
// the message, while excluding the "submit" keyword and its trailing punctuation/whitespace
// from the final message content when we later strip this pattern out.
```
```rust
let mut response = client.get(url).send().await?;

if !response.status().is_success() {
    anyhow::bail!("Failed to download: HTTP {}", response.status());
```
The error message only includes the HTTP status code without the response body. For better debugging, include the response text if available, similar to the error handling pattern in providers.rs lines 216-227.
Suggested change:

```diff
-anyhow::bail!("Failed to download: HTTP {}", response.status());
+let status = response.status();
+let body_text = match response.text().await {
+    Ok(text) => text,
+    Err(e) => format!("<failed to read response body: {e}>"),
+};
+anyhow::bail!("Failed to download: HTTP {status}, body: {body_text}");
```
```ts
};

const deleteModel = async (modelId: string) => {
  if (!confirm('Delete this model? You can re-download it later.')) return;
```
Using the native confirm dialog is not ideal for a modern UI. Consider using a custom confirmation modal or dialog component that matches the application's design system and provides better UX with clear action buttons.
```rust
use std::sync::Arc;
use std::time::Duration;
use utoipa::ToSchema;
```
The maximum audio size was increased from 25MB to 50MB, but the OpenAPI documentation at line 130 still states 'max 25MB'. Update the documentation to reflect the new 50MB limit or add a comment explaining the discrepancy.
Suggested change:

```rust
/// Maximum allowed audio size in bytes for dictation uploads (currently 50MB).
/// NOTE: This value was increased from 25MB to 50MB; ensure the OpenAPI documentation
/// reflects this 50MB limit and is kept in sync with this constant.
```
```ts
const RMS_THRESHOLD = 0.015;

// Import the worklet module - Vite will handle this correctly
const WORKLET_URL = new URL('../audio-capture-worklet.js', import.meta.url).href;
```
The audio worklet file path resolution using import.meta.url may fail in production builds if Vite doesn't properly handle the worklet module. Verify that the Vite configuration includes proper worklet handling, or consider inlining the worklet code as a blob URL for more reliable runtime behavior.
```ts
// Append transcribed text to the current input
const newValue = displayValue.trim() ? `${displayValue.trim()} ${text}` : text;

let filteredText = text.replace(/\([^)]*\)/g, '').trim();
```
The filter regex /\([^)]*\)/g on line 267 removes all text within parentheses from transcriptions. This could unintentionally remove legitimate content if the transcription service includes parenthetical information. Document why this filtering is necessary or make it more specific to the expected format.
Suggested change:

```diff
-let filteredText = text.replace(/\([^)]*\)/g, '').trim();
+// Some dictation providers wrap non-verbal markers (e.g. pauses or background noise)
+// in parentheses; we strip only these known markers to avoid removing legitimate
+// parenthetical content from the user's message.
+const NON_VERBAL_MARKER_REGEX = /\((?:pause|background noise|inaudible|unintelligible)\)/gi;
+let filteredText = text.replace(NON_VERBAL_MARKER_REGEX, '').trim();
```
```ts
const resp = await getDictationConfig();
setIsEnabled(!!resp.data?.[pref]?.configured);
setProvider(pref);
} catch {
```
Empty catch block silently swallows errors when loading models or selected model configuration. This makes it impossible to debug configuration issues. Consider logging the error before setting defaults.
Suggested change:

```diff
-} catch {
+} catch (error) {
+  onErrorRef.current(errorMessage(error));
```
```tsx
onBlur={() => setIsFocused(false)}
ref={textAreaRef}
rows={1}
readOnly={isRecording}
```
The textarea is set to readOnly when recording (line 1252), but the onChange handler is still active. This could lead to confusion if paste events or programmatic changes bypass the readonly attribute. Consider also disabling the onChange handler or checking isRecording in the handler.
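A minimal sketch of guarding the handler as well (the handler name is illustrative; `isRecording` and `setDisplayValue` are taken from the snippets quoted in this thread):

```tsx
// Hypothetical change handler: ignore edits while recording, so paste and
// other non-typing input paths can't bypass the readOnly attribute.
const handleTextAreaChange = (e: React.ChangeEvent<HTMLTextAreaElement>) => {
  if (isRecording) return;
  setDisplayValue(e.target.value);
};
```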
```ts
const MAX_RECORDING_DURATION_SECONDS = 10 * 60;
const SAMPLE_RATE = 16000;
const SILENCE_MS = 1200;
const MIN_SPEECH_MS = 300;
```
The RMS_THRESHOLD constant (0.015) is a magic number without explanation. This threshold determines speech detection sensitivity, which is critical for VAD functionality. Add a comment explaining how this value was determined and what audio amplitude range it corresponds to.
Suggested change:

```diff
 const MIN_SPEECH_MS = 300;
+// RMS is computed over Web Audio Float32 samples in the [-1, 1] range; 0.015 (~1.5% of full-scale)
+// was chosen empirically as the quietest level that reliably distinguishes normal speech from
+// typical background noise for 16 kHz mono mic input without clipping early speech onsets.
```
```rust
        suppress_tokens: SUPPRESS_TOKENS.to_vec(),
        max_target_positions: 448,
    },
    _ => panic!("Unknown model: {}", self.id),
```
The panic at line 143 violates Rust error handling best practices. Instead of panicking in library code, return an error using anyhow::bail! to maintain graceful error handling.
Suggested change:

```diff
-_ => panic!("Unknown model: {}", self.id),
+// Fallback to the "tiny" configuration for unknown model IDs to avoid panicking.
+_ => Config {
+    num_mel_bins: 80,
+    max_source_positions: 1500,
+    d_model: 384,
+    encoder_attention_heads: 6,
+    encoder_layers: 4,
+    decoder_attention_heads: 6,
+    decoder_layers: 4,
+    vocab_size: 51865,
+    suppress_tokens: SUPPRESS_TOKENS.to_vec(),
+    max_target_positions: 448,
+},
```
```rust
.map_err(|e| {
    e
})
.context("Failed to create audio decoder - please ensure browser sends WAV format audio")?;
```
The variable 'e' is intentionally silenced on line 602, but there's no logging or error handling. If the decoder creation fails, users won't know why, making debugging difficult. Consider adding a tracing::error! or including error context in the returned error.
Suggested change:

```diff
-.map_err(|e| {
-    e
-})
-.context("Failed to create audio decoder - please ensure browser sends WAV format audio")?;
+.context("Failed to create audio decoder - please ensure browser sends WAV format audio")?;
```
```ts
};

const deleteModel = async (modelId: string) => {
  if (!confirm('Delete this model? You can re-download it later.')) return;
```
Using window.confirm for a delete confirmation in the UI creates a blocking, non-customizable browser dialog. Consider using a proper modal component that matches the application's design system and provides better UX.
```ts
if (hasDownloadedNonRecommended && !showAllModels) {
  setShowAllModels(true);
}
}, [models]);
```
The useEffect at lines 41-51 has an incomplete dependency array. It reads 'models' and 'showAllModels' but only includes 'models' in the dependencies. This could lead to stale closures. Add 'showAllModels' to the dependency array or restructure the logic.
Suggested change:

```diff
-}, [models]);
+}, [models, showAllModels]);
```
```ts
  return;
}

const shouldAutoSubmit = /\bsubmit[.,!?;'"\s]*$/i.test(filteredText);
```
The regex pattern /\bsubmit[.,!?;'"\s]*$/i on line 275 only matches "submit" at the end of the text. If a user says "please submit this", it won't trigger auto-submit because "this" comes after "submit". Confirm that requiring "submit" as the final word (followed only by punctuation/whitespace) is the intended behavior, and document it.
```toml
repository.workspace = true
description.workspace = true

[features]
```
will this mean it only works on macOS?

I think it means we compile this for all platforms, but it will only trigger on macOS. We should test, of course!
```ts
const pollDownloadProgress = (modelId: string) => {
  const interval = setInterval(async () => {
    try {
      const response = await getDownloadProgress({ path: { model_id: modelId } });
      if (response.data) {
        const progress = response.data;
        setDownloads((prev) => new Map(prev).set(modelId, progress));

        if (progress.status === 'completed') {
          clearInterval(interval);
          await loadModels(); // Refresh model list
          // Backend auto-selects, but also update frontend state
          await loadSelectedModel();
        } else if (progress.status === 'failed') {
          clearInterval(interval);
          await loadModels();
        }
      } else {
        clearInterval(interval);
      }
    } catch (error) {
      clearInterval(interval);
    }
  }, 500);
};
```
The polling interval is never cleaned up properly. If the component unmounts while polling is active, the interval will continue running indefinitely, causing a memory leak. Store the interval ID in a ref and clean it up in a useEffect cleanup function.
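A minimal sketch of that cleanup pattern, to live inside the component (the polling body is elided; `pollDownloadProgress` is the function quoted above):

```tsx
import { useEffect, useRef } from 'react';

// Track interval IDs in a ref so unmount can clear them.
const intervalsRef = useRef<Map<string, ReturnType<typeof setInterval>>>(new Map());

const pollDownloadProgress = (modelId: string) => {
  const interval = setInterval(async () => {
    // ...poll getDownloadProgress and clearInterval(interval) on completed/failed,
    // as in the original implementation...
  }, 500);
  intervalsRef.current.set(modelId, interval);
};

useEffect(() => {
  const intervals = intervalsRef.current;
  return () => {
    // Runs on unmount: stop any still-active polls.
    intervals.forEach((id) => clearInterval(id));
    intervals.clear();
  };
}, []);
```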
```rust
pub fn cancel_download(&self, model_id: &str) -> Result<()> {
    let mut downloads = self
        .downloads
        .lock()
        .map_err(|_| anyhow::anyhow!("Failed to acquire lock"))?;

    if let Some(progress) = downloads.get_mut(model_id) {
        progress.status = DownloadStatus::Cancelled;
        Ok(())
    } else {
        anyhow::bail!("Download not found")
    }
}
```
The cancel_download method only sets a flag in the DownloadProgress struct, but the download task doesn't check for cancellation signals properly. The task checks the status inside the downloads HashMap on each chunk, but there's no mechanism to interrupt the ongoing HTTP request. Consider using a cancellation token (e.g., tokio_util::sync::CancellationToken) to properly abort the download.
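For illustration, a minimal sketch of the token approach (the struct shape and field names are assumptions, not the PR's actual types; assumes a tokio-util dependency):

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use tokio_util::sync::CancellationToken;

// Hypothetical: one token per in-flight download, kept alongside the progress map.
struct CancellableDownloads {
    tokens: Mutex<HashMap<String, CancellationToken>>,
}

impl CancellableDownloads {
    fn cancel_download(&self, model_id: &str) {
        if let Some(token) = self.tokens.lock().unwrap().get(model_id) {
            token.cancel(); // wakes the select! below and aborts the in-flight read
        }
    }
}

// Inside the download task, race each chunk read against cancellation:
async fn read_chunks(
    response: &mut reqwest::Response,
    token: &CancellationToken,
) -> anyhow::Result<()> {
    loop {
        tokio::select! {
            _ = token.cancelled() => anyhow::bail!("download cancelled"),
            chunk = response.chunk() => match chunk? {
                Some(bytes) => { /* write bytes to the destination file */ let _ = bytes; }
                None => return Ok(()),
            },
        }
    }
}
```

This interrupts the HTTP transfer at the next chunk boundary rather than merely flagging the status map.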
```rust
} else {
    usage_ratio > threshold
};
```
The removed debug logging provided valuable information about token usage and compaction thresholds. While reducing logging is generally good, this specific logging was useful for debugging context management issues. Consider whether this information is truly unnecessary or if it should be kept at a higher log level.
michaelneale left a comment:
seems ok - was able to download, and it worked with my accent (though not sure if I was doing it wrong, as it didn't submit for me - but that was ok).
Probably some valid nits from Copilot, but nice (assuming I tried the right thing). I wonder if we could make it more obvious that we recommend the local one. The word "Offline" at first made me think it was busted!
(could also have the microphone icon back on main chat which takes you to the config, as the config seems so complex now, or ... maybe rich I should say!)
Looks good! I think we can do a little better with the platform-specific candle features. If we move the candle-core and candle-nn deps to be target-specific dependencies, we can automatically set the features on the appropriate targets without needing our own cuda/metal features:
untested:
```toml
[dependencies]
# ...
candle-core = { version = "0.8" }
candle-nn = { version = "0.8" }

[target.'cfg(target_os = "macos")'.dependencies]
candle-core = { version = "0.8", features = ["metal"] }
candle-nn = { version = "0.8", features = ["metal"] }

[target.'cfg(target_os = "linux")'.dependencies]
candle-core = { version = "0.8", features = ["cuda"] }
candle-nn = { version = "0.8", features = ["cuda"] }
```

(and remove the features on our crate as no longer needed)
```text
# Conflicts:
#   Cargo.lock
```

```text
# Conflicts:
#   crates/goose-server/src/openapi.rs
#   crates/goose-server/src/routes/dictation.rs
#   ui/desktop/openapi.json
#   ui/desktop/src/api/index.ts
#   ui/desktop/src/api/sdk.gen.ts
#   ui/desktop/src/api/types.gen.ts
#   ui/desktop/src/components/ChatInput.tsx
#   ui/desktop/src/components/settings/dictation/DictationSettings.tsx
#   ui/desktop/src/hooks/useAudioRecorder.ts
```
Actually @DOsinga that feels true for metal, but cuda you could have on any platform, no?
```ts
const shouldAutoSubmit = /\bsubmit[.,!?;'"\s]*$/i.test(filteredText);

const cleanedText = shouldAutoSubmit
  ? filteredText.replace(/\bsubmit[.,!?;'"\s]*$/i, '').trim()
```
The regex pattern allows "submit" to appear anywhere within a word (e.g., "resubmit", "submitted"). The word boundary \b only applies at the start. Consider using \bsubmit\b to match only the complete word "submit".
Suggested change:

```diff
-const shouldAutoSubmit = /\bsubmit[.,!?;'"\s]*$/i.test(filteredText);
-const cleanedText = shouldAutoSubmit
-  ? filteredText.replace(/\bsubmit[.,!?;'"\s]*$/i, '').trim()
+const shouldAutoSubmit = /\bsubmit\b[.,!?;'"\s]*$/i.test(filteredText);
+const cleanedText = shouldAutoSubmit
+  ? filteredText.replace(/\bsubmit\b[.,!?;'"\s]*$/i, '').trim()
```
```rust
tokio::spawn(async move {
    match Self::download_file(&url, &destination, &downloads, &model_id_clone).await {
        Ok(_) => {
            if let Ok(mut downloads) = downloads.lock() {
                if let Some(progress) = downloads.get_mut(&model_id_clone) {
                    progress.status = DownloadStatus::Completed;
                    progress.progress_percent = 100.0;
                }
            }

            let _ = crate::config::Config::global()
                .set_param(LOCAL_WHISPER_MODEL_CONFIG_KEY, model_id_clone.clone());
        }
        Err(e) => {
            if let Ok(mut downloads) = downloads.lock() {
                if let Some(progress) = downloads.get_mut(&model_id_clone) {
                    progress.status = DownloadStatus::Failed;
                    progress.error = Some(e.to_string());
                }
            }
        }
    }
});
```
The download is spawned in a background task (tokio::spawn) but errors during download only update the internal state. Users have no notification if a download fails - they need to poll the progress endpoint to discover this. Consider adding a notification mechanism or event system to alert users of failures.
```rust
if should_cancel {
    // Clean up partial download
    let _ = tokio::fs::remove_file(destination).await;
    return Ok(());
}
```
The partial download file is only cleaned up if the cancellation check succeeds (line 189). If the task panics or is aborted before reaching this check, or if there's an error after the check but before completion, the partial file may be left on disk. Consider using RAII patterns or ensuring cleanup happens in all error paths.
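A minimal sketch of the RAII idea: a guard that deletes the partial file on drop unless the download completes and disarms it (the type and its names are illustrative, not from the PR):

```rust
use std::path::PathBuf;

// Hypothetical guard: removes the partial file on drop unless disarmed.
struct PartialFileGuard {
    path: PathBuf,
    armed: bool,
}

impl PartialFileGuard {
    fn new(path: PathBuf) -> Self {
        Self { path, armed: true }
    }

    // Call once the download has finished successfully.
    fn disarm(&mut self) {
        self.armed = false;
    }
}

impl Drop for PartialFileGuard {
    fn drop(&mut self) {
        if self.armed {
            // Runs on early return, error propagation, and panic unwinding alike.
            let _ = std::fs::remove_file(&self.path);
        }
    }
}
```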
```diff
@@ -212,17 +212,6 @@ pub async fn check_if_compaction_needed(
 } else {
     usage_ratio > threshold
 };
```
The comment says the logging has been removed to reduce noise ("the codebase needs less logging, not more"), but this was debug-level logging that could be useful for troubleshooting token usage and context compaction issues. Consider keeping this at trace level instead of removing it entirely.
Suggested change:

```diff
 };
+tracing::trace!(
+    current_tokens = current_tokens,
+    token_source = token_source,
+    context_limit = context_limit,
+    usage_ratio = usage_ratio,
+    threshold = threshold,
+    needs_compaction = needs_compaction,
+    "context compaction decision"
+);
```
```rust
let _ = crate::config::Config::global()
    .set_param(LOCAL_WHISPER_MODEL_CONFIG_KEY, model_id_clone.clone());
```
Using .set_param() could silently fail here. Consider adding error handling or logging if the auto-selection fails after a successful download.
Suggested change:

```diff
-let _ = crate::config::Config::global()
-    .set_param(LOCAL_WHISPER_MODEL_CONFIG_KEY, model_id_clone.clone());
+if let Err(e) = crate::config::Config::global()
+    .set_param(LOCAL_WHISPER_MODEL_CONFIG_KEY, model_id_clone.clone())
+{
+    eprintln!(
+        "Failed to set local whisper model config after download for model {}: {}",
+        model_id_clone, e
+    );
+}
```
```rust
if let Some(tokenizer_json) = bundled_tokenizer {
    if let Some(parent) = tokenizer_path.parent() {
        std::fs::create_dir_all(parent)?;
    }
    std::fs::write(&tokenizer_path, tokenizer_json)?;
    return Tokenizer::from_file(tokenizer_path)
        .map_err(|e| anyhow::anyhow!("Failed to load tokenizer: {}", e));
}
```
The tokenizer is written to disk (line 270) and then immediately re-read via Tokenizer::from_file. Write errors are propagated by the `?`, but the round trip is unnecessary and a post-write read can still fail unexpectedly; consider constructing the tokenizer directly from the in-memory bytes (e.g. Tokenizer::from_bytes).
```tsx
onBlur={() => setIsFocused(false)}
ref={textAreaRef}
rows={1}
readOnly={isRecording}
```
The textarea is made read-only during recording (line 1244), but the value can still be programmatically updated via setValue() and setDisplayValue() from transcription callbacks. If a transcription arrives while still recording, the user won't be able to edit the inserted text until they stop recording. Consider whether read-only should only prevent manual typing but not programmatic updates, or if the UI should allow editing during recording.
```ts
let filteredText = text.replace(/\([^)]*\)/g, '').trim();
```
The parentheses removal regex replace(/\([^)]*\)/g, '') will remove any content in parentheses, which could inadvertently remove legitimate content if the transcription service includes parenthesized text. Consider being more specific about what patterns to filter, or document why all parenthesized content should be removed.
Suggested change:

```diff
-let filteredText = text.replace(/\([^)]*\)/g, '').trim();
+let filteredText = text.trim();
+// Remove leading/trailing parenthesized annotations (e.g. "(laughter)")
+// that some dictation providers include, but keep inline parentheses.
+filteredText = filteredText.replace(/^\([^)]*\)\s*|\s*\([^)]*\)$/g, '').trim();
```
```rust
}

// Transcribe the audio
let (_, transcriber) = transcriber_lock.as_mut().unwrap();
```
The Option inside the locked state is unwrapped with .unwrap(), which will panic if the transcriber has not been initialized; in a multi-threaded context that panic also poisons the mutex and causes cascading failures in other threads. Consider propagating the error gracefully instead.
Suggested change:

```diff
-let (_, transcriber) = transcriber_lock.as_mut().unwrap();
+let (_, transcriber) = transcriber_lock
+    .as_mut()
+    .ok_or_else(|| anyhow::anyhow!("Transcriber not initialized"))?;
```
```ts
setTimeout(() => {
  performSubmit(newValue);
}, 100);
```
The cleanup timeout is hardcoded to 100ms. If the user clicks the microphone button again within 100ms while transcription is processing, this could cause a race condition where performSubmit is called with stale data, or the UI state becomes inconsistent.
Suggested change:

```diff
-setTimeout(() => {
-  performSubmit(newValue);
-}, 100);
+performSubmit(newValue);
```
```ts
// Determine if we should show all models by default (if non-recommended models are downloaded)
useEffect(() => {
  if (models.length === 0) return;

  const hasDownloadedNonRecommended = models.some(
    (model) => model.downloaded && !model.recommended
  );

  if (hasDownloadedNonRecommended && !showAllModels) {
    setShowAllModels(true);
  }
}, [models]);
```
The showAllModels state defaults to false but is set to true if non-recommended models are downloaded. However, this effect runs every time models array changes. If a user manually collapses the list after it auto-expands, it will auto-expand again on the next models refresh. Consider using a separate "hasAutoExpanded" flag or checking if showAllModels was user-set to avoid overriding user preference.
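A minimal sketch of the one-shot flag approach, reusing the state names from the snippet above (the ref name is illustrative):

```tsx
import { useEffect, useRef } from 'react';

// Hypothetical one-shot flag: auto-expand at most once, then respect the user's choice.
const hasAutoExpandedRef = useRef(false);

useEffect(() => {
  if (models.length === 0 || hasAutoExpandedRef.current) return;

  const hasDownloadedNonRecommended = models.some(
    (model) => model.downloaded && !model.recommended
  );

  if (hasDownloadedNonRecommended) {
    setShowAllModels(true);
    hasAutoExpandedRef.current = true; // never auto-expand again this mount
  }
}, [models]);
```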
```ts
const pollDownloadProgress = (modelId: string) => {
  const interval = setInterval(async () => {
    try {
      const response = await getDownloadProgress({ path: { model_id: modelId } });
      if (response.data) {
        const progress = response.data;
        setDownloads((prev) => new Map(prev).set(modelId, progress));

        if (progress.status === 'completed') {
          clearInterval(interval);
          await loadModels(); // Refresh model list
          // Backend auto-selects, but also update frontend state
          await loadSelectedModel();
        } else if (progress.status === 'failed') {
          clearInterval(interval);
          await loadModels();
        }
      } else {
        clearInterval(interval);
      }
    } catch (error) {
      clearInterval(interval);
    }
  }, 500);
};
```
The pollDownloadProgress function sets up an interval but never clears it in case of exception during error handling. If getDownloadProgress throws an exception at line 95, the catch block at line 113 clears the interval, but if the exception occurs elsewhere (e.g., loadModels or loadSelectedModel), the interval keeps running. Consider wrapping the entire try block content or using finally.
```ts
const mimeType = supportedTypes.find((type) => MediaRecorder.isTypeSupported(type)) || '';
const ctx = new AudioContext({ sampleRate: SAMPLE_RATE });
audioContextRef.current = ctx;
```
The RMS_THRESHOLD constant is defined with a specific comment explaining it's empirically determined for 16kHz mono input, but there's no validation that the audio context is actually using 16kHz. If the browser or system provides audio at a different sample rate, the threshold may be inaccurate, leading to incorrect speech detection. Consider validating that ctx.sampleRate matches SAMPLE_RATE or adjusting the threshold dynamically.
Suggested change:

```ts
// Validate that the actual sample rate matches our assumptions.
if (ctx.sampleRate !== SAMPLE_RATE) {
  await ctx.close();
  audioContextRef.current = null;
  streamRef.current?.getTracks().forEach((t) => t.stop());
  streamRef.current = null;
  onError(
    `Browser audio sample rate ${ctx.sampleRate} Hz is not supported; expected ${SAMPLE_RATE} Hz for dictation.`,
  );
  return;
}
```
crates/goose/Cargo.toml (outdated)
```toml
# Platform-specific GPU acceleration for Whisper
[target.'cfg(target_os = "macos")'.dependencies]
candle-core = { version = "0.8.4", features = ["metal"] }
candle-nn = { version = "0.8.4", features = ["metal"] }

[target.'cfg(not(target_os = "macos"))'.dependencies]
candle-core = { version = "0.8.4", features = ["cuda"] }
candle-nn = { version = "0.8.4", features = ["cuda"] }
```
The candle-core and candle-nn dependencies are duplicated - they're specified both in the main dependencies section (lines 96-98) without features, and again in platform-specific sections with metal/cuda features (lines 123-128). This creates a conflict where the platform-specific features won't be applied because the main dependencies take precedence. Remove the entries from the main dependencies section and keep only the platform-specific ones.
Suggested change (keep only the platform-specific entries):

```toml
# Platform-specific GPU acceleration for Whisper
[target.'cfg(target_os = "macos")'.dependencies]
candle-core = { version = "0.8.4", features = ["metal"] }
candle-nn = { version = "0.8.4", features = ["metal"] }

[target.'cfg(not(target_os = "macos"))'.dependencies]
candle-core = { version = "0.8.4", features = ["cuda"] }
candle-nn = { version = "0.8.4", features = ["cuda"] }
```
```rust
    Ok(text)
})
.await
.context("Transcription task failed")?
```
The LOCAL_TRANSCRIBER is a global mutable static with a Mutex. If transcription fails (e.g., due to corrupted model file or OOM), the transcriber remains in the mutex and subsequent requests will keep failing until the server restarts. Consider adding error recovery logic that clears the cached transcriber on failure so the next request can attempt to reload it.
Suggested change:

```diff
-.context("Transcription task failed")?
+.context("Transcription task failed")
+.and_then(|result| {
+    if result.is_err() {
+        if let Ok(mut guard) = LOCAL_TRANSCRIBER.lock() {
+            *guard = None;
+        }
+    }
+    result
+})
```
```ts
const cleanedText = shouldAutoSubmit
  ? filteredText.replace(/\bsubmit[.,!?;'"\s]*$/i, '').trim()
  : filteredText;
```
The regex removes the word "submit" and surrounding punctuation, but it doesn't handle multiple consecutive spaces that could result from the removal. For example, "please do this submit" would become "please do this " with double spaces. Consider trimming or normalizing whitespace after removal.
Suggested change:

```diff
-const cleanedText = shouldAutoSubmit
-  ? filteredText.replace(/\bsubmit[.,!?;'"\s]*$/i, '').trim()
-  : filteredText;
+const cleanedTextRaw = shouldAutoSubmit
+  ? filteredText.replace(/\bsubmit[.,!?;'"\s]*$/i, '').trim()
+  : filteredText;
+const cleanedText = cleanedTextRaw.replace(/\s+/g, ' ').trim();
```
```rust
// Initialize logging
tracing_subscriber::fmt::init();

let audio_path = "/tmp/whisper_audio_16k.wav";
```
The example hardcodes the audio file path to "/tmp/whisper_audio_16k.wav" which won't exist for most users trying to run this example. Consider either: 1) Using std::env::args() to accept a file path argument, 2) Generating a simple test audio file, or 3) Adding a comment explaining where to get a test file. This makes the example more useful for testing.
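A minimal sketch of option 1, keeping the hardcoded path as a default (assumes it sits inside the example's main function):

```rust
// Accept the audio path as an optional CLI argument, falling back to the
// previous hardcoded default so existing invocations still work.
let audio_path = std::env::args()
    .nth(1)
    .unwrap_or_else(|| "/tmp/whisper_audio_16k.wav".to_string());
println!("Transcribing {audio_path} (pass a .wav path as the first argument to override)");
```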
```rust
pub mod download_manager;
pub mod providers;
pub mod whisper;
```
The entire dictation module lacks test coverage. Given that the codebase has comprehensive testing (with tests in most other modules), this new dictation functionality should have tests for: 1) Model configuration and loading, 2) Download manager functionality, 3) Provider transcription flows, 4) Error handling paths. This is especially important for the complex audio processing and model management code.
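As a seed for that, a hedged sketch of one such test; `DownloadManager::default()` is an assumption, not the PR's verified constructor, so adjust to the real API:

```rust
#[cfg(test)]
mod tests {
    use super::*;

    // Relies on the cancel_download signature quoted earlier in this thread;
    // cancelling a download that was never started should be a normal error.
    #[test]
    fn cancel_unknown_download_is_an_error() {
        let manager = DownloadManager::default();
        assert!(manager.cancel_download("no-such-model").is_err());
    }
}
```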
Random commentary: Living for this reference, actually stoked for local whisper support!!
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
```ts
let filteredText = text.replace(/\([^)]*\)/g, '').trim();

if (!filteredText) {
  return;
}

const shouldAutoSubmit = /\bsubmit[.,!?;'"\s]*$/i.test(filteredText);

const cleanedText = shouldAutoSubmit
  ? filteredText.replace(/\bsubmit[.,!?;'"\s]*$/i, '').trim()
```
The regex patterns /\([^)]*\)/g and /\bsubmit[.,!?;'"\s]*$/i could be compiled once outside the callback to avoid recompiling on every transcription. Move these to module-level constants for better performance.
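A minimal sketch of that hoisting (the constant names are illustrative):

```ts
// Hypothetical module-level constants so the patterns are compiled once
// instead of on every transcription callback.
// Note: the /g regex is only used with .replace() here, so its shared
// lastIndex state is not an issue; avoid reusing it with .test()/.exec().
const PAREN_ANNOTATION_RE = /\([^)]*\)/g;
const AUTO_SUBMIT_RE = /\bsubmit[.,!?;'"\s]*$/i;

// Inside the callback:
// const filteredText = text.replace(PAREN_ANNOTATION_RE, '').trim();
// const shouldAutoSubmit = AUTO_SUBMIT_RE.test(filteredText);
```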
```ts
const stopRecording = useCallback(() => {
  if (isSpeakingRef.current && samplesRef.current.length > 0) {
    flushRef.current();
  }
  isSpeakingRef.current = false;
  silenceStartRef.current = 0;
```
The silenceStartRef is set to 0 on line 188 but this happens after potentially flushing on line 185. If flush() triggers a transcription that takes time, there's a race condition where isSpeakingRef could be set back to true before the flush completes. Consider the order of operations here.
```rust
PROVIDERS
    .iter()
    .find(|def| def.provider == provider)
    .unwrap() // Safe because all enum variants are in PROVIDERS
```
Using .unwrap() on line 88 is unsafe. If all enum variants aren't covered in PROVIDERS (though they should be), this will panic. Consider using .expect() with a descriptive message, or better yet, return a Result to handle this gracefully.
Suggested change:

```diff
-.unwrap() // Safe because all enum variants are in PROVIDERS
+.expect("DictationProvider missing from PROVIDERS; ensure all enum variants are listed")
```
```rust
let mut transcriber_lock = LOCAL_TRANSCRIBER
    .lock()
    .map_err(|e| anyhow::anyhow!("Failed to lock transcriber: {}", e))?;
```
The transcriber mutex could be poisoned if a panic occurs during transcription (line 149). The current error handling with .map_err() converts the poison error to anyhow but doesn't attempt recovery. Consider explicitly handling PoisonError to potentially clear the poisoned transcriber and retry initialization.
Suggested change:

```diff
-let mut transcriber_lock = LOCAL_TRANSCRIBER
-    .lock()
-    .map_err(|e| anyhow::anyhow!("Failed to lock transcriber: {}", e))?;
+let mut transcriber_lock = match LOCAL_TRANSCRIBER.lock() {
+    Ok(guard) => guard,
+    Err(poisoned) => {
+        // Recover from a poisoned mutex by clearing the cached transcriber
+        tracing::warn!("Transcriber mutex was poisoned; resetting cached transcriber");
+        let mut guard = poisoned.into_inner();
+        *guard = None;
+        guard
+    }
+};
```
```rust
pub fn local_path(&self) -> PathBuf {
    let filename = self.url.rsplit('/').next().unwrap_or("");
```
Note that rsplit('/').next() always yields Some (for a URL without '/', it returns the whole string), so the unwrap_or("") fallback on line 86 can never actually fire; a malformed URL silently becomes the filename instead. While Whisper model URLs are hardcoded and should always contain '/', consider falling back to self.id (e.g. .unwrap_or(self.id)) for a more predictable filename.
```rust
if let Some(tokenizer_json) = bundled_tokenizer {
    if let Some(parent) = tokenizer_path.parent() {
        std::fs::create_dir_all(parent)?;
    }
    std::fs::write(&tokenizer_path, tokenizer_json)?;
    return Tokenizer::from_file(tokenizer_path)
        .map_err(|e| anyhow::anyhow!("Failed to load tokenizer: {}", e));
}
```
The tokenizer file is written to disk every time it doesn't exist (lines 267-271), but if multiple requests come in simultaneously, this could lead to a race condition where multiple threads try to write the same file. Consider using std::fs::OpenOptions with create_new(true) to atomically check-and-create, or add a lock around this operation.
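A minimal sketch of the atomic check-and-create variant, assuming tokenizer_json is a &str of the bundled JSON and this sits in the existing function (which returns an anyhow::Result):

```rust
use std::fs::OpenOptions;
use std::io::{ErrorKind, Write};

// create_new(true) makes the existence check and file creation a single
// atomic step, so two concurrent callers can't both decide to write.
match OpenOptions::new()
    .write(true)
    .create_new(true)
    .open(&tokenizer_path)
{
    Ok(mut file) => file.write_all(tokenizer_json.as_bytes())?,
    Err(e) if e.kind() == ErrorKind::AlreadyExists => {
        // Another thread or process already wrote the tokenizer; read it below.
    }
    Err(e) => return Err(e.into()),
}
```

A reader could still observe a partially written file from the winning writer; writing to a temp file and renaming it into place would close that window too.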
```rust
const MODELS: &[WhisperModel] = &[
    WhisperModel {
        id: "tiny",
        size_mb: 40,
        url: "https://huggingface.co/oxide-lab/whisper-tiny-GGUF/resolve/main/model-tiny-q80.gguf",
        description: "Fastest, ~2-3x realtime on CPU (5-10x with GPU)",
    },
    WhisperModel {
        id: "base",
        size_mb: 78,
        url: "https://huggingface.co/oxide-lab/whisper-base-GGUF/resolve/main/whisper-base-q8_0.gguf",
        description: "Good balance, ~1.5-2x realtime on CPU (4-8x with GPU)",
    },
    WhisperModel {
        id: "small",
        size_mb: 247,
        url: "https://huggingface.co/oxide-lab/whisper-small-GGUF/resolve/main/whisper-small-q8_0.gguf",
        description: "High accuracy, ~0.8-1x realtime on CPU (3-5x with GPU)",
    },
    WhisperModel {
        id: "medium",
        size_mb: 777,
        url: "https://huggingface.co/oxide-lab/whisper-medium-GGUF/resolve/main/whisper-medium-q8_0.gguf",
        description: "Highest accuracy, ~0.5x realtime on CPU (2-4x with GPU)",
    },
];
```
The hardcoded HuggingFace URLs use oxide-lab as the organization. These models should be verified to exist and be maintained. If these are temporary or test models, consider adding a comment indicating the source and whether they're production-ready.
```ts
};

const deleteModel = async (modelId: string) => {
  if (!window.confirm('Delete this model? You can re-download it later.')) return;
```
The LocalModelManager component uses window.confirm() for delete confirmation on line 134. This is not ideal for a React application and doesn't match the UI patterns used elsewhere. Consider using a proper confirmation modal component for consistency with the rest of the application.
```rust
use crate::config::paths::Paths;

pub const LOCAL_WHISPER_MODEL_CONFIG_KEY: &str = "LOCAL_WHISPER_MODEL";
```
The constant LOCAL_WHISPER_MODEL_CONFIG_KEY is defined in both whisper.rs (line 11) and used in download_manager.rs (line 1) via import. However, in providers.rs it's also used (line 2). This creates tight coupling. Consider whether this should be moved to a shared config module to avoid circular dependencies and improve maintainability.
```diff
 let context_limit = provider.get_model_config().context_limit();

-let (current_tokens, token_source) = match session.total_tokens {
+let (current_tokens, _token_source) = match session.total_tokens {
```
The debug logging that showed token usage and compaction decisions was removed. While the guideline says "the codebase needs less logging, not more", this particular logging was useful for debugging context window management issues. The variable _token_source (line 191) is now unused; consider removing the assignment entirely if the logging is intentionally removed.
Co-authored-by: Douwe Osinga <douwe@squareup.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Harrison <hcstebbins@gmail.com>
Summary
Trans-atlantic plane wifi FTW.
This is based on an earlier restructuring of how we record audio, but besides that, it implements local Whisper support with model download & recommendation.
It also updates the client to detect pauses for pseudo-streaming, and to auto-submit if you end your sentence on, well, "submit".
Bonus: also adds Groq as a provider (faster than OpenAI).