feat: expose no-speech probability in segment #2654

sachaarbonel · 2024-12-21T11:49:06Z

This PR adds support for exposing no-speech probability at the segment level in Whisper transcriptions, which helps identify potential hallucinations and non-speech segments in the output.

Key Changes

Added no_speech_prob field to whisper_segment struct
Exposed segment-level no-speech probability through new API: whisper_full_get_segment_no_speech_prob
Added configurable no_speech_thold parameter (default: 0.6) via CLI arguments
Included no_speech_prob in JSON output for each segment

Why This Matters

Helps identify potential hallucinations by detecting segments with high no-speech probability
Enables filtering of non-speech segments (background noise, silence, etc.)
Provides more granular control over transcription quality
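
As a rough illustration of the filtering use case above, here is a minimal sketch in C. The `segment_info` struct and `keep_speech_segments` helper are hypothetical and not part of whisper.cpp; in real code the probability for each segment would be read with `whisper_full_get_segment_no_speech_prob`. The plain threshold comparison is an assumption about how a caller might use the value, not a description of the library's internal decoding logic.

```c
#include <stddef.h>

/* Hypothetical stand-in for one transcribed segment. In whisper.cpp the
 * probability would be obtained per segment via
 * whisper_full_get_segment_no_speech_prob; this struct is not library code. */
typedef struct {
    const char *text;
    float no_speech_prob;
} segment_info;

/* Copies every segment whose no-speech probability is at or below `thold`
 * into `out` (which must have room for `n` entries) and returns how many
 * segments were kept. Input order is preserved. */
size_t keep_speech_segments(const segment_info *in, size_t n,
                            float thold, segment_info *out) {
    size_t kept = 0;
    for (size_t i = 0; i < n; ++i) {
        if (in[i].no_speech_prob <= thold) {
            out[kept++] = in[i];
        }
    }
    return kept;
}
```

Called with the PR's default threshold of 0.6, a segment with probability 0.92 would be dropped while segments at 0.05 and 0.30 would be kept. A caller might prefer to flag such segments for review rather than discard them.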

* ggerganov/master: (49 commits)
  cli : add --suppress_nst support (ggerganov#2664)
  cli : add no_speech_thold (ggerganov#2663)
  cmake : remove hardcoded install rpath
  server : fix help print
  ruby : bug fix on callbacks and no_speech_prob (ggerganov#2656)
  server : add no-speech threshold parameter and functionality (ggerganov#2654)
  whisper : rename suppress_non_speech_tokens to suppress_nst (ggerganov#2653)
  server : add option to suppress non-speech tokens (ggerganov#2649)
  whisper : rename binaries + fix install (ggerganov#2648)
  ruby : update gem version to v1.3.1
  release : v1.7.3
  ci : msys enable SDL2 build (ggerganov#2635)
  ruby : sync ggml (ggerganov#2643)
  android : try to fix build
  files : remove old sources
  sync : ggml
  talk-llama : sync llama.cpp
  sync : ggml
  ggml : update ggml_backend_cpu_device_supports_op (llama/10867)
  vulkan: bugfixes for small subgroup size systems + llvmpipe test (llama/10809)
  ...

feat: add no-speech threshold parameter and functionality to server

2637b2e

sachaarbonel changed the title ~~feat: implement no_speech_prob~~ feat: add no_speech_prob to segment Dec 21, 2024

sachaarbonel changed the title ~~feat: add no_speech_prob to segment~~ feat: expose no-speech probability in segment Dec 21, 2024

ggerganov approved these changes Dec 21, 2024

ggerganov merged commit 4183517 into ggerganov:master Dec 21, 2024
42 checks passed

KitaitiMakoto mentioned this pull request Dec 21, 2024

ruby : bug fix on callbacks and no_speech_prob #2656

Merged
