
faster-whisper 1.1.0

@MahmoudAshraf97 released this on 21 Nov 17:19

New Features

  • New batched inference that is 4x faster while remaining accurate; refer to the README for usage instructions (a usage sketch also follows this list).
  • Support for the new large-v3-turbo model.
  • VAD filter is now 3x faster on CPU.
  • Feature Extraction is now 3x faster.
  • Added log_progress to WhisperModel.transcribe to print transcription progress.
  • Added a multilingual option to transcription to allow transcribing multilingual audio. Note that large models already have code-switching capabilities, so this is mostly beneficial for medium models or smaller.
  • WhisperModel.detect_language now has the option to use the VAD filter, and language detection is improved via the language_detection_segments and language_detection_threshold parameters (see the second sketch after this list).

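As a quick reference, here is a minimal sketch of the batched API following the pattern described in the README; the model name, device settings, and batch size are illustrative choices, not requirements.

```python
from faster_whisper import BatchedInferencePipeline, WhisperModel

# Load a regular model, then wrap it in the batched pipeline.
model = WhisperModel("large-v3-turbo", device="cuda", compute_type="float16")
batched_model = BatchedInferencePipeline(model=model)

# batch_size controls how many audio chunks are decoded in parallel.
segments, info = batched_model.transcribe("audio.mp3", batch_size=16)
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```
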
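The new transcription and language-detection options can be exercised roughly as follows; the parameter names come from this release, while the values passed to them (model size, audio path, threshold, segment count) are illustrative assumptions.

```python
from faster_whisper import WhisperModel, decode_audio

model = WhisperModel("medium")

# log_progress prints transcription progress; multilingual enables transcribing
# audio that switches languages (mainly useful for medium models or smaller).
segments, info = model.transcribe("audio.mp3", log_progress=True, multilingual=True)

# Improved language detection: optionally apply the VAD filter and inspect
# several segments before committing to a language.
audio = decode_audio("audio.mp3", sampling_rate=16000)
result = model.detect_language(
    audio,
    vad_filter=True,
    language_detection_segments=4,
    language_detection_threshold=0.5,
)
print(result)
```
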
Bug Fixes

  • Use correct features padding for encoder input when chunk_length < 30s
  • Use correct seek value in output

Other Changes

  • Replaced NamedTuple with dataclass in Word, Segment, TranscriptionOptions, TranscriptionInfo, and VadOptions; this allows conversion to JSON without nesting (see the sketch after this list). Note that the _asdict() method is still available on the Word and Segment classes for backward compatibility, but it will be removed in the next release; use dataclasses.asdict() instead.
  • Added new tests for development
  • Updated benchmarks in the README
  • Use jiwer instead of evaluate in benchmarks
  • Filter out non_speech_tokens in suppressed tokens by @jordimas in #898
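
Since Word and Segment are now dataclasses, serializing transcription output no longer needs the deprecated _asdict() helper. A minimal sketch, with the model size and audio path as placeholders:

```python
import dataclasses
import json

from faster_whisper import WhisperModel

model = WhisperModel("tiny")
segments, info = model.transcribe("audio.mp3")

# dataclasses.asdict() recursively converts each Segment (and any nested Word
# objects) into plain dicts that can be dumped to JSON directly.
results = [dataclasses.asdict(segment) for segment in segments]
print(json.dumps(results, indent=2))
```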

New Contributors

Full Changelog: v1.0.3...v1.1.0