All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
- Support for VAD filter when using faster-whisper
- Implemented retry when using Apertium translation API
- Better ffmpeg error logging and more robust execution
- Ability to add original and dubbed subtitles as streams in the output video
- Reduced the number of warnings from 3rd party libraries
- Switched to ffmpeg to adjust the audio speed, since pydub did not work in some cases
- Standardized the naming of some CLI parameters
- Retry if Edge TTS fails to provide the synthesis, which happens occasionally
- Improved 'update' command, which now updates the utterances file and checks for the files needed
- Remove empty blocks of dubbed audio that do not contain text
- Speed calculation: for the last block, do not increase the speed if it is not needed
- Support for manually post-editing the automatic dubbing (--update)
- Support for Whisper large-v2 model (better than v3 for some languages)
- No need to merge back audio that has not been dubbed
- If a file merge fails, do not fail the whole batch
- Support for any TTS that implements an API contract (allows you to use your own TTS)
- Exit codes that let external callers determine why open-dubbing is exiting (see exit_code.py)
- Allow passing the selected device to an external TTS activated from the CLI
- Coqui as an optional dependency
- Allow selecting the logging level
- Updated dependencies
- Support for any TTS which can be invoked from the command line
- Support for building on Windows. Tests pass.
- Allow defining a region for the target language (like ES-MX), used for TTS
- Only speed up audio when it is really needed, improving the quality of the final audio synthesis
- Support for Apertium API as translation engine
- Allow selecting between different model sizes for the NLLB translation engine
- Allow selecting between different model sizes for the Whisper speech-to-text engine
- Check if ffmpeg is installed and report if it is not
- Autodetect language using Whisper if source language is not specified
- Use Edge TTS native speed parameter when the speed needs to be increased
- Better performance when separating vocals
- Support for Microsoft Edge TTS
- Gender classifier to identify gender in the original video and produce synthetic voices in the target language that match the gender
- Initial version
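
Several entries above mention retrying when a remote service (the Apertium translation API, Edge TTS) fails intermittently. As a rough illustration of that pattern only, here is a minimal sketch of a retry helper. The function name `retry`, its parameters, and the linear backoff are assumptions made for this example and are not taken from the open-dubbing source.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `attempts` calls have failed.

    Hypothetical helper: the name, defaults, and backoff policy are
    illustrative assumptions, not the project's actual implementation.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            # Re-raise the original error once the last attempt has failed.
            if attempt == attempts:
                raise
            # Linear backoff: wait a little longer after each failure.
            sleep(delay_seconds * attempt)
    raise AssertionError("unreachable")
```

A caller would wrap the flaky call in a zero-argument function, for example `retry(lambda: synthesize(text))`, where `synthesize` stands in for whatever request may fail. Passing `sleep` as a parameter keeps the helper easy to test without real delays.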