This project provides two standalone tools built on OpenAI's models: a Video Subtitler and a Text-to-Speech Generator. The Video Subtitler extracts audio from video files and transcribes it with OpenAI's Whisper model; the Text-to-Speech Generator converts text files into speech with OpenAI's TTS model. Both tools are distributed as .exe files for easy use.
- Transcribes audio and video files using OpenAI's Whisper speech-to-text model.
- Supports video formats like .mp4, .mov, .mpeg, and more.
- Extracts audio from video files and splits large audio files for optimal transcription.
- Offers transcription correction using GPT for refined and accurate text output.
- Customizable with language and prompt settings.
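
The extract-and-split step can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the helper names `chunk_offsets` and `ffmpeg_extract_cmd`, the fixed 10-minute chunk length, and the MP3 output are all assumptions. The motivation for splitting is that the hosted transcription endpoint caps the size of each upload (25 MB at the time of writing), so long recordings have to be sent in pieces.

```python
from __future__ import annotations

import math


def chunk_offsets(duration_s: float, chunk_s: float = 600.0) -> list[tuple[float, float]]:
    """Return (start, length) pairs in seconds that cover the whole recording.

    chunk_s is a hypothetical default; the real tool may split differently.
    """
    if duration_s <= 0 or chunk_s <= 0:
        return []
    count = math.ceil(duration_s / chunk_s)
    return [
        (i * chunk_s, min(chunk_s, duration_s - i * chunk_s))
        for i in range(count)
    ]


def ffmpeg_extract_cmd(src: str, dst: str, start: float, length: float) -> list[str]:
    """Build an FFmpeg command line that writes one audio-only MP3 chunk."""
    return [
        "ffmpeg", "-y",            # overwrite the output file if it exists
        "-ss", f"{start:.3f}",     # seek to the start of the chunk
        "-t", f"{length:.3f}",     # read only this many seconds
        "-i", src,
        "-vn",                     # drop the video stream, keep audio only
        "-acodec", "libmp3lame",   # encode the audio as MP3
        dst,
    ]
```

Each command list can be passed to `subprocess.run(cmd, check=True)` once FFmpeg is installed; building the commands separately from running them keeps the splitting logic easy to check on its own.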
- Converts text files into speech using OpenAI's TTS model.
- Customizable voice selection and speed settings for more control over the output.
- Outputs audio as .mp3 files.
FFmpeg is required to handle video and audio processing. Install FFmpeg based on your operating system:
- Windows Option 1: Install using winget (recommended):
  - Open Command Prompt as Administrator and run:

        winget install ffmpeg

- Windows Option 2: Download from the official website:
  - Download the latest build from the official website.
  - Extract the ZIP file to a location on your computer, e.g., C:\ffmpeg.
  - Add the bin directory of ffmpeg to your system's PATH.
- macOS: Install via Homebrew:

      brew install ffmpeg

- Linux: Install via APT:

      sudo apt-get install ffmpeg
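
After installing, it can help to confirm that FFmpeg is actually reachable from PATH. The following is a small sketch using only the Python standard library; the function names `find_ffmpeg` and `ffmpeg_version_line` are made up for this example and are not part of the project.

```python
from __future__ import annotations

import shutil
import subprocess


def find_ffmpeg() -> str | None:
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


def ffmpeg_version_line(executable: str) -> str:
    """Return the first line printed by `ffmpeg -version`."""
    result = subprocess.run(
        [executable, "-version"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if ffmpeg exits with an error
    )
    return result.stdout.splitlines()[0]
```

If `find_ffmpeg()` returns `None`, the PATH change from the installation step has probably not taken effect yet; opening a new terminal window usually picks it up.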
- Obtain an OpenAI API key from OpenAI's platform.
- Set this API key in the config.yaml file as described in the configuration section.
- Download the ZIP:
  - Go to the Releases page on GitHub and download the latest .zip file.
  - Extract the contents to a directory on your system.
- Set Up Configuration:
  - Rename the config.example.yaml file to config.yaml and set up the configuration parameters.
  - In most cases you only need to set the openai.api_key parameter to your OpenAI API key.
  - The prompt parameter (stt_prompt in the example below) can be used to teach Video Subtitler about special words or phrases (trademarks, unusual names) that may appear in your video.
  - Example config.yaml:

        openai:
          api_key: sk-XXX
          stt_model: whisper-1
          tts_model: tts-1
          completions_model: gpt-4o
          temperature: 0
        default:
          language: EN
          stt_prompt: PhraseVault, Video Subtitler
          tts_voice: echo
          tts_speed: 1
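
A quick sanity check of config.yaml before the first run might look like the sketch below. It is illustrative only: `load_config` and `config_problems` are hypothetical helpers, the checks are not the tool's own validation, and loading the file assumes the third-party PyYAML package is installed.

```python
from __future__ import annotations


def load_config(path: str = "config.yaml") -> dict:
    """Parse the YAML file into a dict (assumes PyYAML: pip install pyyaml)."""
    import yaml  # third-party; imported here so config_problems works without it

    with open(path, encoding="utf-8") as fh:
        return yaml.safe_load(fh) or {}


def config_problems(config: dict) -> list[str]:
    """Return human-readable problems found in a parsed config dict."""
    problems = []
    openai_section = config.get("openai") or {}
    defaults = config.get("default") or {}

    api_key = str(openai_section.get("api_key", "")).strip()
    if not api_key or api_key == "sk-XXX":
        problems.append("openai.api_key is missing or still the placeholder")

    speed = defaults.get("tts_speed", 1)
    if isinstance(speed, bool) or not isinstance(speed, (int, float)) or speed <= 0:
        problems.append("default.tts_speed must be a positive number")

    return problems
```

Calling `config_problems(load_config())` from the extracted folder returns an empty list when the API key has been filled in and the speed setting is a positive number.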
- Clone the Repository:
  - Clone the repository to your local machine:

        git clone https://github.com/ptmrio/video-subtitler.git
        cd video-subtitler

- Set Up Configuration:
  - See step 2 in the Windows installation instructions.
- Install Dependencies:
  - Install the required Python packages:

        pip install -r requirements.txt

- Run the Application:
  - Run the application using Python:

        python video_subtitler.py

    or

        python text_to_speech.py
- Navigate to the downloaded and extracted folder and run the text-to-speech.exe file.
- Enter the path to your .txt file, customize the voice and speed (if necessary), and click Generate Speech.
- The application will convert the text into speech and save the result as a .tts.mp3 file.
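
For readers who want to script the same conversion, the steps above correspond roughly to the following sketch built on the official `openai` Python package. It is not the application's own code: the function names are hypothetical, and the assumption that notes.txt becomes notes.tts.mp3 in the same folder is inferred from the .tts.mp3 suffix mentioned above.

```python
from __future__ import annotations

from pathlib import Path


def tts_output_path(text_file: str) -> Path:
    """Map e.g. notes.txt to notes.tts.mp3 in the same folder (assumed naming)."""
    p = Path(text_file)
    return p.with_name(p.stem + ".tts.mp3")


def synthesize(
    text_file: str,
    voice: str = "echo",
    speed: float = 1.0,
    model: str = "tts-1",
    api_key: str | None = None,
) -> Path:
    """Send the text file to the speech endpoint and save the MP3 result."""
    from openai import OpenAI  # third-party: pip install openai

    # With api_key=None the client falls back to the OPENAI_API_KEY variable.
    client = OpenAI(api_key=api_key)
    text = Path(text_file).read_text(encoding="utf-8")
    out = tts_output_path(text_file)
    with client.audio.speech.with_streaming_response.create(
        model=model, voice=voice, input=text, speed=speed
    ) as response:
        response.stream_to_file(out)
    return out
```

The defaults mirror the tts_voice, tts_speed, and tts_model values from the example config.yaml.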
- Navigate to the downloaded and extracted folder and run the video-subtitler.exe file.
- Provide the path to your audio or video file and configure optional settings such as language or custom prompts.
- Click Transcribe to begin transcription. The resulting transcription will be saved as a .transcription.txt file.
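
The transcription step can be approximated in a script as shown below. This is a sketch under stated assumptions rather than the application's implementation: the helper names are invented, the output naming (talk.mp4 to talk.transcription.txt) is inferred from the suffix above, and audio extraction, splitting, and GPT correction are left out.

```python
from __future__ import annotations

from pathlib import Path


def transcription_output_path(media_file: str) -> Path:
    """Map e.g. talk.mp4 to talk.transcription.txt in the same folder (assumed naming)."""
    p = Path(media_file)
    return p.with_name(p.stem + ".transcription.txt")


def transcribe(
    media_file: str,
    language: str = "EN",
    prompt: str = "",
    model: str = "whisper-1",
    api_key: str | None = None,
) -> Path:
    """Send one audio file to the transcription endpoint and save the text."""
    from openai import OpenAI  # third-party: pip install openai

    client = OpenAI(api_key=api_key)  # None falls back to OPENAI_API_KEY
    # The endpoint expects a lowercase ISO-639-1 code such as "en".
    kwargs = {"model": model, "language": language.lower()}
    if prompt:
        kwargs["prompt"] = prompt  # hints for names and unusual words
    with open(media_file, "rb") as fh:
        result = client.audio.transcriptions.create(file=fh, **kwargs)
    out = transcription_output_path(media_file)
    out.write_text(result.text, encoding="utf-8")
    return out
```

The `prompt` argument plays the role of stt_prompt in the example config.yaml, for instance `"PhraseVault, Video Subtitler"`.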
This project is licensed under the MIT License. See the LICENSE file for more details.
If you find this project useful, consider donating to support its development.
Thank you for your support!