Audio to Text Transcription App

This small Python app uses OpenAI's Whisper model to transcribe audio files to text. It converts each file to mono, 16 kHz audio before sending it for transcription, then prints the resulting text. Various audio formats are supported.

Project Structure

├── README.md
├── assets
│   ├── harvard.wav
│   └── jackhammer.wav
└── audio-to-text.py

Note: A .env file is used to store the OpenAI API key and is not included in this repository for security reasons.

Requirements

Python 3.7+
OpenAI Python package
pydub
FFmpeg (required by pydub for audio processing)

Installation

Clone the repository:

git clone https://github.com/ivansing/audio-to-text-app.git
cd audio-to-text-app

Install dependencies:

pip install openai pydub python-dotenv

Install FFmpeg:
- On macOS (using Homebrew): brew install ffmpeg
- On Ubuntu: sudo apt install ffmpeg
- On Windows: Download and install from ffmpeg.org

Set up your OpenAI API key:
- Create a .env file at the root of the project and add your API key:
```
OPENAI_API_KEY=your-api-key-here
```
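As a minimal sketch of how a script might read this key, assuming python-dotenv is used to load the .env file (the helper names `get_api_key` and `load_api_key_from_dotenv` are hypothetical and not taken from audio-to-text.py):

```python
import os


def get_api_key(env=None):
    """Return the OpenAI API key from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # Fail early with a readable message instead of a later API error.
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
    return key


def load_api_key_from_dotenv():
    """Load a .env file into the environment, then return the key."""
    # Imported here so get_api_key() still works without python-dotenv installed.
    from dotenv import load_dotenv

    load_dotenv()  # copies entries from a discovered .env file into os.environ
    return get_api_key()
```

Calling `load_api_key_from_dotenv()` once at startup keeps the key out of the source code, which matches the reason the .env file is excluded from the repository.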

Usage

Add your audio file to the assets directory, or use one of the provided sample WAV files (e.g., jackhammer.wav).
Run the audio-to-text.py script to convert and transcribe your audio file:

python3 audio-to-text.py

The transcription will be printed to the console.

How it works

Convert Audio: The script first converts the input audio file to mono and resamples it to 16kHz using pydub.
Transcription: It then sends the processed audio to OpenAI's Whisper model for transcription.
Output: The transcribed text is printed.
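The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of audio-to-text.py: it assumes the openai package at version 1.0 or later and the hosted `whisper-1` model, and the function names and the `_16k_mono.wav` output naming are hypothetical.

```python
from pathlib import Path


def converted_path(input_path):
    """Hypothetical naming scheme for the converted copy of the input file."""
    p = Path(input_path)
    return p.with_name(p.stem + "_16k_mono.wav")


def convert_audio(input_path):
    """Convert the input to mono, 16 kHz WAV with pydub and return the new path."""
    # Imported lazily; pydub relies on FFmpeg to decode non-WAV formats.
    from pydub import AudioSegment

    audio = AudioSegment.from_file(str(input_path))
    audio = audio.set_channels(1).set_frame_rate(16000)
    out = converted_path(input_path)
    audio.export(str(out), format="wav")
    return out


def transcribe(audio_path):
    """Send the processed audio to the Whisper API and return the text."""
    # Assumes openai>=1.0; the client reads OPENAI_API_KEY from the environment.
    from openai import OpenAI

    client = OpenAI()
    with open(audio_path, "rb") as f:
        result = client.audio.transcriptions.create(model="whisper-1", file=f)
    return result.text
```

With the API key loaded, `print(transcribe(convert_audio("assets/harvard.wav")))` would run the whole pipeline on one of the sample files.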

License

This project is licensed under the MIT License.

Acknowledgements

OpenAI Whisper API
pydub
