Automatically transcribe and summarize lecture recordings entirely on-device, using Whisper for transcription and a local Ollama model for summarization.
Install Ollama.
Create a Python virtual environment:

```sh
python3 -m venv .venv
```

Activate the virtual environment:

```sh
source .venv/bin/activate
```

Install dependencies:

```sh
pip install -r requirements.txt
```
Edit `lecsum.yaml`:

| Field | Default Value | Possible Values | Description |
|---|---|---|---|
| `whisper_model` | `"base.en"` | Whisper model name | Specifies which Whisper model to use for transcription |
| `ollama_model` | `"llama3.1:8b"` | Ollama model name | Specifies which Ollama model to use for summarization |
| `prompt` | `"Summarize: "` | Any string | Instructs the large language model during the summarization step |
Run the Ollama server:

```sh
ollama serve
```
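
If the configured Ollama model has not been downloaded yet, pull it first (shown here with the default from the table above):

```sh
ollama pull llama3.1:8b
```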
In a new terminal, run:

```sh
./lecsum.py -c [CONFIG_FILE] [AUDIO_FILE]
```

Use any file format supported by Whisper (mp3, mp4, wav, webm, etc.).
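
Under the hood, the pipeline is transcribe-then-summarize. The sketch below illustrates the idea (it is not lecsum's actual code) using the `openai-whisper` and `ollama` Python packages together with the config fields from the table above:

```python
# Rough sketch of the transcribe-then-summarize pipeline; not lecsum's
# actual implementation. Assumes the openai-whisper, ollama, and PyYAML
# packages are installed and that `ollama serve` is running.
import sys

import ollama
import whisper
import yaml


def summarize_lecture(config_path: str, audio_path: str) -> str:
    with open(config_path) as f:
        config = yaml.safe_load(f)

    # Transcribe the recording locally with Whisper.
    model = whisper.load_model(config["whisper_model"])
    transcript = model.transcribe(audio_path)["text"]

    # Summarize the transcript with the local Ollama model,
    # prefixing the configured prompt.
    response = ollama.chat(
        model=config["ollama_model"],
        messages=[{"role": "user", "content": config["prompt"] + transcript}],
    )
    return response["message"]["content"]


if __name__ == "__main__":
    print(summarize_lecture(sys.argv[1], sys.argv[2]))
```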
To start the lecsum server in a development environment, run:

```sh
fastapi dev server.py
```
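
`fastapi dev` expects `server.py` to expose a FastAPI application object (conventionally named `app`). lecsum's actual endpoints are defined in `server.py`; the minimal shape it relies on looks like this, with the `/health` route being purely illustrative:

```python
# Minimal shape of a module that `fastapi dev server.py` can serve.
# The route below is illustrative only; see server.py for lecsum's real API.
from fastapi import FastAPI

app = FastAPI()


@app.get("/health")
def health() -> dict:
    # Simple liveness check.
    return {"status": "ok"}
```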