A maubot plugin to transcribe audio messages in Matrix rooms using local open-source libraries.
- FFmpeg must be in `$PATH`
- Activate the maubot virtual environment (`source ./bin/activate`), and run
- Download `maulocalstt` from the releases (or download the repository and build it with `mbc build`), and upload it to maubot.
- Download a model for your backend:
  - For whisper, download a model from https://huggingface.co/ggerganov/whisper.cpp and place it under `models/whispercpp`
  - For vosk, download a zipped model from https://alphacephei.com/vosk/models and unpack it into `models/vosk`
- Create an instance of the bot, and update the configuration:
  - For whisper, specify
    - `model_name` - the name of the model you downloaded (the name of the file without the `ggml-` prefix and the `.bin` extension)
    - `language` - the language the audio will be in (you can set it to `auto` for whisper to auto-detect the language)
    - `translate` - whether whisper should translate the transcription to English (`true` or `false`)
  - For vosk, specify
    - `model_path` - the path to the top directory of the model you downloaded (the one with the folders `am`, `conf`, `graph`, etc.), either absolute or relative to maubot's working directory.
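
Before creating the instance, it can help to sanity-check the setup described above. The following is a minimal sketch using only the Python standard library. The function names (`ffmpeg_available`, `whisper_model_name`, `looks_like_vosk_model`) and the example paths are hypothetical helpers for illustration, not part of maulocalstt or maubot. It assumes a whisper.cpp model file is named `ggml-<model_name>.bin` and that a vosk model directory contains at least the `am`, `conf` and `graph` folders.

```python
import shutil
from pathlib import Path


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable can be found on $PATH."""
    return shutil.which("ffmpeg") is not None


def whisper_model_name(model_file: str) -> str:
    """Derive the `model_name` config value from a whisper.cpp model file.

    Strips the `ggml-` prefix and the `.bin` extension from the file name.
    """
    name = Path(model_file).name
    if not (name.startswith("ggml-") and name.endswith(".bin")):
        raise ValueError(f"unexpected whisper model file name: {name}")
    return name[len("ggml-"):-len(".bin")]


def looks_like_vosk_model(model_path: str) -> bool:
    """Check that `model_path` looks like the top directory of a vosk model.

    Only the three folders named in the instructions are checked.
    """
    root = Path(model_path)
    return all((root / sub).is_dir() for sub in ("am", "conf", "graph"))


if __name__ == "__main__":
    # Hypothetical example paths; adjust them to the files you downloaded.
    print("ffmpeg on PATH:", ffmpeg_available())
    print("model_name:", whisper_model_name("models/whispercpp/ggml-base.bin"))
    print("vosk model ok:", looks_like_vosk_model("models/vosk/my-model"))
```

If `looks_like_vosk_model` returns `False` for the directory you unpacked into, `model_path` probably needs to point one level deeper, at the folder that actually contains `am`, `conf` and `graph`.
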
Simply invite the bot to a room, and it will reply to all audio messages with their transcription.