The phoneme dictionary was extended. A VITS model trained on the "Hokuspokus Clean" speaker data was added.
A multispeaker model by NVIDIA was added (https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/tts_de_fastpitch_multispeaker_5).
Text-to-Speech inferencing webservice based on Tacotron 2 and Multi-Band MelGAN, trained on the HUI-Audio-Corpus-German and evaluated in "Neural Speech Synthesis in German". Try it out at http://narvi.sysint.iisys.de/projects/tts.

Requirements:
- Linux-based OS (Ubuntu 18+, Debian 9, CentOS 7)
- libfreetype6-dev
- pkg-config
- Python >= 3.8
- python3-dev (for your respective version)
- libsndfile
- sox/ffmpeg
PyTorch may need to be installed separately (see https://pytorch.org/get-started/locally/)
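On a Debian/Ubuntu system, installing the system-level requirements might look like the following sketch (the package names are the usual Debian ones and may differ on CentOS or other distributions):

```bash
# System packages (Debian/Ubuntu names; adjust for your distribution)
sudo apt-get update
sudo apt-get install -y libfreetype6-dev pkg-config \
    python3-dev libsndfile1 sox ffmpeg
```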
Preparation: Create a virtual environment and install the requirements. Then open a Python interpreter session in that virtual environment and run (a non-interactive equivalent is sketched after this list):
- import nltk
- nltk.download('punkt')
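Taken together, the preparation steps might look like this sketch. It assumes the repository ships a requirements.txt file; the plain `pip install torch` line is only the default variant and may need to be replaced with the platform-specific command from pytorch.org:

```bash
# Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate

# Install PyTorch (see https://pytorch.org/get-started/locally/ for
# the command matching your platform/CUDA version), then the requirements
pip install torch
pip install -r requirements.txt

# Non-interactive equivalent of the interpreter session above
python3 -c "import nltk; nltk.download('punkt')"
```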
Before the TTS models can be used, download them from https://opendata.iisys.de/systemintegration/Models/speakers.tar.gz and extract them to tts_inferencer/speakers.
Before the STT models can be used, download them from https://opendata.iisys.de/systemintegration/Models/asr_models.zip and extract them to asr_inferencer/models.
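For example, using wget and the standard extraction tools (the archive layout is not documented here, so check whether the archives already contain the target directory as a top-level folder before extracting):

```bash
# TTS models
wget https://opendata.iisys.de/systemintegration/Models/speakers.tar.gz
tar -xzf speakers.tar.gz -C tts_inferencer/speakers

# STT models
wget https://opendata.iisys.de/systemintegration/Models/asr_models.zip
unzip asr_models.zip -d asr_inferencer/models
```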
To start the server in debug mode, run "python3 app.py". Then access it at http://127.0.0.1:5000.
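A quick smoke test, assuming the virtual environment is active (the service's actual endpoints are not documented here, so the request below only fetches the root page):

```bash
# Start the debug server in the background, then check that it responds
python3 app.py &
curl http://127.0.0.1:5000/
```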
Further Notes:
If symbolic links for tacotron2 models are broken, recreate them using "ln -s <checkpoint.pth> train.loss.best.pth" in the respective speakers/<speaker>/tacotron2 directories.
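A sketch that recreates the links for all speakers at once (the checkpoint file name checkpoint.pth is an assumption; substitute the actual checkpoint present in each directory):

```bash
# Relink train.loss.best.pth in every speaker's tacotron2 directory
for dir in tts_inferencer/speakers/*/tacotron2; do
    (cd "$dir" && ln -sf checkpoint.pth train.loss.best.pth)
done
```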
Keep in mind that this service does not include number normalization yet, so do not input any digits; spell numbers out instead (2 -> zwei).
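Until normalization is built in, numbers can be spelled out on the client side, for example with the num2words package (an extra dependency, not part of this service):

```bash
pip install num2words
# Convert a digit to its German spelling before sending it to the service
python3 -c "from num2words import num2words; print(num2words(2, lang='de'))"   # -> zwei
```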
The incorporated ASR model was taken from https://github.com/AASHISHAG/deepspeech-german; check out their work: https://www.researchgate.net/publication/336532830_German_End-to-end_Speech_Recognition_based_on_DeepSpeech.