Releases: pnfo/sinhala-tts-dataset
Single speaker training from v2.0 dataset, Model and config
After installing Coqui TTS, run the following command with the text to generate and the model and config files from this release.
To convert Sinhala text to roman letters for use in TTS generation, you can use the sinhalaToRomanConvert function from this package.
!tts --text "atha kho bhagavā tassa sattāhassa accayena tamhā samādhimhā vuṭṭhahitvā rattiyā ." \
--model_path checkpoint_80000.pth \
--config_path config.json \
--out_path generated-clip.wav
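The same call can also be made from a Python script rather than a notebook cell. The following is a minimal sketch, assuming the `tts` executable installed by Coqui TTS is on the PATH; the helper names `build_tts_command` and `synthesize` are hypothetical, and only the four flags shown in the command above are used.

```python
import shlex
import subprocess


def build_tts_command(text, model_path, config_path, out_path):
    """Return the argument list for the `tts` command-line call.

    Only the flags from the release notes are used here.
    """
    return [
        "tts",
        "--text", text,
        "--model_path", model_path,
        "--config_path", config_path,
        "--out_path", out_path,
    ]


def synthesize(text, model_path, config_path, out_path):
    """Run the `tts` CLI; raises CalledProcessError on a non-zero exit."""
    cmd = build_tts_command(text, model_path, config_path, out_path)
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this sketch works
    # even when the model files have not been downloaded yet.
    cmd = build_tts_command(
        "atha kho bhagavā",  # romanized input text
        "checkpoint_80000.pth",
        "config.json",
        "generated-clip.wav",
    )
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with the diacritics and spaces in the romanized text. Call `synthesize(...)` with the same arguments once the checkpoint and config files are in the working directory.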
13.8 hours dataset
- min length 1 second
- replace multiple dots with a single dot
- trim audio 0.2% and limit silence in the middle to at most 0.75
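The two text and length rules above can be illustrated with a short sketch. This is not the script used to build the dataset. It assumes that "multiple dots" means runs of consecutive dots, that clips of exactly 1 second are kept, and that entries are available as (text, duration in seconds) pairs. The function names and the entry layout are hypothetical, and audio trimming is not covered.

```python
import re

MIN_SECONDS = 1.0  # minimum clip length from the list above


def collapse_dots(text):
    """Replace any run of two or more consecutive dots with a single dot."""
    return re.sub(r"\.{2,}", ".", text)


def keep_clip(duration_seconds, min_seconds=MIN_SECONDS):
    """Keep clips at least `min_seconds` long (inclusive, by assumption)."""
    return duration_seconds >= min_seconds


def filter_entries(entries, min_seconds=MIN_SECONDS):
    """Clean the text and drop short clips.

    `entries` is an iterable of (text, duration_seconds) pairs,
    a hypothetical layout chosen for this sketch.
    """
    return [
        (collapse_dots(text), duration)
        for text, duration in entries
        if keep_clip(duration, min_seconds)
    ]
```

For the two-speaker dataset described in these notes, the same sketch would be called with `min_seconds=2.0`.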
13.7 hours multi-speaker dataset
Two speakers
- no trimming of silences
- min length 2 seconds
The trained model checkpoint (single speaker) is attached here along with the config file. This model was used for the demo samples. You can use it to generate speech by following the instructions here. The voice belongs to a respected Buddhist monk, and permission is granted only for non-obscene, non-offensive speech generation.
Removing the end silence and clipping middle silences to 0.75
v1.2 small change