
OpenTTS (es)

Unifies access to multiple open source text to speech systems and voices for many languages.

Supports a subset of SSML that can use multiple voices and text to speech systems!

Listen to voice samples

View source code

Settings

cache_dir
- Directory to cache generated WAV files
- Leave empty to disable (default, HA already has a TTS cache)
debug
- If true, DEBUG messages are printed to the log
- Default is false
larynx_quality
- Default quality setting for the Larynx TTS system
- Default is "high", choices are "high", "medium", "low" (use "low" for Raspberry Pi)
larynx_denoiser_strength
- Strength of the denoiser applied during Larynx TTS post-processing
- Default is 0.005 (higher value reduces noise, but distorts voice)
larynx_noise_scale
- Volatility of Larynx TTS vocalization
- Default is 0.667, range is 0-1. Higher values make the voice less monotone
larynx_length_scale
- Speed of Larynx TTS speech
- Default is 1.0, lower values are faster, higher values are slower
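
As a rough illustration of how the settings above fit together, the following sketch merges user overrides into the documented defaults and checks the documented constraints. The helper name `validate_options` is hypothetical and not part of the add-on, which may validate its options differently.

```python
# Defaults as documented in the settings list above.
DEFAULT_OPTIONS = {
    "cache_dir": "",  # empty string disables the WAV cache
    "debug": False,
    "larynx_quality": "high",
    "larynx_denoiser_strength": 0.005,
    "larynx_noise_scale": 0.667,
    "larynx_length_scale": 1.0,
}

QUALITY_LEVELS = ("high", "medium", "low")


def validate_options(overrides):
    """Merge overrides into the defaults and check the documented constraints.

    Hypothetical helper for illustration only.
    """
    unknown = set(overrides) - set(DEFAULT_OPTIONS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")

    options = {**DEFAULT_OPTIONS, **overrides}

    if options["larynx_quality"] not in QUALITY_LEVELS:
        raise ValueError(f"larynx_quality must be one of {QUALITY_LEVELS}")
    if not 0.0 <= options["larynx_noise_scale"] <= 1.0:
        raise ValueError("larynx_noise_scale must be in the range 0-1")
    return options


if __name__ == "__main__":
    # Example: settings suited to a Raspberry Pi.
    print(validate_options({"larynx_quality": "low"}))
```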

MaryTTS Compatible Endpoint

Use OpenTTS as a drop-in replacement for MaryTTS.

Add to your configuration.yaml file:

tts:
  - platform: marytts
    port: 5500
    voice: larynx:harvard

The voice format is <TTS_SYSTEM>:<VOICE_NAME>. Visit the OpenTTS web UI and copy/paste the "voice id" of your favorite voice here.

You may leave out the port setting if you configure the OpenTTS host port to be 59125 instead of 5500.

If your input text begins with a left angle bracket (<), it will be interpreted as SSML.
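
For testing outside Home Assistant, a request URL for the MaryTTS-compatible endpoint can be assembled as in the sketch below. It assumes the endpoint follows the usual MaryTTS convention of a `/process` path with `INPUT_TEXT` and `VOICE` query parameters, and that OpenTTS is reachable on `localhost`; check both against your installation. The function names are hypothetical.

```python
from urllib.parse import urlencode


def looks_like_ssml(text):
    """Return True if the text would be treated as SSML (begins with '<')."""
    return text.startswith("<")


def build_process_url(text, voice, host="localhost", port=5500):
    """Build a MaryTTS-style request URL.

    Assumptions: the '/process' path and the INPUT_TEXT / VOICE parameter
    names follow the MaryTTS convention; host and port are placeholders.
    """
    query = urlencode({"INPUT_TEXT": text, "VOICE": voice})
    return f"http://{host}:{port}/process?{query}"


if __name__ == "__main__":
    print(build_process_url("Hello world", "larynx:harvard"))
    print(looks_like_ssml("<speak>Hello</speak>"))
```

Fetching the resulting URL (for example with `urllib.request.urlopen`) should return audio data if the assumptions above hold.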

SSML

A subset of SSML is supported:

<speak> - wrap around SSML text
- lang - set language for document
<s> - sentence (disables automatic sentence breaking)
- lang - set language for sentence
<w> / <token> - word (disables automatic tokenization)
<voice name="..."> - set voice of inner text
- voice - name or language of voice
  - Name format is tts:voice (e.g., "glow-speak:en-us_mary_ann") or tts:voice#speaker_id (e.g., "coqui-tts:en_vctk#p228")
  - If one of the supported languages, a preferred voice is used (override with --preferred-voice <lang> <voice>)
<say-as interpret-as=""> - force interpretation of inner text
- interpret-as - one of "spell-out", "date", "number", "time", or "currency"
- format - way to format text depending on interpret-as
  - number - one of "cardinal", "ordinal", "digits", "year"
  - date - string with "d" (cardinal day), "o" (ordinal day), "m" (month), or "y" (year)
<break time=""> - Pause for given amount of time
- time - seconds ("123s") or milliseconds ("123ms")
<sub alias=""> - substitute alias for inner text
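
The elements listed above can be combined into a single document. The sketch below builds one with Python's standard `xml.etree.ElementTree`. The voice names are the examples quoted in the list, the spoken text is invented for illustration, and the helper name `build_ssml` is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_ssml():
    """Build a small SSML document using the supported subset."""
    speak = ET.Element("speak")

    # First voice: one sentence containing a number read as a cardinal.
    voice1 = ET.SubElement(speak, "voice", {"name": "glow-speak:en-us_mary_ann"})
    s1 = ET.SubElement(voice1, "s")
    s1.text = "The total is "
    number = ET.SubElement(
        s1, "say-as", {"interpret-as": "number", "format": "cardinal"}
    )
    number.text = "42"

    # Pause for half a second between the two voices.
    ET.SubElement(speak, "break", {"time": "500ms"})

    # Second voice: multi-speaker model selected with the #speaker_id suffix.
    voice2 = ET.SubElement(speak, "voice", {"name": "coqui-tts:en_vctk#p228"})
    s2 = ET.SubElement(voice2, "s")
    s2.text = "Brought to you by "
    sub = ET.SubElement(s2, "sub", {"alias": "text to speech"})
    sub.text = "TTS"

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml())
```

Because the result begins with a left angle bracket, it can be sent as the input text wherever SSML is accepted.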

Supported Text to Speech Systems

Below is a list of the supported TTS systems and voice counts by language.

Larynx
- English (27), German (7), French (3), Spanish (2), Dutch (4), Russian (3), Swedish (1), Italian (2), Swahili (1)
- Model types available: GlowTTS
- Vocoders available: HiFi-GAN (3 levels of quality)
- Patched embedded version of Larynx 1.0
Glow-Speak
- English (2), German (1), French (1), Spanish (1), Dutch (1), Russian (1), Swedish (1), Italian (1), Swahili (1), Greek (1), Finnish (1), Hungarian (1), Korean (1)
- Model types available: GlowTTS
- Vocoders available: HiFi-GAN (3 levels of quality)
Coqui-TTS
- English (110), Japanese (1), Chinese (1)
- Patched embedded version of Coqui-TTS 0.3.1
nanoTTS
- English (2), German (1), French (1), Italian (1), Spanish (1)
MaryTTS
- English (7), German (3), French (4), Italian (1), Russian (1), Swedish (1), Telugu (1), Turkish (1)
- Includes embedded MaryTTS
flite
- English (19), Hindi (1), Bengali (1), Gujarati (3), Kannada (1), Marathi (2), Punjabi (1), Tamil (1), Telugu (3)
Festival
- English (9), Spanish (1), Catalan (1), Czech (4), Russian (1), Finnish (2), Marathi (1), Telugu (1), Hindi (1), Italian (2), Arabic (2)
- Spanish/Catalan/Finnish use ISO-8859-15 encoding
- Czech uses ISO-8859-2 encoding
- Russian is transliterated from Cyrillic to Latin script automatically
- Arabic uses UTF-8 and is diacritized with mishkal
eSpeak
- Supports a huge number of languages/locales, but sounds robotic

Voice Quality

On the Raspberry Pi, you may need to lower the quality of Larynx and Glow-Speak voices to get reasonable response times.

This can be done with the larynx_quality setting above (use "medium" or "low"), or by appending the quality level to the end of your voice after a semicolon:

tts:
  - platform: marytts
    voice: larynx:harvard;low

Available quality levels are "high", "medium", and "low".

Note that this only applies to Larynx and Glow-Speak voices.
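
If voice ids are generated programmatically, the quality suffix can be attached with a small helper such as the sketch below. The function name `with_quality` is hypothetical, and the `espeak:en` voice id in the usage example is only illustrative.

```python
QUALITY_LEVELS = ("high", "medium", "low")
# Systems for which the quality suffix applies, per the note above.
QUALITY_SYSTEMS = ("larynx", "glow-speak")


def with_quality(voice_id, quality):
    """Return voice_id with ';<quality>' appended where it applies.

    Voice ids of other TTS systems are returned unchanged.
    """
    if quality not in QUALITY_LEVELS:
        raise ValueError(f"quality must be one of {QUALITY_LEVELS}")

    system = voice_id.split(":", 1)[0]
    if system not in QUALITY_SYSTEMS:
        return voice_id

    # Drop any existing suffix before appending the new one.
    base = voice_id.split(";", 1)[0]
    return f"{base};{quality}"


if __name__ == "__main__":
    print(with_quality("larynx:harvard", "low"))
    print(with_quality("espeak:en", "low"))
```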
