Spanish support.
raccoonML committed Dec 21, 2021
1 parent ea2c0da commit e7aa707
Showing 2 changed files with 5 additions and 6 deletions.
8 changes: 4 additions & 4 deletions synthesizer/hparams.py
@@ -33,16 +33,16 @@ def parse(self, string):
  preemphasize = True,

  ### Tacotron Text-to-Speech (TTS)
- tts_embed_dims = 512, # Embedding dimension for the graphemes/phoneme inputs
+ tts_embed_dims = 256, # Embedding dimension for the graphemes/phoneme inputs
  tts_encoder_dims = 256,
  tts_decoder_dims = 128,
- tts_postnet_dims = 512,
+ tts_postnet_dims = 256,
  tts_encoder_K = 5,
- tts_lstm_dims = 1024,
+ tts_lstm_dims = 512,
  tts_postnet_K = 5,
  tts_num_highways = 4,
  tts_dropout = 0.5,
- tts_cleaner_names = ["english_cleaners"],
+ tts_cleaner_names = ["transliteration_cleaners"],
  tts_stop_threshold = -3.4, # Value below which audio generation ends.
  # For example, for a range of [-4, 4], this
  # will terminate the sequence at the first
3 changes: 1 addition & 2 deletions synthesizer/utils/symbols.py
@@ -8,8 +8,7 @@

  _pad = "_"
  _eos = "~"
- _characters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz!\'\"(),-.:;? "
-
+ _characters = "'-aábcdeéfghiíjklmnñoópqrstuúüvwxyz "
  # Prepend "@" to ARPAbet symbols to ensure uniqueness (some are the same as uppercase letters):
  #_arpabet = ["@' + s for s in cmudict.valid_symbols]


1 comment on commit e7aa707

@antmir72

I modified hparams.py and symbols.py as shown above in order to create an Italian synthesizer.pt (I just omitted the ñ and ü symbols from my _characters string).

When I run synthesizer_train.py I get this error:

SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0xe1 in position 3: invalid continuation byte

and line 11 of symbols.py appears to have been read as

_characters = "'-a�bcde�fghi�jklmn�o�pqrstu�vwxyz "

How did you avoid this issue?
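
A likely cause, offered as a sketch rather than a confirmed diagnosis: in Latin-1 and Windows-1252 the letter á is the single byte 0xE1, and it sits at position 3 of the string `'-aá…`, which matches the error message. That suggests the edited symbols.py was saved in a single-byte encoding, while Python 3 reads source files as UTF-8 by default. The helper below is hypothetical (the function name and the `cp1252` default are assumptions) and rewrites a file as UTF-8 only when its bytes are not already valid UTF-8.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding="cp1252"):
    """Rewrite `path` as UTF-8 if its bytes are not already valid UTF-8.

    `source_encoding` is a guess at how the editor saved the file;
    adjust it if the file came from a different locale.
    Returns True if the file was rewritten, False if it was left alone.
    """
    raw = Path(path).read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        text = raw.decode(source_encoding)
        # Write bytes directly so line endings are preserved as-is.
        Path(path).write_bytes(text.encode("utf-8"))
        return True
```

For example, `reencode_to_utf8("synthesizer/utils/symbols.py")` would convert the file in place. Saving the file as UTF-8 from the editor achieves the same result.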
