When using phonemizer (espeak-ng), the output contains digits that reflect the vowel/sound variants, as in the following:
import phonemizer

text = 'Có lối ra, chúng ta qua đó xem sao.'
phonemizer.phonemize(
    text,
    language='vi',
    backend='espeak',
    strip=False,
    preserve_punctuation=True,
    punctuation_marks=';:,.!?¡¿—…"«»“”',
    with_stress=True,
    language_switch='keep-flags',
    njobs=1
)
My question: will the missing numbers (here 7 and 1) and the missing spaces around punctuation such as the comma, as in zˈaː,tɕˈuɜŋ tˈaː (the output after tokenizer._postprocess)
instead of zˈaː7 , tɕˈuɜŋ t̪ˈaː1 (the raw phonemizer output), affect the alignment and the pauses between generated words?
Hi,
the whitespace collapse is intentional, mostly to be able to control where the pauses are allocated with the forward model. You can disable it by removing it from line 91 in data/text/tokenizer.py (return the line above instead). I would discourage that, though, unless you're running into problems.
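To make the collapse concrete, here is a minimal sketch of what stripping whitespace around punctuation could look like. It is not the actual code from data/text/tokenizer.py; the helper name collapse_punctuation_spaces is hypothetical, and the punctuation set is simply copied from the phonemize() call in the question.

```python
import re

# Punctuation set copied from the phonemize() call in the question.
PUNCTUATION = ';:,.!?¡¿—…"«»“”'


def collapse_punctuation_spaces(text: str, punctuation: str = PUNCTUATION) -> str:
    """Remove whitespace on both sides of each punctuation mark.

    Hypothetical helper for illustration only; the repository's own
    tokenizer may implement the collapse differently.
    """
    # Build a character class from the punctuation marks and swallow
    # any whitespace directly before or after a match.
    pattern = r'\s*([' + re.escape(punctuation) + r'])\s*'
    return re.sub(pattern, r'\1', text)


raw = 'zˈaː7 , tɕˈuɜŋ t̪ˈaː1'
print(collapse_punctuation_spaces(raw))
```

Spaces between words are left alone; only the ones touching a punctuation mark disappear, which matches the zˈaː,tɕˈuɜŋ pattern from the question.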
For the numbers issue, you can add the missing phonemes (for instance 1, 2, 3, 4, 5, 6, 7, 8, 9, 0) to all_phonemes in data/text/symbols.py, like so: all_phonemes = sorted(list(_phonemes) + list(_punctuations) + list('1234567890'))
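As a rough illustration of why the digits vanish and how extending the symbol list brings them back, here is a small sketch. It assumes the tokenizer drops every character that is not in all_phonemes, which is an inference from the behaviour described in this thread rather than something checked against the code. The inventories below are tiny made-up samples, and build_symbols and keep_known are hypothetical helpers.

```python
# Made-up, heavily truncated inventories for illustration;
# the real ones live in data/text/symbols.py.
_phonemes = 'azŋtuɕɜːˈ'
_punctuations = ';:,.!? '


def build_symbols(extra: str = '') -> list:
    """Mirror the all_phonemes expression, with optional extra symbols."""
    return sorted(list(_phonemes) + list(_punctuations) + list(extra))


def keep_known(text: str, symbols) -> str:
    """Drop every character that is not in the symbol list (assumed behaviour)."""
    allowed = set(symbols)
    return ''.join(ch for ch in text if ch in allowed)


raw = 'zˈaː7 , tɕˈuɜŋ'
without_digits = keep_known(raw, build_symbols())
with_digits = keep_known(raw, build_symbols('1234567890'))
print(repr(without_digits))
print(repr(with_digits))
```

With the digits added, the tone number 7 survives the filter; without them it is silently removed.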
I was not aware that some languages had numbers as phonemes.
TODO: Add optional extra phonemes string to data_config.yaml