-
🐛 Bug
Noise at the end of produced wave file

To Reproduce
Steps to reproduce the behavior:
import os
import torch

device = torch.device('cpu')
torch.set_num_threads(4)
local_file = 'model.pt'

# Download the model package once and reuse the local copy afterwards
if not os.path.isfile(local_file):
    torch.hub.download_url_to_file('https://models.silero.ai/models/tts/ru/v2_kseniya.pt',
                                   local_file)

model = torch.package.PackageImporter(local_file).load_pickle("tts_models", "model")
model.to(device)

# Three phrases synthesized together as one batch
example_batch = ['В недрах тундры выдры в г+етрах т+ырят в вёдра ядра кедров.',
                 'Котики - это жидкость!',
                 'М+ама М+илу м+ыла с м+ылом.']
sample_rate = 16000

audio_paths = model.save_wav(texts=example_batch,
                             sample_rate=sample_rate)

Expected behavior
No padding with noise

Environment
Please copy and paste the output from this

Additional context
Thanks a lot for creating this!
Replies: 2 comments
-
This is a known tacotron bug
It happens due to batching
Batch size 1 fixes this (i.e. just feed 1 phrase)
We basically have 2 options for fixing this more properly, one of which is applying a VAD (silero-vad or WebRTC)
-
In the future, though, we will stop using tacotron, so maybe just using batch_size = 1 or just applying a VAD is a good temporary solution (especially when you run on CPU).
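
For anyone landing here, the two temporary workarounds mentioned in the replies (batch size 1, or applying a VAD) could look roughly like the sketch below. It is not the maintainers' code. The helper names `save_wav_one_by_one` and `trim_trailing_non_speech` are hypothetical, and so is the `is_speech` callback. It assumes `save_wav` returns a list of file paths, as the `audio_paths = ...` line in the report suggests, and that repeated calls might reuse file names.

```python
import shutil
from pathlib import Path
from typing import Callable, List


def save_wav_one_by_one(model, texts: List[str], sample_rate: int,
                        out_dir: str) -> List[str]:
    """Workaround 1: call save_wav with one phrase at a time (batch size 1)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, text in enumerate(texts):
        # Assumption: save_wav returns a list of written file paths.
        produced = model.save_wav(texts=[text], sample_rate=sample_rate)
        # Assumption: repeated calls may reuse file names, so each result is
        # moved to a unique name before the next call could overwrite it.
        target = out / f"utterance_{i:03d}.wav"
        shutil.move(str(produced[0]), str(target))
        paths.append(str(target))
    return paths


def trim_trailing_non_speech(pcm: bytes, sample_rate: int,
                             is_speech: Callable[[bytes], bool],
                             frame_ms: int = 30) -> bytes:
    """Workaround 2: cut everything after the last frame marked as speech.

    `pcm` is 16-bit mono PCM. `is_speech` is a hypothetical callback wrapping
    whichever VAD is used; it receives one frame of raw bytes.
    """
    frame_bytes = sample_rate * frame_ms // 1000 * 2  # 2 bytes per sample
    last_speech_end = 0
    # Only whole frames are scored; a trailing partial frame is ignored.
    for start in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
        if is_speech(pcm[start:start + frame_bytes]):
            last_speech_end = start + frame_bytes
    # If the VAD found no speech at all, return the audio unchanged.
    return pcm[:last_speech_end] if last_speech_end else pcm
```

With the `webrtcvad` package, the callback could be something like `lambda frame: vad.is_speech(frame, sample_rate)` for a `webrtcvad.Vad` instance, which accepts 10, 20 or 30 ms frames of 16-bit mono PCM. Whether a VAD reliably rejects this particular trailing noise is untested here, so batch size 1 is probably the safer of the two.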