Better audio quality with larger resnet #65

cschaefer26 · 2021-05-28T09:34:33Z

Hi, great repo!

I found that the audio quality improves considerably with a slightly larger ResNet, as suggested in https://arxiv.org/pdf/2005.05106.pdf. The shaky and metallic artefacts are reduced a lot.

Here is a comparison of your pretrained LJSpeech model with a model I am still training (for TTS I used https://github.com/as-ideas/ForwardTacotron):

Original (6400 epochs):
https://drive.google.com/file/d/1LOIB9B7LDX9g-kVu_p1anGJgJ5vjE27s/view?usp=sharing

Larger ResNet (2000 epochs):
https://drive.google.com/file/d/19_d2SQU1xZi-o90MJ8NcKhIS6AFwliH-/view?usp=sharing

If you are interested, I could open a PR that makes the layers more flexible.
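
To illustrate what "more flexible layers" could mean, here is a minimal sketch. It is not the repo's code. It assumes each residual stack is a sequence of dilated 1D convolutions with kernel size 3 and dilations growing as powers of 3, which is a common layout in MelGAN-style generators; the actual layout in this repo may differ. The `ResStackConfig` class and its field names are hypothetical. The sketch only shows how the layer count could become a parameter and how it changes the receptive field of one stack.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ResStackConfig:
    """Hypothetical config; names are illustrative, not taken from the repo."""
    num_layers: int = 3      # residual layers per stack (assumed default)
    kernel_size: int = 3     # kernel size of each dilated convolution (assumed)
    dilation_base: int = 3   # layer i uses dilation dilation_base ** i (assumed)

    def dilations(self) -> List[int]:
        # Dilation of each layer in the stack, in order.
        return [self.dilation_base ** i for i in range(self.num_layers)]

    def receptive_field(self) -> int:
        # Each dilated conv widens the receptive field by (kernel_size - 1) * dilation.
        return 1 + sum((self.kernel_size - 1) * d for d in self.dilations())


if __name__ == "__main__":
    for n in (3, 4):
        cfg = ResStackConfig(num_layers=n)
        print(n, cfg.dilations(), cfg.receptive_field())
```

Under these assumptions, going from 3 to 4 layers per stack raises the receptive field of one stack from 27 to 81 input frames. Exposing the layer count in a config would make such variants easy to compare.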
