[New-Model] HiFi-GAN implementation #661
Comments
Great @rishikksh20 👍. Looking forward to it, because I'm interested in training HifiGAN for my (Mozilla) Tacotron2 DCA trained model. If it's helpful, you could use my public German dataset for testing.
@erogol @thorstenMueller Sure, I'll check on the German dataset. And I guess I did something terribly right, because my implementation trains 30% faster
I'm still training my repo on different datasets. I modified my code a bit, which made the quality worse. So far this commit tree
@rishikksh20 sounds great, did you also try the "V2" version? That is, setting this https://github.com/jik876/hifi-gan/blob/4769534d45265d52a904b850da5a622601885777/config_v1.json#L13 to 128; as far as I can see that's the only difference. I've been training the official HifiGAN repo for ages on one GPU but never really got close to the official models, and the results are definitely worse than my current MelGAN setup. I think on one 11GB GPU I'd probably have to train it for 2 months :)
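For readers following along, the V2-style change described above can be sketched as a one-key override of the V1 config. This is only an illustration: the key name `upsample_initial_channel` is an assumption about what line 13 of `config_v1.json` holds, the helper `make_v2_style` is hypothetical, and the dictionary below is a trimmed stand-in rather than the full file.

```python
import copy


def make_v2_style(v1_config: dict, initial_channel: int = 128) -> dict:
    """Return a copy of a V1-style config with a narrower generator (hypothetical helper)."""
    cfg = copy.deepcopy(v1_config)  # leave the caller's config untouched
    # Assumption: this is the key that line 13 of config_v1.json sets.
    cfg["upsample_initial_channel"] = initial_channel
    return cfg


# Trimmed, illustrative stand-in for config_v1.json (not the full file).
v1 = {"resblock": "1", "upsample_initial_channel": 512}
v2_style = make_v2_style(v1)
print(v2_style["upsample_initial_channel"])
```

In practice the dictionary would come from `json.load` on the real config file, and the result would be written back out with `json.dump` before starting training.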
@m-toman yes, HifiGAN is too slow to train. I think after 1.5M steps (12 days on a V100) of training the quality is more or less similar for the V1 version, but 12 days on a V100 is still a huge amount of time. I tried the V2 version of the official HifiGAN repo: convergence time is about the same, but quality is much worse. The same goes for V3, because V1, V2 and V3 all share the same discriminators, and HifiGAN's discriminators are too slow to train.
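To put the figures quoted above in perspective, here is a rough back-of-the-envelope throughput calculation. It only uses the two numbers from the comment (1.5M steps, 12 days) and assumes training ran continuously; the function name is illustrative.

```python
def steps_per_second(total_steps: int, days: float) -> float:
    """Average training throughput, assuming uninterrupted training."""
    return total_steps / (days * 24 * 60 * 60)


# Figures quoted in the comment above: 1.5M steps in 12 days on a V100.
rate = steps_per_second(1_500_000, 12)
print(f"{rate:.2f} steps/s")
```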
Thanks, I meant whether you tried some V2-style setting with your implementation. V1 seems to be much slower than V2 on CPU.
Interesting! Why is your model much smaller than the official one? Do you have smaller discriminators? The official model is indeed very large.
Which voice corpus are we listening to here?
My custom dataset
Continued at coqui-ai/TTS#16
@rishikksh20 is kind enough to integrate his own work into TTS.
For more details: https://github.com/rishikksh20/HiFi-GAN