num_steps of training for those demo sample? #5
Comments
Hi @bayesrule, thanks! The audio on the demo page was generated with the pretrained model I uploaded, which was only trained for 100k steps. I was also surprised by how quickly it trains. You get intelligible samples by 20k steps and decent results by 60k-80k steps. I've noticed that the generated audio for the out-of-domain speakers is a bit noisy. I'm not sure whether longer training would help with that or whether it is a limitation of the ZeroSpeech dataset (which is pretty noisy).
Hi @bshall,
Hi @te0006, https://tarepan.github.io/UniversalVocoding/ Dataset: 10 hours of utterances in total. My impression is that RNN_MS is surprisingly fast and robust.
Hello, thanks for replying so quickly. For such a short training run (5 hrs / 60k steps), your results certainly sound impressive. I think training time is often neglected in publications even though it is critically important for people looking to integrate or adapt a method, where you want to be able to try things and fiddle with parameters without prohibitive computational cost. BTW, your last (English) sound example seems to exhibit considerably more noise and distortion than the Japanese ones (but perhaps, not speaking the language and thus not being used to hearing it, I simply cannot hear the artifacts in the Japanese examples). Do you already have experience w.r.t. how far (and how fast) the speech quality improves with more training time?
Many reproducible experiments (including this repository) kindly report their training time. I agree with you and hope the papers themselves report it too. Your hearing is correct. As for longer training: not yet, but I will try it.
Hi,
This repo is really great. May I ask how many training steps (with batch_size 32) were required for your demo samples? Given the amount of training data used here (around 26 hours of recordings), I guess the 100k num_steps provided in config.json is not enough, right?
Many thanks!