Did you use this repo to train a vocoder? #2
@syang1993 Hi, how's it going? Yeah, I'm having similar problems. Here's what my conditioned model sounds like after 300k steps: (used 80-band mel-spectrograms) I haven't implemented the noise reduction. Is that algorithm publicly available? I had a quick look around and couldn't find it. As for conditional sampling, I was going to implement a simple threshold, or perhaps an exponential moving average, on the summed values in the conditioning frames and use that to differentiate between voiced and unvoiced states. I haven't got around to it yet, so perhaps that's why it doesn't sound so good. I'm curious what your implementation sounds like. Any chance you could post a sample?
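A minimal sketch of the threshold / moving-average idea mentioned above, assuming each conditioning frame is a list of mel-band values. The function names and the `alpha` and `threshold` parameters are hypothetical; nothing here is taken from the paper or from this repo, and how the voiced flag is then used during sampling is left out.

```python
def frame_energy(mel_frames):
    """Sum the band values of each conditioning frame."""
    return [sum(frame) for frame in mel_frames]


def ema(values, alpha=0.3):
    """Exponential moving average; the first value seeds the average."""
    smoothed = []
    prev = None
    for v in values:
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


def voiced_flags(mel_frames, threshold, alpha=0.3):
    """Mark a frame as voiced when its smoothed summed energy exceeds the threshold.

    Both `threshold` and `alpha` are assumptions that would need tuning
    on real data (and depend on whether the mels are linear or log scaled).
    """
    energies = ema(frame_energy(mel_frames), alpha)
    return [e > threshold for e in energies]
```

With `alpha=1.0` this reduces to the plain per-frame threshold; smaller values of `alpha` smooth out brief dips at the cost of reacting more slowly at voiced/unvoiced boundaries.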
@fatchord I also used 80-band mel-spectrograms to train my model. The authors only cite a book for the noise reduction, so I don't know which specific method they use. Maybe Wiener filtering? I'm on summer vacation, so I can't send you my samples right now. But you can listen to the
@fatchord This is not bad at all, although I know the goal is to replicate the paper-quality results.
@fatchord Hi, happy to see you again! I'm also working on FFTNet. But in my experiments, I cannot get results similar to those on the paper's demo page, mainly with regard to conditional sampling and post-denoising. Have you tried to reproduce their results? Thanks.