
About generated samples #4

Open
sunnnnnnnny opened this issue Jul 16, 2019 · 1 comment
Labels
question Further information is requested

Comments

@sunnnnnnnny

“A PyTorch implementation of Robust Universal Neural Vocoding. Audio samples can be found here.”

Regarding the samples at the link you gave: were they generated from the actual (ground-truth) mel spectrograms, or from spectrograms predicted by an acoustic model?

@bshall
Owner

bshall commented Jul 16, 2019

Hi @sunnnnnnnny, the samples on the webpage are generated from the actual mel spectrograms. I haven't had the chance to experiment with something like Tacotron yet, but the model does seem to work reasonably well on "smoothed" spectrograms. For example, the following spectrogram was reconstructed with an L2 loss by a vector-quantized autoencoder (VQ-VAE):
[image: reconstructed mel spectrogram]
Compared to the original:
[image: original mel spectrogram]
I've attached the audio generated from the reconstructed spectrogram. It's a little noisier than the original but not too bad (the VQ-VAE may also be contributing some loss of quality).
sample.zip
The audio corresponds to the first sample from speaker V002 on the webpage.
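For anyone unsure what conditioning on the "actual mel spectrogram" means concretely: the vocoder is fed a log-mel spectrogram extracted directly from the ground-truth audio, rather than one predicted by an acoustic model. Below is a minimal numpy sketch of that extraction step. All parameters here (16 kHz sample rate, 1024-point FFT, hop of 200 samples, 80 mel bands) are illustrative assumptions, not necessarily the values this repo uses:

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel scale.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr=16000, n_fft=1024, n_mels=80):
    # Triangular filters spaced evenly on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        if center > left:
            fb[i, left:center] = (np.arange(left, center) - left) / (center - left)
        if right > center:
            fb[i, center:right] = (right - np.arange(center, right)) / (right - center)
    return fb

def log_mel_spectrogram(wav, sr=16000, n_fft=1024, hop=200, n_mels=80):
    # Frame the signal, window, FFT, apply the mel filterbank, take the log.
    window = np.hanning(n_fft)
    frames = [wav[i:i + n_fft] * window
              for i in range(0, len(wav) - n_fft + 1, hop)]
    power = np.abs(np.fft.rfft(np.stack(frames), axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(np.maximum(mel, 1e-10))

# Example: one second of a 440 Hz tone.
sr = 16000
t = np.arange(sr) / sr
wav = np.sin(2 * np.pi * 440.0 * t)
mel = log_mel_spectrogram(wav, sr=sr)
print(mel.shape)  # → (75, 80): (frames, mel bands)
```

A matrix like this, with one 80-dimensional frame per hop, is what the vocoder conditions on; the "predicted" alternative would substitute a spectrogram produced by a text-to-speech acoustic model instead.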

@bshall bshall added the question Further information is requested label Aug 22, 2019