Hi @sunnnnnnnny, the samples on the webpage are generated from the actual mel spectrograms. I haven't had the chance to experiment with something like Tacotron yet, but the model does seem to work reasonably well on "smoothed" spectrograms. For example, the following spec was reconstructed with a vector-quantized autoencoder (VQ-VAE) trained with an L2 loss:
Compared to the original:
I've attached the audio generated from the reconstructed spectrogram. It's a little noisier than the original but not too bad (the VQ-VAE may also be causing some loss of quality). sample.zip
The audio corresponds to the first sample from speaker V002 on the webpage.
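For reference, here is a minimal sketch of the kind of setup described above: a small VQ-VAE over mel spectrograms trained with an L2 reconstruction loss. The architecture, class names (`MelVQVAE`, `VectorQuantizer`), and hyperparameters (80 mel bins, 64-d codes, a 512-entry codebook) are illustrative assumptions, not the actual model used for the sample.

```python
# Minimal sketch (illustrative, not the exact model used above): a VQ-VAE that
# reconstructs mel spectrograms, trained with an L2 reconstruction loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Nearest-neighbour codebook lookup with a straight-through estimator."""
    def __init__(self, num_codes=512, dim=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta

    def forward(self, z):                      # z: (B, T, dim)
        flat = z.reshape(-1, z.size(-1))       # (B*T, dim)
        # Squared distance from each latent vector to every codebook entry.
        d = (flat.pow(2).sum(1, keepdim=True)
             - 2 * flat @ self.codebook.weight.t()
             + self.codebook.weight.pow(2).sum(1))
        q = self.codebook(d.argmin(1)).view_as(z)
        # Codebook + commitment losses; straight-through gradient to encoder.
        vq_loss = F.mse_loss(q, z.detach()) + self.beta * F.mse_loss(z, q.detach())
        q = z + (q - z).detach()
        return q, vq_loss

class MelVQVAE(nn.Module):
    def __init__(self, n_mels=80, dim=64, num_codes=512):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(n_mels, dim, 3, padding=1), nn.ReLU(),
            nn.Conv1d(dim, dim, 3, padding=1),
        )
        self.vq = VectorQuantizer(num_codes, dim)
        self.decoder = nn.Sequential(
            nn.Conv1d(dim, dim, 3, padding=1), nn.ReLU(),
            nn.Conv1d(dim, n_mels, 3, padding=1),
        )

    def forward(self, mel):                    # mel: (B, n_mels, T)
        z = self.encoder(mel).transpose(1, 2)  # (B, T, dim)
        q, vq_loss = self.vq(z)
        recon = self.decoder(q.transpose(1, 2))
        return recon, vq_loss

model = MelVQVAE()
mel = torch.randn(4, 80, 100)                  # fake batch of mel spectrograms
recon, vq_loss = model(mel)
loss = F.mse_loss(recon, mel) + vq_loss        # L2 reconstruction + VQ terms
loss.backward()
```

The quantization bottleneck is what produces the "smoothed" reconstructions mentioned above; the reconstructed mel would then be fed to the vocoder in place of the ground-truth one.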
"A PyTorch implementation of Robust Universal Neural Vocoding. Audio samples can be found here."
For the samples at the link you gave: were they generated from the actual (ground-truth) spectrograms, or from spectrograms predicted by an acoustic model?