Any plan for WORLD vocoder? #7

lsq357 · 2017-12-04T03:59:50Z

Is there any plan to support the WORLD vocoder for multi-speaker TTS?

r9y9 · 2017-12-04T04:13:33Z

Not currently planned. I wish I had more time...

r9y9 · 2017-12-13T13:07:31Z

I will leave this open to track progress on it. Not currently planned, though.

r9y9 · 2017-12-16T16:39:17Z

It seems someone is trying to support the WORLD vocoder: https://github.com/geneing/deepvoice3_pytorch

DarkDefender · 2017-12-17T16:17:24Z

@r9y9 Thanks for the heads up!

I'm really interested in how this turns out, since the WORLD vocoder is used in the "UTAU" music software. If someone managed to get the network to train successfully with it, I think we might be able to get rid of the "sound compression" artifacts that are present in most of the current deepvoice/tacotron implementations...

An example of the sound quality possible with UTAU (and therefore WORLD):
https://www.youtube.com/watch?v=Es_5kvVtiNA

@geneing would you mind keeping us updated on your progress? Even if the results are not good.

geneing · 2017-12-18T07:09:46Z

Replacing Griffin-Lim with the WORLD vocoder seems fairly straightforward. The full WORLD feature vector for a 22 kHz signal has length 1027 per frame, versus 80 for the mel output. The WORLD vocoder includes an encoder for the aperiodicity and the spectrogram, which reduces the output to length 131.
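For readers wondering where lengths like 1027 and 131 can come from, here is a minimal sketch of the arithmetic in plain Python. It assumes WORLD's default settings as I understand them (f0 floor of 71 Hz for the FFT size, 3 kHz aperiodicity bands capped at 15 kHz) and a 22050 Hz sample rate. The function names are hypothetical helpers, not part of any library, and the split of 131 into f0 + coded aperiodicity + coded spectral envelope is a guess that happens to fit, not something confirmed in this thread.

```python
import math


def cheaptrick_fft_size(fs, f0_floor=71.0):
    """FFT size under the assumed WORLD default rule: 2^(1 + floor(log2(3*fs/f0_floor + 1)))."""
    return 2 ** (1 + int(math.log2(3.0 * fs / f0_floor + 1)))


def num_aperiodicity_bands(fs, upper_limit=15000.0, interval=3000.0):
    """Number of coded aperiodicity bands under the assumed WORLD band layout."""
    return int(min(upper_limit, fs / 2.0 - interval) / interval)


fs = 22050  # assumed sample rate for the "22 kHz" signal
fft_size = cheaptrick_fft_size(fs)
bins = fft_size // 2 + 1  # one-sided spectrum length

# Uncoded frame: f0 (1) + spectral envelope (bins) + aperiodicity (bins)
full_dim = 1 + bins + bins

# Coded frame: f0 (1) + band aperiodicity + coded spectral envelope.
coded_ap = num_aperiodicity_bands(fs)
coded_sp = 131 - 1 - coded_ap  # ASSUMPTION: whatever remains is the coded envelope

print(fft_size, bins, full_dim, coded_ap, coded_sp)
```

Under these assumptions the uncoded frame comes out at 1027 values, matching the figure quoted above, and a coded spectral envelope of 128 dimensions would account for the 131.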

lsq357 · 2017-12-18T10:04:24Z

In my view, with the WORLD vocoder the network only needs to change its output shape and become multi-output, since WORLD needs at least three parameters (f0, aperiodicity, spectrogram).
Moreover, the WORLD parameters (f0, aperiodicity, spectrogram) could be added to the loss function alongside the mel outputs, which might speed up convergence. (This idea is just my guess!)
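The multi-output loss idea above could be sketched roughly as follows. This is a framework-free illustration in plain Python, not code from this repository: the stream names, the L1 criterion, and the weights are all assumptions chosen for the example, and a real model would use tensor operations from its training framework instead.

```python
def l1_loss(pred, target):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def combined_loss(pred, target, weights):
    """Weighted sum of per-stream L1 losses (hypothetical multi-output loss)."""
    return sum(w * l1_loss(pred[name], target[name]) for name, w in weights.items())


# Toy values for a single frame; the stream names are illustrative only.
pred = {"mel": [0.0, 1.0], "f0": [100.0], "ap": [0.5], "sp": [1.0, 2.0]}
target = {"mel": [0.0, 2.0], "f0": [110.0], "ap": [0.5], "sp": [1.0, 4.0]}

# ASSUMPTION: f0 is in Hz, so it is down-weighted to keep it from dominating.
weights = {"mel": 1.0, "f0": 0.1, "ap": 1.0, "sp": 1.0}

total = combined_loss(pred, target, weights)
print(total)
```

In practice the relative weights would need tuning, since f0, aperiodicity, and spectral values live on very different scales.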

DarkDefender · 2017-12-20T12:56:11Z

BTW, if anyone is interested in singing neural networks, I just found this:
http://www.dtic.upf.edu/~mblaauw/NPSS/

I think the Spanish output sounds really awesome. The English and Japanese sound a little too stilted, but I guess that depends on what kind of dataset and music you throw at it.

Edit: forgot to mention that it seems to use the WORLD vocoder.

geneing · 2017-12-22T17:15:11Z

In view of the Tacotron 2 paper, it appears that WaveNet may be a better choice. Looking into it.

lsq357 · 2017-12-23T03:10:10Z

For me, training WaveNet would need many more GPUs than I have (Tacotron 2 used 32 GPUs).
The WORLD vocoder, by contrast, can run on CPU alone.

r9y9 · 2017-12-23T03:13:42Z

Does anybody have experience working on WaveNet? Is it impossible to train WaveNet with only 1 GPU in practice?

lsq357 · 2017-12-23T09:26:39Z

I have tried training WaveNet on two 1080 Ti GPUs; it only gets through 3k+ steps per day (asynchronous updates, batch size 32).

I also tried the QuasiRNN + WaveNet setup from DeepVoice / DeepVoice 2, but my TensorFlow implementation of QuasiRNN gave no speedup!
I only trained for a week, without success.

r9y9 · 2017-12-31T05:42:57Z

I started to implement the WaveNet vocoder. Check out r9y9/wavenet_vocoder#1 (comment) if you are interested.

MlWoo · 2018-07-10T10:38:26Z

@geneing Have you trained your model with "world"? Could you provide some audio samples?

stale · 2019-05-30T01:34:36Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

hash2430 · 2019-07-15T05:47:49Z

I made one myself.
https://github.com/hash2430/dv3_world
Anyone who needs it is welcome to use it.
I will upload audio samples soon.

lsq357 closed this as completed Dec 4, 2017

r9y9 reopened this Dec 13, 2017

stale bot added the wontfix label May 30, 2019

stale bot closed this as completed Jun 6, 2019
