Too Much Noise on Mandarin #498

Closed
Johnzxf opened this issue Aug 20, 2020 · 8 comments

Johnzxf commented Aug 20, 2020

I have synthesized Mandarin speech, but the output seems to contain too much noise; please listen to the audio below. What can I do to reduce the noise? Also, synthesized female audio seems to be better than male audio.
BTW: the audio was synthesized from text.

Link: https://pan.baidu.com/s/1jXy9-6KnBKVciapcbGZmFw
Extraction code: knjb

ghost commented Aug 20, 2020

Not enough info provided. I can't really help with this but if you answer the following questions it will improve your chances of getting a helpful response.

  1. Did you use the code from this repo? If so what modifications did you make?
  2. Which dataset did you train on?
  3. What are your settings? Attach the synthesizer/hparams.py file
  4. Which vocoder are you using?

Also you should make a zip file of your audio samples and attach them here, many of us here can't download from pan.baidu.com.

Johnzxf commented Aug 20, 2020

1: Yes, I used this repo. I did not change anything in the model. Most of the modifications are in the dataset preprocessing.
2: The dataset I trained on is internal, and the audio is clean. The clips in my dataset are shorter than those in LibriSpeech; most of them are between 3 and 5 s.
3: My settings are attached.
4: The vocoder is WaveRNN.

While training the synthesizer, I found that the alignments have some gaps between the encoder and decoder; see the attachment. Does this have an influence?

[Attached image: a5b7a944a07fea7964d8c1830649507]

SV2TTS.zip
hparams.txt

Johnzxf closed this as completed Aug 25, 2020

ghost commented Aug 25, 2020

@Johnzxf I'm sorry you didn't get a response. Did you figure out what was causing the noise?

Johnzxf commented Aug 28, 2020 via email

ghost commented Aug 28, 2020

Let's reopen and see if anyone knows why this is the case. Are you using RTVC (this repo) or zhrtvc?

ghost reopened this Aug 28, 2020
lawrence124 commented

@blue-fish
I have not tried the WaveRNN vocoder on RTVC, but the pretrained WaveRNN vocoder doesn't give me anything useful. The multi-speaker MelGAN (or other MelGAN vocoders) yields far better results. I'm not sure why, though. (Do we need to train a vocoder for a specific language?)

ghost commented Sep 12, 2020

@lawrence124 A vocoder trained on enough speakers can generalize to unseen speakers and even other languages. mozilla/TTS#221 (comment)

The vocoders work within very narrow parameters and will fail if input does not meet the specification. To avoid incompatibility, the vocoder in this repo gets some of its hparams from the synthesizer. The relevant parameters are:

  • sample_rate
  • num_mels
  • n_fft, hop_length, window_length
  • fmin, fmax
  • pre-emphasize
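
One way to catch such an incompatibility early is to compare the shared parameters before running inference. The following is only a minimal sketch: the function `find_mismatches`, the key names (including `preemphasis` for the pre-emphasis setting), and all values are hypothetical and chosen for illustration; they are not taken from this repo's `hparams.py` or from zhrtvc.

```python
# Hypothetical key names for the parameters listed above (assumption:
# your synthesizer and vocoder configs can be exported as plain dicts).
SHARED_KEYS = (
    "sample_rate", "num_mels", "n_fft", "hop_length",
    "window_length", "fmin", "fmax", "preemphasis",
)


def find_mismatches(synth, vocoder, keys=SHARED_KEYS):
    """Return {key: (synth_value, vocoder_value)} for every key that
    differs between the two configs or is missing from either one.
    A missing value is reported as None."""
    mismatches = {}
    for key in keys:
        a = synth.get(key)
        b = vocoder.get(key)
        if a is None or b is None or a != b:
            mismatches[key] = (a, b)
    return mismatches


# Illustrative values only, not the settings of any real model.
synth_hparams = {
    "sample_rate": 16000, "num_mels": 80, "n_fft": 800,
    "hop_length": 200, "window_length": 800,
    "fmin": 55, "fmax": 7600, "preemphasis": 0.97,
}
# A vocoder trained with a different hop length and upper mel frequency.
vocoder_hparams = dict(synth_hparams, hop_length=256, fmax=8000)

for name, (s, v) in find_mismatches(synth_hparams, vocoder_hparams).items():
    print(f"{name}: synthesizer={s} vocoder={v}")
```

If any key is reported, the mel spectrograms produced by the synthesizer do not match what the vocoder was trained on, which is one plausible (but unconfirmed) source of noisy output.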

With default settings, I got very mediocre results on zhrtvc, and I suspect that some of the vocoder parameters are not set correctly. Griffin-Lim performed the best. You should try this repo with the English models; the pretrained WaveRNN works quite well.

ghost commented Oct 6, 2020

Closing this issue due to inactivity. Would like to know what is causing the noise if you find out.

ghost closed this as completed Oct 6, 2020