
Batch size relation with memory #587

Closed
quirijnve opened this issue Nov 2, 2020 · 8 comments

Comments

@quirijnve

Hi,
While training the synthesizer with batch size 32, my GPU runs out of memory after about 30 steps on average.
I tried reducing the batch size to 32 but I'm still getting the error.
What else can I do? Can I go with a batch size below 32?

@quirijnve
Author

What is the best way of stopping the training so you can test whether the model is any good?
And how can you continue training it again later?

@ghost

ghost commented Nov 2, 2020

All of the utterances in the training batch are padded to match the length of the longest one. To reduce memory consumption you can also reduce hparams.max_mel_frames. I've had good results with 500 and 600. You can use this code to modify your SV2TTS/synthesizer/train.txt to remove utterances that are too long. I have successfully trained a synth model with a batch size of 12 (see #538), so you can go much lower if needed.
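The snippet referenced above isn't reproduced here, so the following is a hedged sketch of the same idea: pruning overlong utterances from SV2TTS/synthesizer/train.txt. It assumes each line is pipe-separated with the mel frame count in the fifth field (index 4); check your own file first, since the column layout may differ between versions.

```python
# Sketch: drop utterances longer than max_mel_frames from train.txt.
# Assumption: lines are pipe-separated and the mel frame count is the
# fifth field (index 4) -- verify against your own metadata file.
from pathlib import Path

def filter_metadata(train_txt: str, max_mel_frames: int = 600,
                    frame_col: int = 4) -> int:
    """Rewrite train.txt in place, keeping only short-enough utterances.
    Returns the number of lines removed."""
    path = Path(train_txt)
    lines = path.read_text(encoding="utf-8").splitlines()
    kept = [line for line in lines
            if int(line.split("|")[frame_col]) <= max_mel_frames]
    path.write_text("\n".join(kept) + "\n", encoding="utf-8")
    return len(lines) - len(kept)
```

Back up train.txt before running anything like this, since it rewrites the file in place.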

You can stop the training at any time, but you will lose all progress since the last saved checkpoint. Use --checkpoint_interval to save more frequently if needed. Restart training with the same Python command; if it finds a saved checkpoint, it resumes from there by default.
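The stop/resume behavior described above can be sketched in plain Python. This is not the repo's actual trainer (which saves model weights through its ML framework); the checkpoint path and the toy weight update are stand-ins to show the control flow.

```python
# Sketch of the checkpoint/resume mechanism: save state every
# checkpoint_interval steps; on startup, resume if a checkpoint exists.
import pickle
from pathlib import Path

def save_checkpoint(step, weights, path):
    path.write_bytes(pickle.dumps({"step": step, "weights": weights}))

def load_or_init(path):
    # Resume from a saved checkpoint if one exists, else start fresh.
    if path.exists():
        state = pickle.loads(path.read_bytes())
        return state["step"], state["weights"]
    return 0, {}

def train(total_steps, checkpoint_interval, path):
    step, weights = load_or_init(path)
    while step < total_steps:
        step += 1
        weights["w"] = weights.get("w", 0.0) + 0.1  # stand-in for a real update
        if step % checkpoint_interval == 0:
            # Anything after the last save is lost if training stops here.
            save_checkpoint(step, weights, path)
```

Killing the process between checkpoints and rerunning the same command picks up from the last saved step, which is why a smaller --checkpoint_interval means less lost work.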

@ghost ghost closed this as completed Nov 2, 2020
@quirijnve
Author

Thank you very much!
It's training a Dutch model right now.
Thank you for all the support!

@quirijnve
Author

I preprocessed and embedded with 'with metadata_fpath.open("r", encoding="latin-1") as metadata_file:'
because that recognized all my characters. Do I have to change something in hparams.py for the training? Because right now, with a loss of 0.9, I get random synthesizer results, just white noise.

@quirijnve
Author

But the plots I received look much better than the ones in the toolbox.
[attached image: step-4000-mel-spectrogram]
That one is after 4k steps; the toolbox plots are just blue or completely green.

@quirijnve
Author

I think I'm importing it wrong, because the wavs folder gives me somewhat understandable audio files.
For importing the synthesizer into the toolbox, it's just selecting the name from the dropdown menu, right?

@ghost

ghost commented Nov 2, 2020

Tacotron uses "teacher forcing" during training, which means the decoder cheats and uses the ground-truth mel to predict subsequent frames. This ensures the input and output have the same length, which facilitates the loss calculation. For inference you do not have a ground truth, so the model has to rely on what it learned during training. Toolbox output with your new model will not be intelligible until about 20-50k steps, and expect 150-250k steps before inference is free of serious artifacts.
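The training/inference distinction above can be shown with a toy sketch. The "decoder" here is a made-up one-liner, not Tacotron's real network; the point is only where the next input comes from in each mode.

```python
# Toy illustration of teacher forcing: during training the decoder is fed
# the ground-truth previous frame; at inference it feeds back its own
# prediction and compounds its own errors.
def toy_decoder_step(prev_frame):
    # Hypothetical stand-in for a neural decoder step.
    return prev_frame * 0.5 + 1.0

def decode(ground_truth, teacher_forcing):
    outputs, prev = [], 0.0
    for t in range(len(ground_truth)):
        pred = toy_decoder_step(prev)
        outputs.append(pred)
        # Training (teacher forcing): next input is the true frame, so the
        # output length matches the target and the loss lines up per frame.
        # Inference: no ground truth, so the model consumes its own output.
        prev = ground_truth[t] if teacher_forcing else pred
    return outputs
```

With teacher forcing the output diverges less from the target sequence, which is why training loss can look fine while raw inference (the toolbox) still sounds bad at low step counts.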

A few words about dataset quality: if your dataset is good, it may converge in fewer steps. And if it is bad, your inference will never be good. Many free datasets are bad quality.

I recommend modifying synthesizer/utils/symbols.py to contain all letters of your alphabet. Any character not in that list is ignored by the model, which makes it very hard to generalize.
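As a hedged sketch of what that modification could look like: the exact contents of symbols.py vary by version, and the Dutch diacritics listed here are my assumption about what a Dutch corpus needs, so audit your own metadata rather than copying this verbatim.

```python
# Sketch of an extended symbol set for a Dutch synthesizer.
# _pad/_eos and the base character string mirror the usual Tacotron-style
# layout; _dutch_extra is an assumed list of diacritics to add.
_pad = "_"
_eos = "~"
_characters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz!'(),-.:;? "
_dutch_extra = "äëïöüáéíóúè"  # diacritics common in Dutch text (assumed)

symbols = [_pad, _eos] + list(_characters + _dutch_extra)

# Any character not in `symbols` is silently dropped before it reaches the
# model, so scan your transcripts for characters the list is missing:
def missing_symbols(text):
    return sorted(set(text) - set(symbols))
```

Running `missing_symbols` over every transcript line in train.txt is a quick way to catch characters that would otherwise be thrown away during training.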

There's a learning curve to this. Since you are training a new model, it will be very helpful to study the code carefully and figure out what it's doing. If there is something you don't understand, read the papers linked in the README for context.

@quirijnve
Author

Thank you very much!
