How to train models using Multilingual LibriSpeech #879
Comments
In my limited experience with MLS, the audio files were not cut well: they often stopped in the middle of a word or contained extra sounds. This is probably a side effect of the automatic segmentation method used by the dataset authors. The extra sounds caused problems when training the stop token prediction. |
So it's useless as training data? |
It can still be used, but if this is your first model from scratch, find a different dataset. Also don't forget to update symbols.py with characters not found in the English alphabet.
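One way to find out which characters would need to be added is to scan the transcripts for anything outside the current symbol set. The following is only a minimal sketch: `BASE_SYMBOLS` is an assumed stand-in for whatever symbols.py actually contains, and `find_missing_symbols` is a hypothetical helper, not part of the repo.

```python
from collections import Counter

# Assumption: a rough English symbol set. The real contents of symbols.py
# may differ, so replace this with the characters defined there.
BASE_SYMBOLS = set(
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "1234567890!'\"(),-.:;? "
)


def find_missing_symbols(lines, known=BASE_SYMBOLS):
    """Count every character in `lines` that is not in `known`."""
    counts = Counter()
    for line in lines:
        for ch in line.rstrip("\n"):
            if ch not in known:
                counts[ch] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample transcripts, for illustration only.
    sample = ["hola, ¿cómo estás?", "el niño comió"]
    for ch, n in find_missing_symbols(sample).most_common():
        print(repr(ch), n)
```

Running it over all transcript lines of a dataset gives a list of candidate characters to add, most frequent first.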
|
Hello! I have spent all day reading the issues to learn what to do. I want to train on the Spanish datasets tux-100h and Common Voice, which were provided in some past issues about other languages. Over the next few days I will try to figure out how to arrange the datasets so they have the same structure as LibriSpeech. If you have any handy code that could help with generating the .txt files from transcript.txt, it would be very useful, since I will first play around with train-clean-100 to get familiar with the process, as you suggested in another issue. Thanks for all the effort. I found this project yesterday and have seen how much you have contributed to keeping it alive. |
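A conversion like the one asked about here could look roughly like the sketch below. It assumes each line of the transcript file has the form `utterance_id<TAB>text` and writes one `<utterance_id>.txt` file per line; the function name `split_transcripts` and the flat output directory are hypothetical, and the layout would still need to be matched to wherever the audio files live.

```python
from pathlib import Path


def split_transcripts(transcript_file, out_dir):
    """Write one <utterance_id>.txt per line of a tab-separated transcript file.

    Assumption: each non-empty line is "utterance_id<TAB>transcript text".
    Lines without a tab are skipped as malformed.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(transcript_file, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            utt_id, sep, text = line.partition("\t")
            if not sep:
                continue  # no tab found: not an "id<TAB>text" line
            target = out_dir / f"{utt_id}.txt"
            target.write_text(text.strip() + "\n", encoding="utf-8")
            written.append(target)
    return written
```

For example, `split_transcripts("transcript.txt", "out")` would create `out/<utterance_id>.txt` for every well-formed line and return the list of paths it wrote.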
@AlexSteveChungAlvarez https://gist.github.com/blue-fish/11552e89e95f32c14a370935c58f426c |
I have also shared modifications to support audio preprocessing of the compressed .opus files: https://github.com/blue-fish/Real-Time-Voice-Cloning/commit/b4e6c11c429bc6f8cdd86c048cba32413ab4109e |
Thank you very much! I will be using it this weekend.
I've found out that the program itself also accepts .ogg files (the format of most audio messages sent via WhatsApp). |
Ah, perfect timing. I was just deciding between ogg and m4a before turning it into wav, but if ogg works without changes, that's an easy choice. Thanks from over here as well. Everything is running really smoothly. I'm going to start training for a different language now, though, so I hope things stay that way 🤞 Are there any other resources besides the research result page and the data in the repo? This repo seems weirdly underused... |
I spent all of Friday searching for more up-to-date repos that have their own papers. I took a look at Mozilla's, tacotron's, and tacotron2's, among others based on those repos, but for all of them you need a dataset to train the vocoder (or at least, that's what I understood from their documentation and discussions). With the code in this repo you only need one sample to hear a voice very similar to the target voice you want to clone. Which language are you going to train? It would be very helpful if you shared your experience afterwards, since I will start training with Spanish this weekend, and I have seen that many more people wanted to do so but never reported back (if they ever did it). |
I can offer the following observations for MLS Spanish:
|
Closing inactive issue. |
I want to train my own model using the dataset from this website.
How can I adapt it to train my own model?