The model for Chinese speech doesn't work after training for 1700 epochs #7

Open · KevinWang676 opened this issue on Sep 22, 2023 · 3 comments

Comments

KevinWang676 commented Sep 22, 2023

Hi, I trained the model on a Chinese dataset (~13 min of high-quality Chinese speech), and after 1700 epochs I still could not get any output from inference (there was no sound at all). I used chinese_cleaners and followed your instructions, so I wonder which step might have gone wrong. Should I continue training the model for more epochs? Thank you!

KevinWang676 changed the title from "The model doesn't work after training for 1700 epochs" to "The model for Chinese speech doesn't work after training for 1700 epochs" on Sep 22, 2023
FENRlR (Owner) commented Sep 23, 2023

Based on your fork, I found that you have the correct symbols for the cleaner.

However, I also found that your training and validation texts are identical. The validation text should contain fewer samples, and they should not overlap with the training text (e.g. with 10 samples in total, use samples 1-7 for the training text and 8-10 for the validation text). I suggest splitting off 5-10 samples from your training text and pasting them into the validation file.
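For example, a rough sketch of such a split in Python (the filelist paths below are placeholders; use the ones referenced in your config.json):

```python
# Rough sketch: move a handful of lines from the training filelist into a
# separate validation filelist. The paths are placeholders.
import random

with open("filelists/train_filelist.txt", encoding="utf-8") as f:
    lines = [line.rstrip("\n") + "\n" for line in f if line.strip()]

random.seed(0)
random.shuffle(lines)

n_val = 8  # roughly 5-10 validation samples
val_lines, train_lines = lines[:n_val], lines[n_val:]

with open("filelists/train_filelist.txt", "w", encoding="utf-8") as f:
    f.writelines(train_lines)
with open("filelists/val_filelist.txt", "w", encoding="utf-8") as f:
    f.writelines(val_lines)
```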

In addition, you have "eval_interval": 100 and "epochs": 1000 in your config, and your last update was about 13 hours ago. Maybe you meant 1700 steps instead of 1700 epochs?

KevinWang676 (Author) commented Sep 23, 2023

Thanks for your reply! I wonder why identical texts for training and validation would affect the training outcome. Also, do you think I need to adjust your inference.ipynb when inferring Chinese text? Maybe the inference step goes wrong for Chinese text (perhaps it doesn't clean Chinese text correctly, so the vcss function doesn't work). Thank you.

P.S. I did train the model for 1700 epochs, since I changed the config.json when training.

FENRlR (Owner) commented Sep 23, 2023

Indeed, yes. Actually, I forgot to update the notebook version of the inference code.
You can comment out langdetector, like this:
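A rough sketch of what that could look like (the exact lines and variable names in your copy of inference.ipynb may differ):

```python
# Rough sketch -- placeholders, not the exact notebook code.
# Skip automatic language detection and hard-code the language instead:

# from langdetect import detect   # <- comment out the langdetect import
# lang = detect(text)             # <- comment out the detection call
lang = "zh"                        # hard-code Chinese so the right cleaner path is used
```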
