Replies: 3 comments
-
Hi, we train them on proprietary datasets collected by us.
-
Thanks for the info. I have trained your CE model on LibriSpeech with the native PyTorch CTC loss and the NovoGrad optimizer on mel spectrograms; in about 100 GPU hours it reaches roughly 9% WER on test-clean and 22% WER on test-other. That sounds reasonable if your CE models were trained on proprietary datasets.
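For readers curious what that setup looks like in code, here is a minimal sketch of CTC training on mel spectrograms with NovoGrad. The toy model, feature parameters, vocabulary size, and hyperparameters are my own assumptions for illustration, not the recipe used above; NovoGrad is assumed to come from the third-party `torch-optimizer` package, with AdamW as a fallback.

```python
# Hedged sketch: CTC loss on mel spectrograms, roughly as described above.
# Model architecture and hyperparameters are illustrative assumptions only.
import torch
import torch.nn as nn
import torchaudio

# NovoGrad is not in torch.optim; the third-party `torch-optimizer` package provides it.
try:
    from torch_optimizer import NovoGrad
    make_optimizer = lambda params: NovoGrad(params, lr=1e-2, weight_decay=1e-3)
except ImportError:
    make_optimizer = lambda params: torch.optim.AdamW(params, lr=1e-3)  # fallback

N_MELS, N_CLASSES = 64, 29  # 29 = CTC blank + 26 letters + apostrophe + space (assumption)

featurizer = torchaudio.transforms.MelSpectrogram(sample_rate=16_000, n_mels=N_MELS)

class TinyCTCModel(nn.Module):
    """Toy acoustic model: conv front-end + bidirectional GRU + character projection."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(N_MELS, 128, kernel_size=3, stride=2, padding=1)
        self.rnn = nn.GRU(128, 256, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(512, N_CLASSES)

    def forward(self, mels):                # mels: (batch, n_mels, time)
        x = torch.relu(self.conv(mels))     # (batch, 128, time / 2)
        x, _ = self.rnn(x.transpose(1, 2))  # (batch, time / 2, 512)
        return self.fc(x).log_softmax(-1)   # (batch, time / 2, n_classes)

model = TinyCTCModel()
optimizer = make_optimizer(model.parameters())
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)

# One dummy training step with random audio and labels, just to show the data flow.
waveforms = torch.randn(4, 16_000)                      # four one-second clips
targets = torch.randint(1, N_CLASSES, (4, 20))          # random label sequences
target_lengths = torch.full((4,), 20, dtype=torch.long)

log_probs = model(featurizer(waveforms))                # (batch, frames, classes)
input_lengths = torch.full((4,), log_probs.size(1), dtype=torch.long)

# nn.CTCLoss expects (frames, batch, classes)
loss = ctc_loss(log_probs.transpose(0, 1), targets, input_lengths, target_lengths)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```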
-
Note that you can essentially view ML models as compression algorithms (I especially like this chart, adapted from here). You can get better results if you limit the domain (one common trick everybody abuses is training huge domain-limited LMs that are orders of magnitude larger than the text corpora themselves), but this is not the purpose of our project. Our purpose is to pack all the knowledge into one compact generalist model, which is difficult.
-
On which datasets were the released English and German CE models trained?
Thanks