-
This is interesting; normally we find RNNT to be slower but more accurate. Which languages are you training on? If they include Chinese, Japanese, or Korean, the vocabulary size might cause slower learning.
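If it helps, the tokenizer size is quick to inspect in NeMo; a minimal sketch, assuming a saved `.nemo` RNNT checkpoint (the path is hypothetical):

```python
import nemo.collections.asr as nemo_asr

# Path is hypothetical; point this at your trained checkpoint.
model = nemo_asr.models.EncDecRNNTBPEModel.restore_from("conformer_rnnt.nemo")

# TokenizerSpec exposes vocab_size; a very large vocabulary (e.g. one that
# covers CJK characters) enlarges the joint network's output layer and can
# slow convergence.
print("vocab size:", model.tokenizer.vocab_size)
```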
-
So I tried two models, Conformer-CTC and Conformer-RNNT, on the same dataset. Many NVIDIA docs say that the RNNT model is better than CTC, but across all my experiments I am getting almost the same results for both.
Also, training RNNT costs me extra time, and RNNT takes more time in inference than CTC, even though RNNT is designed for streaming tasks.
Attaching some observations:
Dataset-1:
[val_WER curves; purple is RNNT]
[inference time in seconds; purple is RNNT]
Also, one thing I noticed is that the inference-time gap varies with the dataset:
Dataset-2:
[val_WER curves; purple is RNNT]
[inference time in seconds]
Both datasets contain multilingual audio with corresponding transcripts in romanized form.
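For anyone who wants to reproduce the timing comparison, a minimal sketch, assuming public NeMo checkpoints (the model names and wav paths are illustrative, and the file list is passed positionally because the keyword changed from `paths2audio_files` to `audio` across NeMo versions):

```python
import time
import nemo.collections.asr as nemo_asr

# Public English checkpoints used purely for illustration; substitute the
# multilingual models actually being compared.
ctc = nemo_asr.models.ASRModel.from_pretrained("stt_en_conformer_ctc_large")
rnnt = nemo_asr.models.ASRModel.from_pretrained("stt_en_conformer_transducer_large")

audio_files = ["clip1.wav", "clip2.wav"]  # hypothetical test clips

for name, model in [("CTC", ctc), ("RNNT", rnnt)]:
    model.eval()
    model.transcribe(audio_files)  # warm-up pass so GPU init is not timed
    start = time.perf_counter()
    model.transcribe(audio_files)
    print(f"{name}: {time.perf_counter() - start:.2f}s for {len(audio_files)} files")
```

Note that RNNT greedy decoding is autoregressive (the prediction network runs once per emitted token), while CTC decoding is a single pass over the encoder output, so the gap would be expected to grow with transcript length; that could be one reason it varies between the two datasets.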