This page lists research articles that use the NeMo Toolkit. To add your paper to this collection, please submit a PR updating this document.
2023
2021
- Citrinet: Closing the Gap between Non-Autoregressive and Autoregressive End-to-End Models for Automatic Speech Recognition
- SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition
- CarneliNet: Neural Mixture Model for Automatic Speech Recognition
- CTC Variations Through New WFST Topologies
- A Toolbox for Construction and Analysis of Speech Datasets
2020
2019
2022
2021
2022
--------
2021
- TalkNet: Fully-Convolutional Non-Autoregressive Speech Synthesis Model
- TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction
- Hi-Fi Multi-Speaker English TTS Dataset
- Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings