This repo contains the LaTeX source, the figures, and the code used to test the statistical significance of the differences between ASR systems for the following paper:
Mark Hasegawa-Johnson, Preethi Jyothi, Daniel McCloy, Majid Mirbagheri, Giovanni di Liberto, Amit Das, Bradley Ekin, Chunxi Liu, Vimal Manohar, Hao Tang, Edmund C. Lalor, Nancy Chen, Paul Hager, Tyler Kekona, Rose Sloan, and Adrian KC Lee, "ASR for Under-Resourced Languages from Probabilistic Transcription," IEEE/ACM Trans. Audio, Speech, and Language Processing 25(1):46-59, 2017 (Print ISSN: 2329-9290, Online ISSN: 2329-9304, Digital Object Identifier: 10.1109/TASLP.2016.2621659)
The Kaldi source is here: https://github.com/ws15code/SBS-mul
The EEG experiments are here: https://github.com/ws15code/prob-trans/tree/master/EEG
PTgen, the tool that generates probabilistic transcriptions from mismatched crowdsourcing inputs, is here: https://github.com/uiuc-sst/PTgen