diff --git a/egs2/README.md b/egs2/README.md
index c871591d867..185a20b875d 100755
--- a/egs2/README.md
+++ b/egs2/README.md
@@ -16,20 +16,21 @@ See: https://espnet.github.io/espnet/espnet2_tutorial.html#recipes-using-espnet2
 | an4 | CMU AN4 database | ASR/TTS | ENG | http://www.speech.cs.cmu.edu/databases/an4/ | |
 | babel | IARPA Babel corups | ASR | ~20 languages | https://www.iarpa.gov/index.php/research-programs/babel | |
 | bn_openslr53 | Large bengali ASR training dataset | ASR | BEN | https://openslr.org/53/ | |
-| bur_openslr80 | Burmese ASR training dataset | ASR | BUR | https://openslr.org/80/
-| catslu | CATSLU-MAPS | SLU | CMN | https://sites.google.com/view/catslu/home | |
+| bur_openslr80 | Burmese ASR training dataset | ASR | BUR | https://openslr.org/80/ | |
+| catslu | CATSLU-MAPS | SLU | CMN | https://sites.google.com/view/catslu/home | |
 | chime4 | The 4th CHiME Speech Separation and Recognition Challenge | ASR/Multichannel ASR | ENG | http://spandh.dcs.shef.ac.uk/chime_challenge/chime2016/ | |
-| clarity21 | The First Clarity Enhancement Challenge CEC1 | SE | ENG | https://claritychallenge.github.io/clarity_CEC1_doc/ | |
+| clarity21 | The First Clarity Enhancement Challenge CEC1 | SE | ENG | https://claritychallenge.github.io/clarity_CEC1_doc/ | |
 | cmu_indic | CMU INDIC | TTS | 7 languages | http://festvox.org/cmu_indic/ | |
 | commonvoice | The Mozilla Common Voice | ASR | 13 languages | https://voice.mozilla.org/datasets | |
-| conferencingspeech21 | Far-field Multi-channel Speech Enhancement Challenge for Video Conferencing (ConferencingSpeech 2021) | SE | ENG, CMN | https://tea-lab.qq.com/conferencingspeech-2021 | |
+| conferencingspeech21 | Far-field Multi-channel Speech Enhancement Challenge for Video Conferencing (ConferencingSpeech 2021) | SE | ENG, CMN | https://tea-lab.qq.com/conferencingspeech-2021 | |
+| covost2 | Multilingual speech-to-text translation corpus from Common Voice | ST | lang pairs from 22 | https://github.com/facebookresearch/covost | |
 | csj | Corpus of Spontaneous Japanese | ASR | JPN | https://pj.ninjal.ac.jp/corpus_center/csj/en/ | |
 | csmsc | Chinese Standard Mandarin Speech Copus | TTS | CMN | https://www.data-baker.com/open_source.html | |
 | css10 | CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages | TTS | 10 langauges | https://github.com/Kyubyong/css10 | |
 | dirha_wsj | Distant-speech Interaction for Robust Home Applications | Multichannel ASR | ENG | https://dirha.fbk.eu/, https://github.com/SHINE-FBK/DIRHA_English_wsj | |
-| dns_ins20 | Deep Noise Suppression Challenge – INTERSPEECH 2020 | SE | 7 languages + singing | https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-interspeech-2020/ | |
-| dns_icassp21 | Deep Noise Suppression Challenge – ICASSP 2021 | SE | 11 languages + singing| https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-icassp-2021/ | |
-| dns_ins21 | Deep Noise Suppression Challenge – INTERSPEECH 2021 | SE | 11 languages + singing| https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-interspeech-2021/ | |
+| dns_ins20 | Deep Noise Suppression Challenge – INTERSPEECH 2020 | SE | 7 languages +singing | https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-interspeech-2020/ | |
+| dns_icassp21 | Deep Noise Suppression Challenge – ICASSP 2021 | SE | 11 languages + singing| https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-icassp-2021/ | |
+| dns_ins21 | Deep Noise Suppression Challenge – INTERSPEECH 2021 | SE | 11 languages + singing| https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-interspeech-2021/ | |
 | dsing | Automatic Lyric Transcription from Karaoke Vocal Tracks (From DAMP Sing300x30x2) | ASR (ALT) | ENG singing | https://github.com/groadabike/Kaldi-Dsing-task | |
 | fisher_callhome_spanish | Fisher and CALLHOME Spanish--English Speech Translation | ASR/ST | SPA->ENG | https://catalog.ldc.upenn.edu/LDC2014T23 | |
 | fsc | Fluent Speech Commands Dataset | SLU | ENG | https://fluent.ai/fluent-speech-commands-a-dataset-for-spoken-language-understanding-research/ | |
@@ -60,15 +61,15 @@ See: https://espnet.github.io/espnet/espnet2_tutorial.html#recipes-using-espnet2
 | ljspeech | The LJ Speech Dataset | TTS | ENG | https://keithito.com/LJ-Speech-Dataset/ | |
 | lrs3 | The Oxford-BBC Lip Reading Sentences 3 (LRS3) Dataset | ASR | ENG | https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs3.html | |
 | lrs2 | The Oxford-BBC Lip Reading Sentences 2 (LRS2) Dataset | Lipreading/ASR | ENG | https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs2.html | |
-| microsoft_speech | Microsoft Speech Corpus (Indian languages) | ASR | 3 languages | https://msropendata.com/datasets/7230b4b1-912d-400e-be58-f84e0512985e | |
+| microsoft_speech | Microsoft Speech Corpus (Indian languages) | ASR | 3 languages | https://msropendata.com/datasets/7230b4b1-912d-400e-be58-f84e0512985e | |
 | mini_an4 | Mini version of CMU AN4 database for the integration test | ASR/TTS/SE | ENG | http://www.speech.cs.cmu.edu/databases/an4/ | |
 | mini_librispeech | Mini version of Librispeech corpus | DIAR | ENG | https://openslr.org/31/ | |
-| ml_openslr63 | Crowdsourced high-quality Malayalam multi-speaker speech data | ASR | MAL | https://openslr.org/63/ | |
+| ml_openslr63 | Crowdsourced high-quality Malayalam multi-speaker speech data | ASR | MAL | https://openslr.org/63/ | |
 | mls | MLS (A large multilingual corpus derived from LibriVox audiobooks) | ASR | 8 languages | http://www.openslr.org/94/ | |
 | mr_openslr64 | OpenSLR Marathi Corpus | ASR | MAR | http://www.openslr.org/64/ | |
 | ms_indic_is18 | Microsoft Speech Corpus (Indian languages) | ASR | 3 langs: TEL TAM GUJ | https://msropendata.com/datasets/7230b4b1-912d-400e-be58-f84e0512985e | |
 | nsc | National Speech Corpus | ASR | ENG-SG | https://www.imda.gov.sg/programme-listing/digital-services-lab/national-speech-corpus | |
-| open_li52 | Corpus combination with 52 languages(Commonvocie + voxforge) | Multilingual ASR | 52 languages | | |
+| open_li52 | Corpus combination with 52 languages(Commonvocie + voxforge) | Multilingual ASR | 52 languages | | |
 | polyphone_swiss_french | Swiss French Polyphone corpus | ASR | FRA | http://catalog.elra.info/en-us/repository/browse/ELRA-S0030_02 | |
 | primewords_chinese | Primewords Chinese Corpus Set 1 | ASR | CMN | https://www.openslr.org/47/ | |
 | puebla_nahuatl | Highland Puebla Nahuatl corpus (endangered language in central Mexico) | ASR | HPN | https://www.openslr.org/92/ | |
@@ -87,12 +88,12 @@ See: https://espnet.github.io/espnet/espnet2_tutorial.html#recipes-using-espnet2
 | su_openslr36 | Sundanese | ASR | SUN | http://www.openslr.org/36 | |
 | swbd | Switchboard Corpus for 2-channel Conversational Telephone Speech (300h) | ASR | ENG | https://catalog.ldc.upenn.edu/LDC97S62 | |
 | swbd_da | NXT Switchboard Annotations | SLU | ENG | https://catalog.ldc.upenn.edu/LDC2009T26 | |
-| swbd_sentiment | Speech Sentiment Annotations | SLU | ENG | https://catalog.ldc.upenn.edu/LDC2020T14 | |
+| swbd_sentiment | Speech Sentiment Annotations | SLU | ENG | https://catalog.ldc.upenn.edu/LDC2020T14 | |
 | tedlium2 | TED-LIUM corpus release 2 | ASR | ENG | https://www.openslr.org/19/, http://www.lrec-conf.org/proceedings/lrec2014/pdf/1104_Paper.pdf | |
 | thchs30 | A Free Chinese Speech Corpus Released by CSLT@Tsinghua University | TTS | CMN | https://www.openslr.org/18/ | |
 | timit | TIMIT Acoustic-Phonetic Continuous Speech Corpus | ASR | ENG | https://catalog.ldc.upenn.edu/LDC93S1 | |
 | totonac | Highland Totonac corpus (endangered language in central Mexico) | ASR | TOS | http://www.openslr.org/107/ | |
-| tsukuyomi | つくよみちゃんコーパス | TTS | JPN | https://tyc.rei-yumesaki.net/material/corpus | |
+| tsukuyomi | つくよみちゃんコーパス | TTS | JPN | https://tyc.rei-yumesaki.net/material/corpus | |
 | vctk | English Multi-speaker Corpus for CSTR Voice Cloning Toolkit | ASR/TTS | ENG | http://www.udialogue.org/download/cstr-vctk-corpus.html | |
 | vctk_noisyreverb | Noisy reverberant speech database (48kHz) | SE | ENG | https://datashare.ed.ac.uk/handle/10283/2826 | |
 | vivos | VIVOS (Vietnamese corpus for ASR) | ASR | VIE | https://ailab.hcmus.edu.vn/vivos/ | |
diff --git a/egs2/covost2/asr1/asr.sh b/egs2/covost2/asr1/asr.sh
new file mode 120000
index 00000000000..60b05122cfd
--- /dev/null
+++ b/egs2/covost2/asr1/asr.sh
@@ -0,0 +1 @@
+../../TEMPLATE/asr1/asr.sh
\ No newline at end of file
diff --git a/egs2/covost2/asr1/cmd.sh b/egs2/covost2/asr1/cmd.sh
new file mode 100644
index 00000000000..2aae6919fef
--- /dev/null
+++ b/egs2/covost2/asr1/cmd.sh
@@ -0,0 +1,110 @@
+# ====== About run.pl, queue.pl, slurm.pl, and ssh.pl ======
+# Usage: .pl [options] JOB=1:
+# e.g.
+# run.pl --mem 4G JOB=1:10 echo.JOB.log echo JOB
+#
+# Options:
+# --time