This project accompanies the recently released AISHELL-4 dataset for speech enhancement, separation, recognition and speaker diarization in conference scenarios. The project, which serves as the baseline, is divided into five parts: data preparation, front end, ASR, speaker diarization, and evaluation. The Speaker Independent (SI) task evaluates only the front end (FE) and ASR models, while the Speaker Dependent (SD) task evaluates the joint ability of the speaker diarization, front end and ASR models. The goal of this project is to simplify the training and evaluation procedure and to make it easy and flexible for researchers to carry out experiments and verify neural-network-based methods.
```
git clone https://github.com/felixfuyihui/AISHELL-4.git
pip install -r requirements.txt
```
- Data Preparation: Prepare the training and evaluation data.
- Front End: Train and evaluate the front end model.
- ASR: Train and evaluate the ASR model.
- Speaker Diarization: Generate the speaker diarization results.
- Evaluation: Evaluate the results of the models above and generate CERs for the Speaker Independent and Speaker Dependent tasks respectively.
1. Generate training data for the FE and ASR models, and evaluation data for the Speaker Independent task.
2. Run speaker diarization to generate RTTM files, which contain the VAD and speaker diarization information (see the RTTM parsing sketch after this list).
3. Generate evaluation data for the Speaker Dependent task with the results from step 2.
4. Train the FE and ASR models respectively.
5. Generate the FE results on the evaluation data for the Speaker Independent and Speaker Dependent tasks respectively.
6. Generate the ASR results on the evaluation data for the Speaker Independent and Speaker Dependent tasks, using the results from steps 2 and 3, to obtain the No-FE results.
7. Generate the ASR results on the evaluation data for the Speaker Independent and Speaker Dependent tasks, using the results from step 5, to obtain the FE results.
8. Compute CERs for the Speaker Independent and Speaker Dependent tasks, without and with FE, from the results of steps 6 and 7 respectively (see the CER sketch after this list).
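Step 2 above produces RTTM files. As a minimal sketch of how such output can be consumed downstream, the snippet below parses an RTTM file into per-speaker segments; the file name `session1.rttm` and the helper `read_rttm` are illustrative (not part of this repo), and the standard RTTM field layout is assumed.

```python
# Minimal sketch: parse an RTTM file into per-speaker (onset, offset) segments.
# Assumes the standard RTTM line layout:
#   SPEAKER <file-id> <channel> <onset> <duration> <NA> <NA> <speaker> <NA> <NA>
from collections import defaultdict

def read_rttm(path):
    """Return {speaker: [(onset_sec, offset_sec), ...]} read from an RTTM file."""
    segments = defaultdict(list)
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            if not fields or fields[0] != "SPEAKER":
                continue  # skip empty lines and non-speaker entries
            onset, duration = float(fields[3]), float(fields[4])
            segments[fields[7]].append((onset, onset + duration))
    return dict(segments)

if __name__ == "__main__":
    # "session1.rttm" is a placeholder file name.
    for spk, segs in read_rttm("session1.rttm").items():
        print(spk, segs[:3])
```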
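Step 8 scores the recognition outputs with the character error rate (CER). The repo's own scoring script is not reproduced here; the following is a generic sketch of how CER is typically computed, as an edit distance between the reference and hypothesis character sequences normalized by the reference length.

```python
# Generic sketch of CER computation via character-level edit distance.
# Not the scoring script shipped with this repo.
def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,         # deletion
                        dp[j - 1] + 1,     # insertion
                        prev + (r != h))   # substitution (free if characters match)
            prev = cur
    return dp[-1]

def cer(ref, hyp):
    """CER = (substitutions + deletions + insertions) / reference length."""
    ref, hyp = ref.replace(" ", ""), hyp.replace(" ", "")
    return edit_distance(ref, hyp) / max(len(ref), 1)

if __name__ == "__main__":
    print(f"{cer('今天天气很好', '今天天很好'):.3f}")  # one deletion out of 6 chars -> 0.167
```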
If you use this challenge dataset and baseline system in a publication, please cite the following paper:
```
@inproceedings{fu2021aishell,
  title={AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario},
  author={Fu, Yihui and Cheng, Luyao and Lv, Shubo and Jv, Yukai and Kong, Yuxiang and Chen, Zhuo and Hu, Yanxin and Xie, Lei and Wu, Jian and Bu, Hui and Xu, Xin and Du, Jun and Chen, Jingdong},
  booktitle={Interspeech 2021, Brno, Czech Republic, Aug 30 -- Sept 3, 2021},
  year={2021}
}
```
The paper is available at https://arxiv.org/abs/2104.03603
The dataset is available at http://www.openslr.org/111/ and http://www.aishelltech.com/aishell_4