Text classification with speech recognition in a unified pipeline
- We used Wav2Vec for speech recognition. To learn how to fine-tune wav2vec, please see here (for Korean, see here).
- We used Electra (specifically KoElectra, since we worked on a Korean dataset) for text classification. To learn how to fine-tune electra, please see here (for Korean).
pip install -r requirements.txt
The command below assumes you already have both a wav2vec checkpoint and an electra checkpoint.
python main.py [-h] [--wav_dir WAV_DIR] [--stt_output_path STT_OUTPUT_PATH] [--output_path OUTPUT_PATH] [--wav2vec_checkpoint WAV2VEC_CHECKPOINT] [--electra_checkpoint ELECTRA_CHECKPOINT]
optional arguments:
-h, --help show this help message and exit
--wav_dir WAV_DIR
--stt_output_path STT_OUTPUT_PATH
--output_path OUTPUT_PATH
--wav2vec_checkpoint WAV2VEC_CHECKPOINT
--electra_checkpoint ELECTRA_CHECKPOINT
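For readers who want to see how such a command line could be declared, here is a minimal sketch using Python's standard argparse module. Only the five flag names come from the usage text above. The function name build_parser, the help strings, and the example paths are assumptions for illustration, and the actual main.py may define its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed in the usage text."""
    parser = argparse.ArgumentParser(prog="main.py")
    # Help strings are assumptions; the original help output shows none.
    parser.add_argument("--wav_dir", help="directory with input wav files (assumed)")
    parser.add_argument("--stt_output_path", help="where STT results are written")
    parser.add_argument("--output_path", help="where final predictions are written")
    parser.add_argument("--wav2vec_checkpoint", help="wav2vec checkpoint (assumed path)")
    parser.add_argument("--electra_checkpoint", help="electra checkpoint (assumed path)")
    return parser


# Parse an explicit list so the example does not depend on sys.argv.
# The paths are placeholder values, not files shipped with the project.
args = build_parser().parse_args(["--wav_dir", "data/wav", "--output_path", "pred.txt"])
print(args.wav_dir, args.output_path)
```

Because every flag is optional in the usage text, the sketch gives none of them a default, so unspecified flags parse to None.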
The STT results will be written to --stt_output_path. The final predicted output will be written to --output_path.
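The two-stage flow described here (speech to text, then text to label) can be sketched as below. This is not the project's implementation: run_pipeline, the transcribe and classify callables, and the tab-separated output layout are all hypothetical, chosen only to show how the two output files relate. In practice the callables would wrap the wav2vec and electra models.

```python
from pathlib import Path
from typing import Callable, List, Tuple


def run_pipeline(
    wav_dir,
    stt_output_path,
    output_path,
    transcribe: Callable[[Path], str],  # hypothetical stand-in for the wav2vec model
    classify: Callable[[str], str],     # hypothetical stand-in for the electra model
) -> List[Tuple[str, str, str]]:
    """Transcribe every wav file, then classify each transcript."""
    wav_files = sorted(Path(wav_dir).glob("*.wav"))

    # Stage 1: speech recognition; intermediate results go to stt_output_path.
    transcripts = [(wav.name, transcribe(wav)) for wav in wav_files]
    with open(stt_output_path, "w", encoding="utf-8") as stt_file:
        for name, text in transcripts:
            stt_file.write(f"{name}\t{text}\n")  # assumed layout: name<TAB>text

    # Stage 2: text classification; final predictions go to output_path.
    results = [(name, text, classify(text)) for name, text in transcripts]
    with open(output_path, "w", encoding="utf-8") as out_file:
        for name, _, label in results:
            out_file.write(f"{name}\t{label}\n")  # assumed layout: name<TAB>label
    return results
```

Passing the models in as plain callables keeps the sketch independent of any particular model library, and makes the file handling easy to check with stub functions.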