CS-F429-Disfluency-Analysis

This project investigates various Natural Language Processing techniques for the task of disfluency detection.
Our approach frames the task as a modified Named Entity Recognition problem and uses fine-tuned BERT as well as Bi-LSTM-based neural networks to solve it.
The experiments were performed on modified versions of the Disfl-QA corpus and the Switchboard corpus, annotated as required for this formulation.
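
To illustrate what treating disfluency detection as an NER-style tagging problem can look like, here is a minimal sketch in plain Python. The `B-DISF`/`I-DISF`/`O` label names, the helper functions, and the example sentence are assumptions made for illustration; they are not taken from the notebooks in this repository, which may use a different annotation scheme.

```python
from typing import List, Tuple


def tag_disfluencies(tokens: List[str], spans: List[Tuple[int, int]]) -> List[str]:
    """Assign BIO-style labels to tokens.

    `spans` holds half-open [start, end) token index ranges that mark
    disfluent tokens. The label names are hypothetical.
    """
    labels = ["O"] * len(tokens)
    for start, end in spans:
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"invalid span: ({start}, {end})")
        labels[start] = "B-DISF"          # first token of the disfluent span
        for i in range(start + 1, end):
            labels[i] = "I-DISF"          # remaining tokens of the span
    return labels


def remove_disfluencies(tokens: List[str], labels: List[str]) -> List[str]:
    """Keep only the tokens labelled as fluent ("O")."""
    return [tok for tok, lab in zip(tokens, labels) if lab == "O"]


# Made-up example: the speaker corrects "the capital of" to "the largest city in".
tokens = "what is the capital of no wait the largest city in France".split()
labels = tag_disfluencies(tokens, [(2, 7)])
fluent = " ".join(remove_disfluencies(tokens, labels))
```

In this framing, a sequence tagger such as a fine-tuned BERT or Bi-LSTM model would be trained to predict one label per token, and the fluent sentence is recovered by dropping every token not labelled `O`.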

Disfl-QA dataset obtained from: Gupta, A., Xu, J., Upadhyay, S., Yang, D., & Faruqui, M. (2021). Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering. Findings of ACL. https://doi.org/10.18653/v1/2021.findings-acl.293

Link to GitHub: https://github.com/google-research-datasets/Disfl-QA

Switchboard Corpus obtained from: Godfrey, John J., and Edward Holliman. Switchboard-1 Release 2 LDC97S62. Web Download. Philadelphia: Linguistic Data Consortium, 1993.

Note: the Data folder in this repository provides only sample data.

