CS-F429-Disfluency-Analysis

  • This project investigates Natural Language Processing techniques for the task of disfluency detection.
  • We frame the task as a modified Named Entity Recognition problem, that is, as sequence labelling over tokens, and address it with a fine-tuned BERT model as well as Bi-LSTM-based neural networks.
  • The experiments were performed on modified versions of the Disfl-QA corpus and the Switchboard corpus, annotated as required for this task.
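
To illustrate the NER-style framing described above, here is a minimal sketch of how a disfluent utterance could be turned into per-token labels. The tag names (`B-DIS`, `I-DIS`, `O`), the helper functions, and the example sentence are assumptions made for illustration; they are not taken from the repository's code or from either corpus.

```python
from typing import List, Tuple


def bio_tags(tokens: List[str], spans: List[Tuple[int, int]]) -> List[str]:
    """Label each token with a BIO tag, given disfluent spans.

    Each span is (start, end) in token indices, with `end` exclusive.
    The tag set B-DIS / I-DIS / O is a hypothetical choice for this sketch.
    """
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = "B-DIS"          # first token of the disfluent span
        for i in range(start + 1, end):
            tags[i] = "I-DIS"          # remaining tokens inside the span
    return tags


def remove_disfluencies(tokens: List[str], tags: List[str]) -> List[str]:
    """Keep only tokens tagged O, which yields the fluent utterance."""
    return [tok for tok, tag in zip(tokens, tags) if tag == "O"]


# Illustrative utterance: "to boston" is repaired by "to denver",
# with "uh i mean" as the filler in between.
tokens = "i want a flight to boston uh i mean to denver".split()
tags = bio_tags(tokens, [(4, 9)])
fluent = " ".join(remove_disfluencies(tokens, tags))
```

A tagger such as a fine-tuned BERT or a Bi-LSTM would be trained to predict `tags` from `tokens`; dropping the tokens it marks as disfluent then recovers the fluent text.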

Disfl-QA dataset obtained from: Gupta, A., Xu, J., Upadhyay, S., Yang, D., & Faruqui, M. (2021). Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering. Findings of ACL. https://doi.org/10.18653/v1/2021.findings-acl.293
GitHub repository: https://github.com/google-research-datasets/Disfl-QA

Switchboard Corpus obtained from: Godfrey, John J., and Edward Holliman. Switchboard-1 Release 2 LDC97S62. Web Download. Philadelphia: Linguistic Data Consortium, 1993.

Note: the data section of this repository contains only sample data.

About

NLP course project