Sequence labeling is useful in many NLP applications, including conversational assistants like Alexa. I built two sequence label taggers using Conditional Random Fields and trained them on the Switchboard DAMSL data. The two taggers differ only in their feature sets: the advanced tagger uses an improved one. Adding bigrams of the text as features raised accuracy from 71% to 78%.
- Python 3
- Python CRFSuite
- Some knowledge of Conditional Random Fields
- Some research on Feature Extraction
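To make the bigram idea concrete, here is a minimal sketch of how token and bigram features for one utterance might be built. It is not the code in baseline_tagger.py or advanced_tagger.py: the function name `utterance_features`, the `use_bigrams` flag, and the `TOKEN_`/`BIGRAM_` prefixes are assumptions made for illustration, and the real taggers may use additional features.

```python
def utterance_features(tokens, use_bigrams=False):
    """Return a list of string features for one utterance.

    Hypothetical helper: the feature names and prefixes are illustrative,
    not the ones used by the taggers in this repository.
    """
    lowered = [t.lower() for t in tokens]

    # Unigram features: one per token.
    features = ["TOKEN_" + t for t in lowered]

    if use_bigrams:
        # Bigram features: one per pair of adjacent tokens.
        features += [
            "BIGRAM_" + first + "_" + second
            for first, second in zip(lowered, lowered[1:])
        ]
    return features


baseline = utterance_features(["Uh", "huh", "yeah"])
advanced = utterance_features(["Uh", "huh", "yeah"], use_bigrams=True)
print(baseline)
print(advanced)
```

Each utterance becomes a flat list of string features, so turning the bigram features on or off does not change the rest of the pipeline.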
To get a local copy up and running, follow these steps.
- Install Python. Detailed instructions for installation can be found here.
- Mac users with Homebrew can run the following command in their terminal.
brew install python3
- If you're running conda, you most likely already have Python installed. Check the version to make sure it is Python 3.
python --version
- Install pycrfsuite from here
- Clone this project using the following command.
git clone https://github.com/Narasimhag/SequenceLabelingWithCRF
- Download the data from the DAMSL link above and extract it to the project directory, created after running the clone command, typically named 'SequenceLabelingWithCRF'.
- Divide the data into training and testing sets.
- Run the baseline_tagger.py as follows.
python3 path/to/baseline_tagger.py /path/to/training/data /path/to/output/data /path/to/outputfile
- Run the advanced_tagger.py as follows.
python3 path/to/advanced_tagger.py /path/to/training/data /path/to/output/data /path/to/outputfile
- A small accuracy check prints the accuracies to the terminal. You should see that advanced_tagger outperforms baseline_tagger.
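The data-splitting and accuracy-check steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's code: the helper names `split_train_test` and `accuracy`, the 25% test fraction, the fixed seed, and the `dialog_N.csv` file names are all hypothetical.

```python
import random


def split_train_test(items, test_fraction=0.25, seed=0):
    """Shuffle items reproducibly and split them into (train, test) lists.

    Hypothetical helper; the 25% test fraction is an assumed default.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predicted, gold):
    """Return the fraction of predicted labels that match the gold labels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical file names standing in for the extracted data files.
files = [f"dialog_{i}.csv" for i in range(8)]
train_files, test_files = split_train_test(files)
print(len(train_files), "training files,", len(test_files), "test files")

# Toy labels, only to show how the accuracy check is called.
print(accuracy(["sd", "b", "sv"], ["sd", "b", "qy"]))
```

Splitting by file rather than by individual line keeps each dialogue intact, which matters for a sequence model that relies on the order of utterances.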
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
If you have some criticism or want to say some nice things about the project, please feel free to tweet me at @raogundavarapu or email me at raonarasimha050@gmail.com.