This repository collects my 30 days of NLP work.
Day 1 - 5:
Worked on the following Kaggle challenge:
https://www.kaggle.com/c/crowdflower-search-relevance/
Everything is explained in detail in this blog post:
https://gautigadu091.medium.com/an-end-to-end-case-study-on-crowd-flower-search-results-relevance-7229243f4d12
All the preprocessed data can be downloaded from this link
Notebook descriptions:
- EDA.ipynb -- Exploratory data analysis
- Text preprocessing.ipynb -- Text cleaning, preprocessing, and vectorization
- Performance_metrics.ipynb -- Why the kappa score is a suitable metric, and how to implement it manually in Python
- Feature Extraction -- Advanced count feature extraction from the text.
- Modelling
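
As a rough illustration of the manual kappa implementation mentioned for Performance_metrics.ipynb, here is a minimal sketch of a quadratic weighted kappa, the variant this Kaggle competition uses for scoring. It is not the code from the notebook: the function name `quadratic_weighted_kappa`, its parameters, and the 1 to 4 rating range in the usage note are assumptions made for the example.

```python
import numpy as np


def quadratic_weighted_kappa(y_true, y_pred, min_rating, max_rating):
    """Quadratic weighted kappa between two integer rating vectors.

    Hypothetical helper for illustration; ratings are assumed to be
    integers in the inclusive range [min_rating, max_rating].
    """
    if max_rating <= min_rating:
        raise ValueError("max_rating must be greater than min_rating")

    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    n = max_rating - min_rating + 1

    # Observed agreement: confusion matrix of true vs. predicted ratings.
    observed = np.zeros((n, n))
    for t, p in zip(y_true, y_pred):
        observed[t - min_rating, p - min_rating] += 1

    # Expected agreement under independence: outer product of the two
    # rating histograms, scaled to the same total count as `observed`.
    hist_true = observed.sum(axis=1)
    hist_pred = observed.sum(axis=0)
    expected = np.outer(hist_true, hist_pred) / len(y_true)

    # Quadratic penalty: grows with the squared distance between ratings.
    idx = np.arange(n)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (n - 1) ** 2

    denominator = (weights * expected).sum()
    if denominator == 0:
        # All ratings fall in a single class, so there is no disagreement.
        return 1.0
    return 1.0 - (weights * observed).sum() / denominator
```

For example, `quadratic_weighted_kappa([1, 2, 3, 4], [1, 2, 3, 4], 1, 4)` gives 1.0 (perfect agreement), while fully reversed predictions `[4, 3, 2, 1]` give -1.0. Unlike plain accuracy, a prediction that is off by one rating is penalized less than one that is off by three, which is why this kind of metric fits ordinal relevance labels.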