30daysofNLP

This repository collects my work from 30 days of NLP.

Days 1-5:
Worked on the Kaggle Crowdflower Search Results Relevance challenge:
https://www.kaggle.com/c/crowdflower-search-relevance/

Everything is explained in detail in this blog post: https://gautigadu091.medium.com/an-end-to-end-case-study-on-crowd-flower-search-results-relevance-7229243f4d12
All the preprocessed data can be downloaded from this link

Notebooks description:

  1. EDA.ipynb -- Exploratory data analysis
  2. Text preprocessing.ipynb -- Text cleaning, preprocessing, and vectorization
  3. Performance_metrics.ipynb -- Why the kappa score is a suitable metric, and a manual implementation of the kappa score in Python
  4. Feature Extraction -- Advanced count-based feature extraction from the text
  5. Modelling
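
To give a rough idea of what a manual kappa implementation can look like, here is a minimal sketch of the quadratic weighted kappa, the metric this Kaggle competition is scored on. It is not the code from Performance_metrics.ipynb. The function name `quadratic_weighted_kappa` and its parameters are my own choices for illustration, and the 1 to 4 rating range in the usage lines is an assumption about the relevance labels.

```python
import numpy as np


def quadratic_weighted_kappa(y_true, y_pred, min_rating, max_rating):
    """Quadratic weighted kappa between two sequences of integer ratings.

    Hypothetical helper for illustration; ratings must lie in
    [min_rating, max_rating].
    """
    if max_rating <= min_rating:
        raise ValueError("max_rating must be greater than min_rating")

    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    n = max_rating - min_rating + 1

    # Observed matrix: observed[i, j] counts items with true rating i
    # and predicted rating j (both shifted to start at 0).
    observed = np.zeros((n, n), dtype=float)
    for t, p in zip(y_true, y_pred):
        observed[t - min_rating, p - min_rating] += 1

    # Expected matrix if the two ratings were independent: outer product
    # of the marginal histograms, scaled to the same total count.
    hist_true = observed.sum(axis=1)
    hist_pred = observed.sum(axis=0)
    expected = np.outer(hist_true, hist_pred) / observed.sum()

    # Quadratic weights: the penalty grows with the squared distance
    # between the two ratings.
    idx = np.arange(n)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (n - 1) ** 2

    denominator = (weights * expected).sum()
    if denominator == 0:
        # Both sequences use one identical rating everywhere.
        return 1.0
    return 1.0 - (weights * observed).sum() / denominator


# Usage with an assumed 1-4 rating scale.
perfect = quadratic_weighted_kappa([1, 2, 3, 4], [1, 2, 3, 4], 1, 4)
reversed_order = quadratic_weighted_kappa([1, 2, 3, 4], [4, 3, 2, 1], 1, 4)
print(perfect, reversed_order)
```

Unlike plain accuracy, this score penalises a prediction more the further it lands from the true rating, which is why it fits ordinal labels such as relevance grades.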