All data comes from this Kaggle competition. Download the files and extract them into the data/ directory. Do this from inside the setup.ipynb notebook.
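
The extraction step could look roughly like the sketch below. It assumes the competition files were downloaded as one or more `.zip` archives into a local folder; the function name `extract_archives` and the `downloads` folder are hypothetical and not taken from setup.ipynb.

```python
import zipfile
from pathlib import Path


def extract_archives(download_dir, data_dir="data"):
    """Extract every .zip found in download_dir into data_dir.

    Returns the names of all extracted members. Both directory names are
    assumptions for illustration; adjust them to match your layout.
    """
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)  # create data/ if it is missing
    extracted = []
    for archive in sorted(Path(download_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
            extracted.extend(zf.namelist())
    return extracted
```

In a notebook cell this would be called as `extract_archives("downloads")`, after which the CSV files should sit directly under data/.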
Start with the eda.ipynb notebook. Besides producing some EDA charts, its main purpose is to combine Kaggle's train and test data into a single dataset, which is split into train/test sets again later.
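
A minimal sketch of the combine-then-resplit idea follows. It uses only the standard library to stay dependency-free, whereas the notebook itself may well use a dataframe library. The column names `text` and `label`, the helper names, and the 20% test fraction are all assumptions made for illustration.

```python
import csv
import io
import random


def read_rows(handle, source):
    """Read CSV rows as dicts and tag each row with the file it came from."""
    return [{**row, "source": source} for row in csv.DictReader(handle)]


def combine_and_split(train_rows, test_rows, test_fraction=0.2, seed=0):
    """Pool both row lists, shuffle reproducibly, and split again.

    Returns (new_train, new_test). The input lists are left unmodified.
    """
    combined = train_rows + test_rows  # new list, so shuffling is safe
    random.Random(seed).shuffle(combined)
    n_test = int(len(combined) * test_fraction)
    return combined[n_test:], combined[:n_test]


# Tiny in-memory stand-ins for the two Kaggle files (hypothetical contents).
kaggle_train = read_rows(io.StringIO("text,label\ngood,1\nbad,0\nfine,1\n"), "train")
kaggle_test = read_rows(io.StringIO("text,label\nawful,0\ngreat,1\n"), "test")
new_train, new_test = combine_and_split(kaggle_train, kaggle_test, test_fraction=0.4)
```

Keeping a `source` tag on each row makes it possible to check later how the original Kaggle partitions ended up distributed across the new split.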
Next is the prep.ipynb notebook, which cleans and encodes the text for input to the model.
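
As a rough illustration of what "clean and encode" can mean, the sketch below lowercases the text, strips punctuation, builds a word-to-integer vocabulary, and pads each sequence to a fixed length. This is one common approach, not necessarily what prep.ipynb does; the function names, the `<pad>`/`<unk>` tokens, and `max_len` are assumptions.

```python
import re
from collections import Counter

PAD, UNK = 0, 1  # reserved ids for padding and out-of-vocabulary words


def clean(text):
    """Lowercase, replace non-alphanumeric characters, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def build_vocab(texts, min_count=1):
    """Map each sufficiently frequent word to an integer id, most common first."""
    counts = Counter(tok for t in texts for tok in clean(t).split())
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for tok, n in counts.most_common():
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab, max_len=8):
    """Convert text to a fixed-length list of ids, truncating or padding."""
    ids = [vocab.get(tok, UNK) for tok in clean(text).split()][:max_len]
    return ids + [PAD] * (max_len - len(ids))
```

The vocabulary should be built from the training split only, so that words seen exclusively in the test split map to the unknown id, as they would for genuinely new data.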
Then use train.ipynb to train the model.