While there are several ways to measure the linguistic complexity of a text, I've yet to see a program that identifies the minimum age a reader should be for a particular book. This project aims to fill that gap.
Data Collection - Scraping Common Sense Media Book Reviews
- 0: Getting Search Pages
- 1: Getting Links to the Book Titles
- 2: Collect Every Book Review
- 3: Extract Details from Each Review
XGBoost - aka "King of Kaggle"
- 9: Baseline LSTM
- 10: LSTM Version II
- 11: LSTM Version III
- 12: LSTM Version IV
- 13: LSTM Version V
- 14: LSTM Version VI
Model | Train MAE | Test MAE |
---|---|---|
Naive Baseline | n/a | 3.27 |
Computer Vision | 2.69 | 2.50 |
LSTM Version II | 1.35 | 1.65 |
XGBoost Version III | 1.50 | 1.62 |
LSTM Version VI | 0.66 | 1.04 |
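
For readers unfamiliar with the metric in the table, here is a minimal sketch of how a mean absolute error (MAE) and a constant-prediction baseline can be computed. It assumes the naive baseline predicts the mean training age for every book; the project may have defined its baseline differently. The function names and the toy ages are hypothetical and are not taken from the project's data.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def naive_baseline_mae(train_ages, test_ages):
    """Test MAE when every prediction is the mean training age (an assumption)."""
    constant = mean(train_ages)
    return mean_absolute_error(test_ages, [constant] * len(test_ages))


# Hypothetical toy ages, for illustration only.
train_ages = [4, 8, 10, 14]
test_ages = [6, 12]
baseline = naive_baseline_mae(train_ages, test_ages)
print(f"Naive baseline test MAE: {baseline}")
```

Read this way, a lower MAE means the predicted minimum age is closer, on average, to the labeled age. A model is only useful if its test MAE beats the constant-prediction baseline.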
Finally, to watch a presentation of this project, please click below: