For this project at Metis, I used supervised learning (classification), Amazon Web Services, PostgreSQL, Flask, and HTML/CSS to create a web app that consolidates housing data for buildings across New York City. Users can look up complaints about a building and its surrounding area, local crime rates, and the building's 'Sketchy Landlord' index, a score produced by a classification model trained on cases where tenants ended up suing their landlords.
In this repo, I've uploaded my code, the data I used, and the presentation I gave at Metis on this project.
The blog post is currently a work in progress.
- All data is available on NYC Open Data. The datasets used in this project are up to date as of October 17, 2018.
- data_cleaning.py - cleaning of NYC Open Data and PLUTO data
- all performed on AWS
- mvp_maddy_obrien_jones.ipynb - minimum viable product
- baseline.ipynb - baseline model built on DOB complaints only
- model_testing.ipynb - feature engineering and model testing
- final_modeling.ipynb - building XGBoost model and scoring on data
- flaskapp.py - Flask app to look up building information
- page.html - first page of Flask app (search bar)
- template2.html - second page of Flask app (building information)
- xgb.pkl - pickled model to predict risk of litigation
- predicting_housing_litigation.pdf - slides from the presentation given at Metis
- screen_recording.mov - demonstration of Flask app
*work in progress
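
As a rough illustration of how a pickled classifier such as xgb.pkl might be loaded and used to score a single building, here is a minimal sketch. It assumes the model exposes a scikit-learn style `predict_proba` method. `StandInModel`, its two features (complaint and violation counts), and its weights are hypothetical placeholders so the example runs without the real file; the actual features and scoring logic live in final_modeling.ipynb and flaskapp.py and may differ.

```python
import math
import pickle


class StandInModel:
    """Hypothetical placeholder for the pickled classifier (not the real model)."""

    def predict_proba(self, rows):
        # Toy logistic score on (complaints, violations); weights are made up.
        out = []
        for complaints, violations in rows:
            z = 0.3 * complaints + 0.5 * violations - 3.0
            p = 1.0 / (1.0 + math.exp(-z))
            out.append([1.0 - p, p])
        return out


def load_model(path):
    """Unpickle a trained model from disk, e.g. load_model('xgb.pkl')."""
    with open(path, "rb") as f:
        return pickle.load(f)


def litigation_risk(model, features):
    """Return the predicted probability of the positive (litigation) class."""
    return model.predict_proba([features])[0][1]


# In the repo this would be: model = load_model("xgb.pkl")
model = StandInModel()
risk = litigation_risk(model, (4, 2))  # hypothetical: 4 complaints, 2 violations
print(f"Estimated litigation risk: {risk:.2f}")
```

Only unpickle files from a source you trust, since `pickle.load` can execute arbitrary code.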