This is a group project that scrapes BMW listings from Carvana.com and analyzes the resulting dataset to predict prices. My contributions to the Python code were the large "Big Scrape" code chunk, the variable count check, and the test block for individual features. My contributions to the R code were missing-value imputation using Hmisc and random forest modeling.
The goal is to scrape multiple pages of listings into one centralized SQL table. The table contains one listing per row, including the model, year, physical features, mileage, number of imperfections, likes, saves, and other features.
Make: BMW
Scrape date range: mid-March 2021
ZIP code: 94103
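
As a rough illustration of the one-listing-per-row table described above, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, the `price` column, and the `store_listings` helper are all assumptions made for this example. The project's actual schema and database engine may differ, and the sketch covers only storage, not the scraping itself.

```python
import sqlite3

# Hypothetical schema: the column names are illustrative assumptions,
# chosen to mirror the kinds of fields the project description lists.
SCHEMA = """
CREATE TABLE IF NOT EXISTS listings (
    listing_id    INTEGER PRIMARY KEY,
    model         TEXT,
    year          INTEGER,
    mileage       INTEGER,
    imperfections INTEGER,
    likes         INTEGER,
    saves         INTEGER,
    price         INTEGER
)
"""

COLUMNS = ("model", "year", "mileage", "imperfections", "likes", "saves", "price")


def store_listings(conn, pages):
    """Flatten scraped pages (lists of dicts) into one row per listing.

    Fields missing from a listing are stored as NULL.
    Returns the number of rows inserted.
    """
    conn.execute(SCHEMA)
    rows = [
        tuple(listing.get(col) for col in COLUMNS)
        for page in pages
        for listing in page
    ]
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.executemany(
        f"INSERT INTO listings ({', '.join(COLUMNS)}) VALUES ({placeholders})",
        rows,
    )
    conn.commit()
    return len(rows)
```

Storing missing fields as NULL rather than dropping the listing keeps every scraped row available for a later imputation step.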