I recently wrote my first web scraper using Python and the tutorial "Beautiful Soup: Build a Web Scraper With Python" by Martin Breuss. This web scraper gathers data from The Grad Cafe, a popular online forum where students self-report admissions decisions for graduate programs. For instance, a student may report the university they applied to, whether they were accepted, their grades, and their test scores. The Grad Cafe is a rich data source because students report admissions decisions as they receive them, providing an early view into graduate admissions trends. In this short project, I hope to answer the following questions:
- Is The Grad Cafe a reliable data source? In other words, does it accurately capture admissions statistics?
- If so, can we use Grad Cafe data to accurately classify who is accepted into economics PhD programs and whether they receive favorable funding?