DishT/Machine_Learning_City


User Income Level Classification Using Twitter Data

Contributors: Baiyue Cao, Chun-Chieh Tsai, Isha Chaturvedi

Real-time census data has the potential to generate timely insights for urban policy makers, allowing them to capture important urban issues such as population displacement and neighborhood change. This study builds on the 2015 paper “Studying user income through language, behavior and affect in social media” by Preotiuc-Pietro et al. and shows how Twitter data can be used to predict user income level from the top 20 features selected by a random forest. We trained a Gaussian Process, a Support Vector Machine, and a Random Forest model for prediction; the best results were 0.42 for 10-class income level prediction and 0.88 for 3-class income level prediction. In conclusion, this paper shows that Twitter user income level can be predicted from relatively few features, and it provides a road map for policy makers to use Twitter data to generate real-time insights. [Keywords: Twitter, natural language processing, income prediction]
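The workflow described above (rank features with a random forest, keep the top 20, then train the three classifiers on them) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the code from this repository. The data is synthetic, the hyperparameters and the train/test split are assumptions, and accuracy is used as the score only for illustration, since the summary above does not name the metric behind the reported figures.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for per-user Twitter features and 3 income classes
# (assumption: the real feature matrix and labels are not reproduced here).
X, y = make_classification(
    n_samples=300, n_features=60, n_informative=10,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Rank features by random forest importance and keep the 20 highest.
ranker = RandomForestClassifier(n_estimators=200, random_state=0)
ranker.fit(X_train, y_train)
top20 = np.argsort(ranker.feature_importances_)[::-1][:20]

# Train the three model families on the selected features only.
# Hyperparameters are illustrative defaults, not tuned values.
models = {
    "gaussian_process": GaussianProcessClassifier(random_state=0),
    "svm": SVC(kernel="rbf", C=1.0),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train[:, top20], y_train)
    scores[name] = model.score(X_test[:, top20], y_test)  # mean accuracy
    print(f"{name}: accuracy = {scores[name]:.2f}")
```

In this sketch the ranking is fit on the training split only, so the held-out scores are not influenced by the feature selection step. Setting `n_classes=10` in the synthetic data gives the analogous 10-class setup.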
