Traversing the Trump Twitterverse: A social network analysis.

Repo and blog post are works in progress.
Inspired by Tim Martin's "Promoting Positive Climate Change Conversations via Twitter."

Question: Given a large sample of tweets referencing "@realDonaldTrump," can we build a network graph in which distinct communities, moderators, and influencers are identifiable?
Data Collection: Queried the Twitter streaming API for tweets mentioning @realDonaldTrump and stored them in MongoDB.
Data Analysis: PySpark, Spark SQL, NetworkX, the python-louvain (community) package, and LDA topic modeling.
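
As a rough illustration of the step between collection and analysis, the sketch below turns tweet records into a weighted, directed mention edge list using only the standard library. It is not the code used in this repo: the `user` and `text` field names, the `mention_edges` helper, and the sample tweets are hypothetical, and real Twitter payloads nest this information differently.

```python
import re
from collections import Counter

# Twitter handles are at most 15 word characters long.
MENTION_RE = re.compile(r"@(\w{1,15})")


def mention_edges(tweets):
    """Count directed (author -> mentioned user) edges across tweets.

    Each tweet is assumed to be a dict with hypothetical "user" and
    "text" keys; handles are lower-cased so casing variants merge.
    """
    edges = Counter()
    for tweet in tweets:
        author = tweet["user"].lower()
        for name in MENTION_RE.findall(tweet["text"]):
            target = name.lower()
            if target != author:  # ignore self-mentions
                edges[(author, target)] += 1
    return edges


# Made-up sample records, for illustration only.
sample = [
    {"user": "alice", "text": "@realDonaldTrump see @CNN coverage"},
    {"user": "bob", "text": "Agree with @realDonaldTrump"},
    {"user": "alice", "text": "@realDonaldTrump again"},
]
edges = mention_edges(sample)
```

An edge list of this shape could then be loaded into a graph library such as NetworkX, with the counts as edge weights, before running community detection.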

Conclusions:

  • Communities separate fairly cleanly into pro-Donald Trump and anti-Donald Trump groups, especially when Trump and his connections to other users are removed from the graph.
  • Influencers: news organizations
  • Moderators: also news organizations... and Ted Lieu?
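
The idea of removing Trump's node and then looking for influential accounts can be sketched as follows. This is a minimal example under stated assumptions, not the analysis behind the conclusions above: weighted in-degree is used here as an assumed stand-in for "influence," and the `drop_node` and `top_by_in_degree` helpers and the edge weights are hypothetical.

```python
from collections import Counter


def drop_node(edges, node):
    """Return a copy of the edge map without any edge touching `node`."""
    return {(s, t): w for (s, t), w in edges.items() if node not in (s, t)}


def top_by_in_degree(edges, k=3):
    """Rank accounts by weighted in-degree (total mentions received).

    In-degree is only an assumed proxy for influence in this sketch.
    """
    in_degree = Counter()
    for (_, target), weight in edges.items():
        in_degree[target] += weight
    return in_degree.most_common(k)


# Made-up weighted mention edges: (author, mentioned user) -> count.
edges = {
    ("alice", "realdonaldtrump"): 5,
    ("bob", "realdonaldtrump"): 3,
    ("alice", "cnn"): 2,
    ("bob", "cnn"): 1,
    ("carol", "foxnews"): 2,
}

pruned = drop_node(edges, "realdonaldtrump")
ranking = top_by_in_degree(pruned)
```

Dropping the hub first keeps a single dominant node from tying every user together, so the remaining structure is driven by who else each user mentions.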

Visualizations

Machines used:

  • MacBook Pro 13" (2015)
  • Amazon c5.9xlarge EC2 instance (36 cores, 68 GB RAM) with Spark (single node)

March - April 2018