raadbintareaf/Dataset_Top20Users_Tweets

Dataset_Top20Users_Tweets

For a better reading experience, please consider using the Raw view.

-This dataset was gathered by crawling Twitter's REST API with the Python library Tweepy 3. It contains the tweets of the 20 most popular Twitter users (those with the most followers); retweets are excluded. The accounts belong to public figures such as Katy Perry and Barack Obama, platforms such as YouTube and Instagram, and television channels and shows such as CNN Breaking News and The Ellen Show.

-Consequently, the dataset contains a mix of relatively structured tweets written in a formal, informative manner and completely unstructured tweets written in a colloquial style. Unfortunately, geocoordinates were not available for these tweets.

-This dataset was used in the research paper titled "Machine Learning Techniques for Anomalies Detection in Post Arrays".

-The crawled attributes are: Author (Twitter user), Content (tweet), Date_Time, id (Twitter user ID), language (tweet language), Number_of_Likes, Number_of_Shares.
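The crawl described above can be sketched roughly as follows. This is a minimal illustration, not the script that produced the dataset: the function names `status_to_row`, `crawl_user`, and `write_rows` are hypothetical, the mapping of Number_of_Likes to `favorite_count` and Number_of_Shares to `retweet_count` is an assumption, and the Date_Time string format is a guess. The Tweepy calls are those of the 3.x series (`tweepy.Cursor`, `API.user_timeline`).

```python
import csv

# Column names as listed in this README.
FIELDS = ["Author", "Content", "Date_Time", "id", "language",
          "Number_of_Likes", "Number_of_Shares"]


def status_to_row(status):
    """Map one Tweepy 3 Status-like object to the seven attributes."""
    # Extended mode exposes full_text; fall back to text otherwise.
    content = getattr(status, "full_text", None) or status.text
    return {
        "Author": status.user.screen_name,
        "Content": content,
        # Assumption: the stored format of Date_Time is not documented.
        "Date_Time": status.created_at.isoformat(sep=" "),
        "id": status.user.id,  # the README describes id as the user ID
        "language": status.lang,
        "Number_of_Likes": status.favorite_count,   # assumption
        "Number_of_Shares": status.retweet_count,   # assumption
    }


def crawl_user(api, screen_name, limit=3200):
    """Yield one row per tweet of an account, skipping retweets."""
    # Imported lazily so status_to_row works without Tweepy installed.
    import tweepy

    cursor = tweepy.Cursor(
        api.user_timeline,
        screen_name=screen_name,
        include_rts=False,      # retweets are excluded, as in the dataset
        tweet_mode="extended",  # request untruncated tweet text
        count=200,
    )
    for status in cursor.items(limit):
        yield status_to_row(status)


def write_rows(rows, path):
    """Write rows to a CSV file using the column order above."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

To use it, an authenticated client would be needed, for example `auth = tweepy.OAuthHandler(consumer_key, consumer_secret)`, then `auth.set_access_token(access_token, access_token_secret)` and `api = tweepy.API(auth, wait_on_rate_limit=True)`, followed by `write_rows(crawl_user(api, "cnnbrk"), "cnnbrk.csv")`. The credentials and the output file name are placeholders.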

Overall: 52,543 tweets from the top 20 users on Twitter

| Screen_Name | #Tweets | Time span (in days) |
|---|---|---|
| TheEllenShow | 3,147 | 662 |
| jimmyfallon | 3,123 | 1,231 |
| ArianaGrande | 3,104 | 613 |
| YouTube | 3,077 | 411 |
| KimKardashian | 2,939 | 603 |
| katyperry | 2,924 | 1,598 |
| selenagomez | 2,913 | 2,266 |
| rihanna | 2,877 | 1,557 |
| BarackObama | 2,863 | 849 |
| britneyspears | 2,776 | 1,548 |
| instagram | 2,577 | 456 |
| shakira | 2,530 | 1,850 |
| Cristiano | 2,507 | 2,407 |
| jtimberlake | 2,478 | 2,491 |
| ladygaga | 2,329 | 894 |
| Twitter | 2,290 | 2,593 |
| ddlovato | 2,217 | 741 |
| taylorswift13 | 2,029 | 2,091 |
| justinbieber | 2,000 | 664 |
| cnnbrk | 1,842 | 183 |
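A per-user summary like the one above could be recomputed from the data along these lines. This is a sketch under assumptions: the time span is taken to be the number of whole days between an account's earliest and latest tweet, the `time_format` default is a guess at how Date_Time is stored, and `summarize` and `rank_by_tweets` are hypothetical helper names.

```python
from collections import defaultdict
from datetime import datetime


def summarize(rows, time_format="%Y-%m-%d %H:%M:%S"):
    """Return {author: (tweet_count, span_in_days)}.

    Each row is a mapping with at least "Author" and "Date_Time" keys.
    """
    stamps = defaultdict(list)
    for row in rows:
        parsed = datetime.strptime(row["Date_Time"], time_format)
        stamps[row["Author"]].append(parsed)

    summary = {}
    for author, times in stamps.items():
        # Assumption: span = whole days between first and last tweet.
        span_days = (max(times) - min(times)).days
        summary[author] = (len(times), span_days)
    return summary


def rank_by_tweets(summary):
    """Sort the summary by tweet count, largest first, as in the table."""
    return sorted(summary.items(), key=lambda item: item[1][0], reverse=True)
```

Rows read with `csv.DictReader` can be passed to `summarize` directly, provided the file uses the column names listed in this README.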
