Twitter-Crawler

Crawls replies to tweets on Twitter and saves their content and like counts.

Dataset

| Name | Type | Description |
| --- | --- | --- |
| keyword | str | Keyword of the tweet |
| likes | int | Number of likes |
| tweet | str | Content of the tweet |

Usage

The top 20 keywords of 2021; each keyword has 5000 tweets.

"COVID-19",
"Vaccine",
"Zoom",
"Bitcoin",
"Dogecoin",
"NFT",
"Elon Musk",
"Tesla",
"Amazon",
"iPhone 12",
"Remote work",
"TikTok",
"Instagram",
"Facebook",
"YouTube",
"Netflix",
"GameStop",
"Super Bowl",
"Olympics",
"Black Lives Matter",
"India vs England",
"Ukraine",
"Queen Elizabeth",
"World Cup",
"Jeffrey Dahmer",
"Johnny Depp",
"Will Smith",
"Weather",
"xvideo",
"porn",
"nba",
"Macdonald",

| Name | Type | Description |
| --- | --- | --- |
| keyword | str | Keyword of the tweet |
| main_tweet | str | Content of the tweet |
| main_likes | int | Number of likes of the tweet |
| reply | str | Content of the reply |
| reply_likes | int | Number of likes of the reply |

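
As an illustration of the reply schema, the following minimal sketch checks that a record carries the five listed fields with the listed types. The names `REPLY_SCHEMA`, `validate_record`, and `validate_json_array` are hypothetical, and the assumption that the data is stored as a JSON array of objects is ours; the repository's own `check.py` may work differently.

```python
import json

# Field names and types copied from the reply dataset description.
REPLY_SCHEMA = {
    "keyword": str,
    "main_tweet": str,
    "main_likes": int,
    "reply": str,
    "reply_likes": int,
}


def validate_record(record):
    """Return a list of problems found in one record (empty if it is valid)."""
    problems = []
    for field, expected in REPLY_SCHEMA.items():
        if field not in record:
            problems.append(f"missing field: {field}")
            continue
        value = record[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(
                f"{field}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


def validate_json_array(text):
    """Assumed layout: a JSON array of record objects. Returns {index: problems}."""
    records = json.loads(text)
    report = {}
    for index, record in enumerate(records):
        problems = validate_record(record)
        if problems:
            report[index] = problems
    return report
```

An empty result from `validate_json_array` means every record matched the schema.
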
search_terms = [
    "COVID-19",
    "Vaccine",
    "Zoom",
    "Bitcoin",
    "Dogecoin",
    "NFT",
    "Elon Musk",
    "Tesla",
    "Amazon",
    "iPhone 12",
    "Remote work",
    "TikTok",
    "Instagram",
    "Facebook",
    "YouTube",
    "Netflix",
    "GameStop",
    "Super Bowl",
    "Olympics",
    "Black Lives Matter",
    "Ukraine",
    "Queen Elizabeth",
    "World Cup",
    "weather",
    "nba",
    "Macdonald",
    "K-pop",
    "music",
    "movie",
    "sport",
    "news",
    "science",
]
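
A rough sketch of how a list like `search_terms` could drive a keyword search with snscrape. It covers only the search step and produces the simpler `keyword` / `likes` / `tweet` fields; collecting replies is not shown. The helper names `tweet_to_record` and `crawl` and the `make_scraper` parameter are hypothetical. The sketch assumes snscrape's `TwitterSearchScraper(query).get_items()` interface and the `likeCount` and `rawContent` (older releases: `content`) tweet attributes, which may vary between snscrape versions.

```python
from itertools import islice


def tweet_to_record(keyword, tweet):
    """Map a tweet-like object to the keyword / likes / tweet fields."""
    # Assumption: newer snscrape releases expose rawContent, older ones content.
    text = getattr(tweet, "rawContent", None) or getattr(tweet, "content", "")
    return {"keyword": keyword, "likes": tweet.likeCount, "tweet": text}


def crawl(search_terms, limit=5000, make_scraper=None):
    """Collect up to `limit` tweets per keyword.

    `make_scraper` is a hypothetical hook so the function can be exercised
    without network access; by default it uses snscrape's search scraper.
    """
    if make_scraper is None:
        # Imported lazily so this module loads even without snscrape installed.
        import snscrape.modules.twitter as sntwitter
        make_scraper = sntwitter.TwitterSearchScraper

    records = []
    for keyword in search_terms:
        items = make_scraper(keyword).get_items()
        # islice stops the (potentially endless) generator after `limit` items.
        records.extend(tweet_to_record(keyword, t) for t in islice(items, limit))
    return records
```

Calling `crawl(search_terms)` would then request up to 5000 tweets for each keyword, matching the per-keyword count stated under Usage.
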

Other programs

  • count.py : Count the number of tweets and replies
  • check.py : Check the JSON file format
  • search.py : Count the number of likes of each reply
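
To make the counting step concrete, here is a minimal sketch of what counting tweets and replies could look like. It is not the repository's `count.py`; the function name `count_tweets_and_replies` is hypothetical, and it assumes each record is one reply row with the `keyword`, `main_tweet`, and `reply` fields described in the dataset section.

```python
from collections import Counter


def count_tweets_and_replies(records):
    """Count distinct main tweets and total replies for each keyword.

    Assumption: every record represents one reply, so a main tweet with
    several replies appears in several records.
    """
    main_tweets = {}      # keyword -> set of distinct main tweet texts
    replies = Counter()   # keyword -> number of reply rows
    for record in records:
        keyword = record["keyword"]
        main_tweets.setdefault(keyword, set()).add(record["main_tweet"])
        replies[keyword] += 1
    return {
        keyword: {"tweets": len(tweets), "replies": replies[keyword]}
        for keyword, tweets in main_tweets.items()
    }
```

Main tweets are deduplicated by their text here, which is a simplification; a tweet ID would be a more reliable key if the data includes one.
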

Method

Using Snscrape

Install Snscrape

pip3 install snscrape

Development version

pip3 install git+https://github.com/JustAnotherArchivist/snscrape.git

Reference