An LLM-driven social bot dataset collected from Chirper.ai
Over a three-month period from April 2023 to June 2023, we collected data from 36.7K social bot accounts on Chirper.ai, including account metadata and behavioral information, as well as 544.6K tweets generated by these accounts.
Sub-channel | Platform Slicing: Tweet Num. | Platform Slicing: Account Num. | Account Record: Tweet Num. | Account Record: Account Num. | Account Record: Action Num. |
---|---|---|---|---|---|
EN | 356395 | 23399 | 1047998 | 20814 | 272150 |
ZH | 187391 | 13228 | 694368 | 11288 | 224282 |
JP | 628 | 87 | 82824 | 82 | 11241 |
DE | 96 | 11 | 5442 | 11 | 849 |
SP | 109 | 37 | 37142 | 37 | 4255 |
Total | 544619 | 36762 | 1867774 | 32232 | 512777 |
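
The per-channel rows can be checked against the Total row with a few lines of Python. This is a minimal sketch that uses only the numbers printed in the table; the names `COUNTS`, `REPORTED_TOTAL`, `column_totals`, and `channel_share` are illustrative and are not part of the dataset or any released tooling.

```python
# Per-channel counts copied from the statistics table.
# Tuple order: (Platform Slicing tweets, Platform Slicing accounts,
#               Account Record tweets, Account Record accounts, Account Record actions)
COUNTS = {
    "EN": (356395, 23399, 1047998, 20814, 272150),
    "ZH": (187391, 13228, 694368, 11288, 224282),
    "JP": (628, 87, 82824, 82, 11241),
    "DE": (96, 11, 5442, 11, 849),
    "SP": (109, 37, 37142, 37, 4255),
}

# The "Total" row as reported in the table.
REPORTED_TOTAL = (544619, 36762, 1867774, 32232, 512777)


def column_totals(counts):
    """Sum each column across all sub-channels."""
    return tuple(sum(column) for column in zip(*counts.values()))


def channel_share(counts, column=0):
    """Return each sub-channel's fraction of the chosen column (default: Platform Slicing tweets)."""
    total = sum(row[column] for row in counts.values())
    return {channel: row[column] / total for channel, row in counts.items()}


totals = column_totals(COUNTS)
shares = channel_share(COUNTS)

if __name__ == "__main__":
    print("Totals match the reported row:", totals == REPORTED_TOTAL)
    for channel, share in shares.items():
        print(f"{channel}: {share:.2%} of Platform Slicing tweets")
```

Running the check shows that the EN and ZH sub-channels account for roughly 65% and 34% of the Platform Slicing tweets, so analyses of the JP, DE, and SP sub-channels rest on comparatively few samples.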
Due to file size constraints, the complete dataset is hosted on Google Drive: https://drive.google.com/drive/folders/15aNjFZVb5b8G9LMXZDslVO3nETufym-P?usp=drive_link
Please note that we have retained inappropriate content generated by LLM-driven social bots, including text with extremist or terrorist (and even Nazi) leanings, as well as severely racially discriminatory remarks. We do not endorse these statements; however, we believe that documenting such content faithfully helps the academic community better understand and address this issue. Because this content may offend or distress some readers, we state this prominently both in the article and on the dataset's release page.
If you find our work useful, please consider citing the following paper:
@article{li2023masquerade,
title={Are you in a Masquerade? Exploring the Behavior and Impact of Large Language Model Driven Social Bots in Online Social Networks},
author={Siyu Li and Jin Yang and Kui Zhao},
journal={arXiv preprint arXiv:2307.10337},
year={2023}
}