Kurdish Twitter data repository for the Kurmanji and Sorani dialects
This dataset includes a total of 29011 Kurmanji and 29010 Sorani tweets.
- Each line contains the content of one tweet
- No repeated content: each text entry is unique
- User-ID mentions and URLs are replaced by USER_ID and URL, respectively
- Newline characters inside a tweet are removed, which is why each tweet fits on a single line (the first rule)
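
The rules above can be checked with a short script. This is only a minimal sketch: the file names `kurmanji.txt` and `sorani.txt` are assumptions (the actual names in this repository may differ), and `load_tweets` and `check_tweets` are hypothetical helpers, not part of the dataset.

```python
from pathlib import Path


def load_tweets(path):
    """Read one tweet per line (UTF-8), skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if line.strip()]


def check_tweets(tweets):
    """Summarise the properties listed above for a list of tweets."""
    return {
        # Number of tweets, i.e. number of non-empty lines.
        "count": len(tweets),
        # True when no tweet text appears more than once.
        "all_unique": len(set(tweets)) == len(tweets),
        # True if a raw link survived; URLs should appear as the URL token.
        "has_raw_url": any("http://" in t or "https://" in t for t in tweets),
    }


if __name__ == "__main__":
    # Assumed file names; adjust to match the files in this repository.
    for name in ("kurmanji.txt", "sorani.txt"):
        path = Path(name)
        if path.exists():
            print(name, check_tweets(load_tweets(path)))
        else:
            print(name, "not found, skipping")
```

If the files match the description, `count` should equal the totals stated above, `all_unique` should be `True`, and `has_raw_url` should be `False`.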