ChihchengHsieh/SpamDetection-TweetsAndUserInfo

Instructions:

  1. Clone this repository, or download it as an archive and extract it.

  2. Download the [Dataset] and put it inside the folder, so that the structure of the folder becomes:

  3. Then run this command in bash:

  4. The Tkinter interface will pop up, and you can train and test the models from it.

Features in the dataset:

Tweets:

text
numberOfHashtags_c
favorite_count
retweet_count
possibly_sensitive

User:

followers_count 
friends_count
default_profile 
default_profile_image
favourites_count
listed_count
statuses_count
verified
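
The feature list above can be sketched as a small helper that separates the tweet text from the remaining fields. This is only an illustration: the names `TWEET_FEATURES`, `USER_FEATURES`, and `split_record` are hypothetical and not taken from the repository, and the assumption that the text goes to one model while the other twelve fields form a single numeric vector is inferred from the "textModel" and "infoModel" names in the results, not confirmed by the code.

```python
from typing import Dict, List, Tuple, Union

Value = Union[str, int, float, bool]

# Feature names copied from the list above.
TWEET_FEATURES = [
    "text",
    "numberOfHashtags_c",
    "favorite_count",
    "retweet_count",
    "possibly_sensitive",
]
USER_FEATURES = [
    "followers_count",
    "friends_count",
    "default_profile",
    "default_profile_image",
    "favourites_count",
    "listed_count",
    "statuses_count",
    "verified",
]


def split_record(record: Dict[str, Value]) -> Tuple[str, List[float]]:
    """Split one record into (text, numeric info vector).

    Assumption: every non-text feature is numeric or boolean, so it can
    be cast to float (True -> 1.0, False -> 0.0).
    """
    text = str(record["text"])
    info_names = [n for n in TWEET_FEATURES if n != "text"] + USER_FEATURES
    info = [float(record[name]) for name in info_names]
    return text, info


# Made-up example record, for illustration only.
example = {name: 0 for name in TWEET_FEATURES + USER_FEATURES}
example.update({"text": "hello world", "verified": True, "followers_count": 10})
text, info = split_record(example)
```

With this layout the info vector has 12 entries: 4 tweet-level fields plus 8 user-level fields.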

Model Structure

Update for the small dataset running test:

As shown in the image above, a small check button (runningOnSamllDataset) has been added above the result box. If this check button is ticked, the program runs on a small dataset to check that the preprocessing and the environment are good to go. The small dataset size is only 22903, so it should be a suitable size for both CPU and GPU users.
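
A minimal sketch of how such a flag could switch between the full and the small dataset is given below. How the repository actually builds its small dataset is not stated here, so the seeded random subsample, the function name `select_dataset`, and the constant `SMALL_DATASET_SIZE` are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")

# Size quoted in the text above; the sampling strategy is an assumption.
SMALL_DATASET_SIZE = 22903


def select_dataset(
    records: Sequence[T],
    running_on_small_dataset: bool,
    seed: int = 0,
) -> List[T]:
    """Return all records, or a fixed-size random subset when the flag is set."""
    if not running_on_small_dataset or len(records) <= SMALL_DATASET_SIZE:
        return list(records)
    # A seeded generator keeps the subset identical across runs,
    # which makes an environment check reproducible.
    rng = random.Random(seed)
    return rng.sample(records, SMALL_DATASET_SIZE)
```

In a Tkinter interface, the boolean would typically come from the variable bound to the check button, for example the result of calling `.get()` on a `tkinter.BooleanVar`.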

Results

Using Both Models

SSCL

GatedCNN

SelfAttn

Only textModel

SSCL

GatedCNN

SelfAttn

Only infoModel
