mmmeeedddsss/sarcasm_detection

Sarcasm Detection using Spark

Mert Tunç, Egemen Berk Galatalı


A dataset of 1.3 million Reddit comments, each labeled as sarcastic or not, is used to train a sarcasm classifier. Several preprocessing methods, feature extraction techniques, and ML models are combined to find the best-performing configuration. The implementation is written in Scala with Spark.

The best combination of preprocessing, feature extraction, model selection, and semi-optimized hyperparameters reaches 77% accuracy on the test set. Note that no columns other than the comment text and the labels are used for training or testing.
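
For orientation, here is a minimal sketch of what one such preprocessing, feature extraction, and model combination could look like as a Spark ML pipeline. It is not the configuration that produced the reported result. The specific stages (regex tokenization, stop-word removal, TF-IDF, logistic regression), the column names `comment` and `label`, the hyperparameter values, and the toy rows are all assumptions made for illustration.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.feature.{HashingTF, IDF, RegexTokenizer, StopWordsRemover}
import org.apache.spark.sql.SparkSession

object SarcasmPipelineSketch {

  // Builds a text classification pipeline that uses only the comment text
  // and the label. Column names and all parameter values are assumptions.
  def buildPipeline(): Pipeline = {
    // Preprocessing: lowercase and split on non-word characters.
    val tokenizer = new RegexTokenizer()
      .setInputCol("comment")
      .setOutputCol("tokens")
      .setPattern("\\W+")

    // Preprocessing: drop common English stop words.
    val remover = new StopWordsRemover()
      .setInputCol("tokens")
      .setOutputCol("filtered")

    // Feature extraction: hashed term frequencies rescaled by IDF.
    val tf = new HashingTF()
      .setInputCol("filtered")
      .setOutputCol("tf")
      .setNumFeatures(1 << 18)
    val idf = new IDF()
      .setInputCol("tf")
      .setOutputCol("features")

    // Model: binary logistic regression (one of many possible choices).
    val lr = new LogisticRegression()
      .setLabelCol("label")
      .setFeaturesCol("features")
      .setMaxIter(20)
      .setRegParam(0.01)

    new Pipeline().setStages(Array(tokenizer, remover, tf, idf, lr))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sarcasm-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up rows standing in for the real dataset (1.0 = sarcastic).
    val data = Seq(
      ("oh great, another monday", 1.0),
      ("yeah, because that always works", 1.0),
      ("the patch fixed the crash for me", 0.0),
      ("thanks, this guide was helpful", 0.0)
    ).toDF("comment", "label")

    val model = buildPipeline().fit(data)
    val predictions = model.transform(data)

    // Accuracy on the training rows only; a real run would hold out a
    // test split, so this number says nothing about generalization.
    val accuracy = new MulticlassClassificationEvaluator()
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName("accuracy")
      .evaluate(predictions)
    println(s"Training-set accuracy on toy data: $accuracy")

    spark.stop()
  }
}
```

Swapping a stage (for example, a different feature extractor or classifier) only changes the array passed to `setStages`, which is one way to compare combinations like those described above.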

About

Ceng790 Term Project
