
Spam classification using a Hidden Markov Model (HMM) in Python.


FantacherJOY/Hidden-Markov-Model-for-NLP


Hidden-Markov-Model-for-NLP

This study uses a Twitter product review dataset, in which people tweet positive or negative emotions about product brands. The dataset was collected from kaggle.com as a .csv file containing the tweets and their respective emotions. It holds 3548 tweets in text format, with the emotion labels in a separate column. Of these, 3370 tweets were chosen as training data and the remaining 178 tweets were kept for testing. Before training, the tweets had to be pre-processed for model development. For the HMM, the dataset also had to be formatted as model input, which required calculating the hidden states and the observed states. In the training data, the hidden-state count was 22 for the positive state and 21 for the negative state, whereas the observed-state count was 62282 for positive reviews and 11319 for negative reviews.
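The pre-processing and train/test split described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the helper names (`preprocess`, `split_train_test`, `count_observations`), the cleaning rules (lowercasing, dropping URLs and @mentions), the choice to split by row order, and the toy rows are all assumptions made for the example.

```python
import re
from collections import Counter


def preprocess(tweet):
    """Lowercase a tweet, drop URLs and @mentions, and return word tokens.

    The cleaning rules here are assumed for illustration only.
    """
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    return re.findall(r"[a-z']+", text)


def split_train_test(rows, n_train):
    """Keep the first n_train rows for training and the rest for testing."""
    return rows[:n_train], rows[n_train:]


def count_observations(rows):
    """Count the tokens (observations) seen under each emotion label."""
    counts = {}
    for tweet, emotion in rows:
        counts.setdefault(emotion, Counter()).update(preprocess(tweet))
    return counts


# Hypothetical toy rows standing in for the (tweet, emotion) pairs of the CSV.
rows = [
    ("I love my new @apple iPad! http://t.co/x", "positive"),
    ("The battery on this phone is terrible", "negative"),
    ("Great app, great design", "positive"),
    ("Worst update ever", "negative"),
]

train, test = split_train_test(rows, n_train=3)
observations = count_observations(train)
```

With the real dataset, calling `split_train_test(rows, n_train=3370)` on the 3548 rows would leave 178 rows for testing, matching the split described above. The per-label token counts give one possible way to tally the observations for each emotion before estimating HMM parameters.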
