Hidden-Markov-Model-for-NLP
This study uses a Twitter product-review dataset in which people tweet their emotions about product brands, labeled as either negative or positive. The dataset was collected from kaggle.com and is formatted as a .csv file containing the tweets and their respective emotions. There were 3548 tweets in text format, expressing different feelings in the tweets column. For the analysis, 3370 tweets were chosen as the training data, and the remaining 178 tweets were kept for testing. Before training, the tweet data had to be pre-processed and prepared for model development. For HMM model development, the dataset had to be formatted as model input, which required calculating the hidden states and the observed states. In the training data, the hidden-state count was 22 for the positive state and 21 for the negative state; by contrast, the observed-state count was 62282 for positive reviews and 11319 for negative reviews.
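
The split and the per-emotion counting described above can be sketched roughly as follows. This is a minimal illustration, not the study's actual code: the function names `split_rows` and `count_observations`, the toy rows, and the choice to count whitespace-separated tokens as "observations" are all assumptions, since the text does not say how the observed-state counts were computed.

```python
from collections import Counter


def split_rows(rows, n_train):
    """Split labeled rows into training and test portions, preserving order."""
    return rows[:n_train], rows[n_train:]


def count_observations(rows):
    """Sum whitespace-separated tokens per emotion label (assumed definition)."""
    totals = Counter()
    for text, label in rows:
        totals[label] += len(text.lower().split())
    return totals


# Toy stand-in rows of (tweet, emotion); the real data would come from the .csv file.
rows = [
    ("love my new phone", "positive"),
    ("battery died again", "negative"),
    ("great camera and screen", "positive"),
    ("worst update ever", "negative"),
]

# With the real dataset, n_train would be 3370, leaving 178 rows for testing.
train, test = split_rows(rows, n_train=3)
obs_counts = count_observations(train)
print(len(train), len(test), dict(obs_counts))
```

With the full dataset, the same split would give 3370 training tweets and 178 test tweets out of 3548; whether the per-label totals reproduce the reported figures depends on the pre-processing actually used.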