Text Message Sentiment Analyser

Preface

Objective(s)

In this highly digitalised age, texting has become the preferred means of communication for the current generation. However, texting has indirectly eroded the art of communication by neglecting emotion. Consequently, a text message can easily be misinterpreted, depending on the perspectives of the sender and the recipient.

The main objective of this project is to apply what we learnt in elementary data science and machine learning to build a simple application with the following key goals:

To predict the emotion of a text message with reasonable accuracy.

To provide the predicted probability of each emotion for a given sentence, to account for cases where several emotions are present.

In addition, we identified a few potential directions for developing our simple application further in the future:

Sentiment analysis of customer reviews of online products.

Sentiment analysis of IMDb ratings of movies.

Fine-tuning of online dating profile matching algorithms based on the general perception of emotion in a conversation.

Skills Learnt

Perform Exploratory Data Analysis on unstructured data (texts) using Word Cloud.

Concepts of Recall, Precision & F1-score.

Logistic Regression, Linear Support Vector Machine & Naive Bayes Algorithm implementation in Machine Learning.

Implementation of Cross-Validation Check.

Implementation of an application's graphical user interface using Streamlit.

Elementary Object-Oriented Programming during the standardization of functions & classes.

Introduction to documentation writing.

Collaboration using GitHub.

Dataset

Source of Dataset

https://www.kaggle.com/praveengovi/emotions-dataset-for-nlp by Praveen

Format of Dataset

| text | emotion |
| --- | --- |
| i didnt feel humiliated | sadness |
| i can go from feeling so hopeless to so damned hopeful just from being around... | sadness |
| im grabbing a minute to post i feel greedy wrong | anger |
| i am ever feeling nostalgic about the fireplace i will know that it is still... | love |

Note: In the raw dataset file, text and emotion are separated by a semicolon (';'), for example:

i didnt feel humiliated;sadness
i am feeling grouchy;anger
...
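A minimal sketch of loading data in this layout with Pandas. The column names `text` and `emotion` follow the table above; the in-memory sample and the file path mentioned in the comment are assumptions for illustration, not the project's actual loading code.

```python
import io

import pandas as pd

# Two sample rows in the semicolon-separated layout shown above.
sample = io.StringIO(
    "i didnt feel humiliated;sadness\n"
    "i am feeling grouchy;anger\n"
)

# The rows carry no header, so the column names are supplied explicitly.
# In practice, pass a file path (for example a hypothetical
# "datasets/train.txt") instead of the in-memory sample.
df = pd.read_csv(sample, sep=";", names=["text", "emotion"])

# Quick look at how many messages each emotion has.
print(df["emotion"].value_counts())
```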

Contributors

Lee Juin (Alias: @Neo-Zenith)

Co-authored Text-Message Sentiment Analyser
Documentation writing for README & Libraries Information

Kassim bin Mohamad Malaysia (Alias: @kassimmalaysia)

Co-authored Text-Message Sentiment Analyser
Presentation slides & scripts writing

Lee Ci Hui (Alias: @perfectsquare123)

Co-authored Text-Message Sentiment Analyser
Application design

Default Libraries

The following libraries are used throughout the project.

Pandas

Numpy

Seaborn

NLTK

Word Cloud

Matplotlib

Scikit-Learn

Note: Word Cloud has no official support for Python 3.8.x and above, so we used Word Cloud unofficial as our library instead. For Python 3.7.x and below, please refer to Word Cloud. However, do note that our project was run and tested on Python 3.8.x and above.

Custom Libraries

We have compiled the functions and classes that are used repeatedly within our project. They can be found in Libraries.

Please read Libraries Information for the details of the functions and classes found within our custom library.

Miscellaneous

Issues

[FIXED] Issue on Jupyter Notebook (Ipynb files) and GitHub

There appears to be a widespread, ongoing issue on GitHub in which outputs from Jupyter Notebook files are displayed incorrectly or not at all.

Replicable: Yes
Source of Issue: Most likely GitHub
Fixed: Yes
Comments: Please use an alternative IDE to inspect the main code sections. Visual Studio Code is known to work properly.

Issue on the display of Jupyter Notebook (Ipynb files) on GitHub

In certain scenarios, opening our Jupyter Notebook on GitHub does not render the notebook completely, or shows it inside a tiny scrollable box. While it is possible to read the entire notebook this way, it is highly inconvenient, and some visualisations cannot be seen in their entirety.

Replicable: Yes
Source of Issue: Most likely due to the large file size of our notebook.
Fixed: No
Comments: Please refresh the notebook if the aforementioned error occurs. Otherwise, please use an alternative IDE to inspect the main code sections. Visual Studio Code is known to work properly.

Run-through

Overview

Our code is divided into three main parts:

Data Preparation

Exploratory Data Analysis

Machine Learning

Data Preparation

In this section, we import the necessary libraries and our training dataset. We also perform a simple analysis of the dataset to get a brief overview of the kind of data we are dealing with.

Please refer to Text-Message Sentiment Analyser for the details of our source code.

Exploratory Data Analysis

In this section, we perform a more in-depth analysis of our dataset. The analysis showed that the dataset required some cleaning, which we carried out in the following three phases:

Lemmatization of words

Removal of HTML tags and attributes

Removal of stopwords

We mainly use the NLTK library for dataset cleaning.
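The three cleaning phases can be sketched as a single function. This is an illustrative sketch, not the project's actual code: to stay self-contained it uses a tiny stand-in stopword set and takes the lemmatizer as a parameter, and the order of the steps is an assumption. With NLTK installed and its corpora downloaded, `stopwords.words("english")` and `WordNetLemmatizer().lemmatize` could be passed in instead.

```python
import re

# Stand-in stopword set for illustration only; NLTK's
# stopwords.words("english") would be the fuller replacement.
STOPWORDS = {"i", "am", "a", "the", "to", "is", "it", "so", "and"}

TAG_RE = re.compile(r"<[^>]+>")    # crude match for HTML tags and their attributes
TOKEN_RE = re.compile(r"[a-z']+")  # lowercase word tokens


def clean_text(text, lemmatize=lambda word: word, stopwords=STOPWORDS):
    """Strip HTML tags, drop stopwords, and lemmatize the remaining tokens."""
    text = TAG_RE.sub(" ", text).lower()
    tokens = TOKEN_RE.findall(text)
    kept = [lemmatize(tok) for tok in tokens if tok not in stopwords]
    return " ".join(kept)


print(clean_text('i am <a href="x">feeling</a> so grouchy'))
```

The default `lemmatize` leaves words unchanged, so the function runs without any downloaded corpora; swapping in a real lemmatizer only requires passing a different callable.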

We mainly use Word Cloud for data visualisation.
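A word cloud is driven by word frequencies, so a rough sketch of the underlying count per emotion may help. The rows below are hypothetical, and the helper name `word_frequencies` is ours, not from the project. With the `wordcloud` package installed, the resulting counts could be rendered with `WordCloud().generate_from_frequencies(...)`.

```python
from collections import Counter

# Tiny hypothetical sample of already-cleaned (text, emotion) pairs.
rows = [
    ("feel humiliated", "sadness"),
    ("feel hopeless", "sadness"),
    ("feel grouchy", "anger"),
]


def word_frequencies(rows, emotion):
    """Count how often each word appears in messages with the given emotion."""
    counts = Counter()
    for text, label in rows:
        if label == emotion:
            counts.update(text.split())
    return counts


sad = word_frequencies(rows, "sadness")
print(sad.most_common(2))
```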

Please refer to Text-Message Sentiment Analyser under Exploratory Data Analysis for the details of our source code.

Machine Learning

In this section, we train the following three models on our training dataset:

Logistic Regression

Naive Bayes Algorithm

Linear Support Vector Machine
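A minimal sketch of fitting these three model types with Scikit-Learn. The TF-IDF features, the hyperparameters and the tiny training set are assumptions made for illustration; the project's notebook may use different settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny hypothetical training set; the real one comes from the dataset above.
texts = [
    "i feel so sad and hopeless",
    "i am feeling down and humiliated",
    "i feel angry and grouchy",
    "i am furious at everyone",
    "i feel loved and cherished",
    "i am feeling so much love",
]
labels = ["sadness", "sadness", "anger", "anger", "love", "love"]

# Each pipeline turns raw text into TF-IDF features, then fits a classifier.
models = {
    "logistic_regression": make_pipeline(
        TfidfVectorizer(), LogisticRegression(max_iter=1000)
    ),
    "naive_bayes": make_pipeline(TfidfVectorizer(), MultinomialNB()),
    "linear_svm": make_pipeline(TfidfVectorizer(), LinearSVC()),
}

for name, model in models.items():
    model.fit(texts, labels)
    print(name, model.predict(["i feel so sad today"])[0])
```

Wrapping the vectoriser and classifier in one pipeline keeps the feature extraction inside the model, so the same object can be validated, cross-validated and applied to new messages without extra steps.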

We then applied our trained models to the validation dataset and obtained their respective Precision, Recall and F1-score.
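For reference, the three metrics can be computed for a single emotion as follows. The labels are made up for illustration; Scikit-Learn's `classification_report` reports the same per-class figures.

```python
def precision_recall_f1(y_true, y_pred, label):
    """Per-class precision, recall and F1-score from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # correctly predicted
    fp = sum(t != label and p == label for t, p in pairs)  # wrongly predicted as label
    fn = sum(t == label and p != label for t, p in pairs)  # missed instances of label
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical validation labels and predictions.
y_true = ["sadness", "sadness", "anger", "love", "sadness"]
y_pred = ["sadness", "anger", "anger", "sadness", "sadness"]

print(precision_recall_f1(y_true, y_pred, "anger"))
```

Precision asks how many of the messages predicted as an emotion really carry it, recall asks how many messages with that emotion were found, and the F1-score is the harmonic mean of the two.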

We further performed a repeated k-fold cross-validation check on each model to determine the best of the three.
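A sketch of such a check using Scikit-Learn's `RepeatedKFold` and `cross_val_score`. The number of splits and repeats, the macro-averaged F1 scoring and the toy data are assumptions for illustration, and only one model is shown; the same call would be repeated for each candidate.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

# Tiny hypothetical dataset with two emotions.
texts = [
    "i feel sad", "i feel hopeless", "i am so down", "i feel humiliated",
    "i am feeling miserable", "i feel gloomy today",
    "i feel angry", "i am furious", "i feel grouchy", "i am so annoyed",
    "i feel irritated today", "i am feeling outraged",
]
labels = ["sadness"] * 6 + ["anger"] * 6

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))

# 3 folds repeated 2 times gives 6 scores in total.
cv = RepeatedKFold(n_splits=3, n_repeats=2, random_state=0)
scores = cross_val_score(model, texts, labels, cv=cv, scoring="f1_macro")

print(len(scores), scores.mean(), scores.std())
```

Comparing the mean and spread of the scores across models gives a steadier basis for choosing one than a single validation split.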

Finally, we applied the best model to the test dataset.
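A sketch of applying a fitted model to an unseen message and listing the predicted probability of each emotion. Logistic regression is used here only because it exposes `predict_proba`, which Scikit-Learn's `LinearSVC` does not; this is not a statement about which model the project selected. The training rows and the message are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny hypothetical training set.
train_texts = [
    "i feel so sad and hopeless",
    "i am feeling down and humiliated",
    "i feel angry and grouchy",
    "i am furious at everyone",
    "i feel loved and cherished",
    "i am feeling so much love",
]
train_labels = ["sadness", "sadness", "anger", "anger", "love", "love"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)

# Probability of each emotion for one unseen message, highest first.
message = "i feel hopeless and angry"
probs = model.predict_proba([message])[0]
ranked = sorted(zip(model.classes_, probs), key=lambda pair: -pair[1])
for emotion, prob in ranked:
    print(f"{emotion}: {prob:.2f}")
```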

Please refer to Text-Message Sentiment Analyser under Machine Learning for the details of our source code.

Acknowledgements

Special thanks to our Teaching Assistant, Ms. Song Nan, for providing valuable feedback and suggestions throughout the project.

Reference

Below are some links that we have used as references throughout the project:

