
Cyberbullying detection on comments and messages shared over social media applications, using a dataset obtained from Kaggle


Yashi11/CyberBullying-Detection


Detection of Cyberbullying using Bullying Features and Machine Learning: A Comparative Study

The Python notebook in this repository performs cyberbullying detection on comments and messages shared over social media applications, using a dataset obtained from Kaggle.

The dataset has the following features:

  • 2000 data points
  • Data points contain unfiltered comments from social platforms, including punctuation, emoticons, user tags, etc.
  • Cross-platform dataset

Machine learning techniques applied for the binary classification of comments as "bullying" or "non-bullying" are:

  1. K-Nearest Neighbors
  2. Naive Bayes Classifier
  3. Logistic Regression
  4. Support Vector Classifier

Python libraries used in this project

numpy, pandas, sklearn, NLTK

Roadmap

  • Dataset Collection: Collected Dataset from Kaggle

    https://www.kaggle.com/datasets/syedabbasraza/suspicious-communication-on-social-platforms

  • Conversion of CSV file to Dataframe using Pandas

  • Preprocessing of comments using the following steps:

    1. Conversion of text to lower-case
    2. Removal of punctuation
    3. Removal of non-alphabetic words (words containing digits or punctuation)
    4. Removal of stopwords (English)
    5. POS Tagging
    6. Lemmatization
    7. Removal of words with length less than or equal to 1
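    The seven preprocessing steps can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the names `preprocess`, `nltk_lemmatize`, and `DEMO_STOPWORDS` are hypothetical, and the tiny stopword set is only a stand-in for NLTK's English stopword list. POS tagging and lemmatization are passed in as an optional callable so the rest of the pipeline runs without NLTK data installed.

```python
import re
import string

# Tiny stand-in stopword set (assumption); with NLTK one would typically use
# set(nltk.corpus.stopwords.words("english")) instead.
DEMO_STOPWORDS = {"a", "an", "the", "is", "are", "you", "to", "and", "of", "so"}


def nltk_lemmatize(tokens):
    """Steps 5-6: POS-tag, then lemmatize with NLTK.

    Needs NLTK plus its tagger and WordNet data; imported lazily so the
    rest of this sketch runs without them.
    """
    from nltk import pos_tag
    from nltk.corpus import wordnet
    from nltk.stem import WordNetLemmatizer

    # Map the first letter of a Penn Treebank tag to a WordNet POS constant.
    tag_map = {"J": wordnet.ADJ, "V": wordnet.VERB,
               "N": wordnet.NOUN, "R": wordnet.ADV}
    lemmatizer = WordNetLemmatizer()
    return [lemmatizer.lemmatize(word, tag_map.get(tag[0], wordnet.NOUN))
            for word, tag in pos_tag(tokens)]


def preprocess(comment, stopwords=DEMO_STOPWORDS, lemmatize=None):
    """Return the cleaned tokens of one comment."""
    # 1. Lower-case the text.
    text = comment.lower()
    # 2. Remove punctuation (string.punctuation covers ASCII punctuation only).
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()
    # 3. Drop tokens that are not purely alphabetic (e.g. "user123").
    tokens = [t for t in tokens if t.isalpha()]
    # 4. Drop stopwords.
    tokens = [t for t in tokens if t not in stopwords]
    # 5-6. POS tagging and lemmatization, if a lemmatizer is supplied.
    if lemmatize is not None:
        tokens = lemmatize(tokens)
    # 7. Drop tokens of length <= 1.
    return [t for t in tokens if len(t) > 1]


print(preprocess("@User123 You are SO dumb!!! :) lol x"))  # ['dumb', 'lol']
```

    Passing `lemmatize=nltk_lemmatize` enables steps 5 and 6 once the required NLTK resources have been downloaded.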
    
  • Application Of Algorithms

    1. K-Nearest Neighbors:
    https://medium.com/@draj0718/k-nearest-neighbor-knn-using-python-d0a6bb295e7d
    
    2. Naive Bayes Classifier:
    https://medium.com/@piyumipremathilake/na%C3%AFve-bayes-algorithm-3f5b78f32b1c
    
    3. Logistic Regression:
    https://www.ibm.com/topics/logistic-regression
    
    4. Support Vector Classifier:
    https://www.geeksforgeeks.org/classifying-data-using-support-vector-machinessvms-in-python/
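    As a rough sketch of how the four classifiers might be trained side by side with sklearn: the TF-IDF vectorizer, the hyperparameters, and the toy comments below are all assumptions made for illustration, since this README does not state which features or settings the notebook uses.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy data (not from the Kaggle dataset): 1 = bullying, 0 = non-bullying.
comments = [
    "you are stupid and ugly",
    "nobody likes you loser",
    "shut up idiot",
    "you are such a loser",
    "have a great day",
    "thanks for the help",
    "see you at lunch",
    "great game last night",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# The four classifier families listed above; hyperparameters are illustrative.
models = {
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=3),
    "Naive Bayes Classifier": MultinomialNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Classifier": SVC(kernel="linear"),
}

# Each model gets its own vectorizer + classifier pipeline (TF-IDF is an assumption).
fitted = {}
for name, clf in models.items():
    pipeline = make_pipeline(TfidfVectorizer(), clf)
    pipeline.fit(comments, labels)
    fitted[name] = pipeline

# Predict the label of an unseen comment with every model.
new_comment = ["you are a stupid idiot"]
predictions = {name: int(p.predict(new_comment)[0]) for name, p in fitted.items()}
print(predictions)
```

    On real data the comments would first go through the preprocessing steps, and the models would be evaluated on a held-out test split rather than on the training comments.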
    
  • Generation of Results

    In the form of a metric table, as given below:

[Image: App Screenshot]
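A metric table of this kind could be assembled along the following lines. This is only a sketch: the choice of accuracy, precision, recall, and F1 is an assumption (this README does not list the metrics used), `metric_table` is a hypothetical helper, and the labels and predictions are made-up values rather than results from the project.

```python
import pandas as pd
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score


def metric_table(y_true, predictions):
    """Build one row of metrics per model.

    `predictions` maps a model name to its predicted labels for `y_true`.
    """
    rows = {}
    for name, y_pred in predictions.items():
        rows[name] = {
            "Accuracy": accuracy_score(y_true, y_pred),
            "Precision": precision_score(y_true, y_pred),
            "Recall": recall_score(y_true, y_pred),
            "F1": f1_score(y_true, y_pred),
        }
    # Models become the index, metrics become the columns.
    return pd.DataFrame.from_dict(rows, orient="index")


# Made-up labels and predictions, for illustration only (1 = bullying).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
example_predictions = {
    "Model A": [1, 0, 1, 0, 0, 0, 1, 1],
    "Model B": [1, 0, 1, 1, 0, 0, 1, 0],
}

table = metric_table(y_true, example_predictions)
print(table.round(2))
```

In practice the keys would be the four classifier names and the predictions would come from a held-out test split.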

Research Paper

Detection of Cyberbullying using Bullying Features and Machine Learning.pdf

Support

For project collaboration or any discussion, email me at yashasvi488@live.com.
