🎬 Moving Sentiment Analysis on Video Clips

📌 Overview

This project explores multi-modal sentiment analysis on movie clips. We aim to understand and track emotions as they evolve over time in a video, drawing on user comments, the audio soundtrack, the transcript, and on-screen facial expressions.

The project is split into two major parts:

  • Model A: Predicts the overall emotion of a video clip based on YouTube comments.
  • Model B: Dynamically tracks changing emotions in a video using a combination of transcript, soundtrack, and image-based models.

🧠 Models

🧾 Model A: YouTube Comment-Based Emotion Classifier

This model predicts a video clip’s overall emotion using the top 50 English-language comments on the clip.
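
As a rough illustration, the sketch below fetches top-relevance comments with the YouTube Data API v3 (via google-api-python-client) and hands them to a classifier. The API-key setup and the `classify_emotion` call are placeholders for illustration, not the repo's actual pipeline, and language filtering is omitted for brevity.

```python
# Minimal sketch: fetch the top-liked comments for a clip via the
# YouTube Data API v3, then hand them to an emotion classifier.
# Assumes google-api-python-client is installed; `classify_emotion`
# is a hypothetical stand-in for Model A. English-only filtering
# (as described above) is omitted here for brevity.
from googleapiclient.discovery import build

def top_comments(video_id: str, api_key: str, n: int = 50) -> list[str]:
    """Return up to n top-relevance comments for a video."""
    yt = build("youtube", "v3", developerKey=api_key)
    resp = yt.commentThreads().list(
        part="snippet",
        videoId=video_id,
        order="relevance",   # roughly "top" comments
        maxResults=min(n, 100),
        textFormat="plainText",
    ).execute()
    return [
        item["snippet"]["topLevelComment"]["snippet"]["textDisplay"]
        for item in resp.get("items", [])
    ]

# comments = top_comments("VIDEO_ID", api_key="YT_API_KEY")
# overall = classify_emotion(comments)  # hypothetical Model A call
```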


🎥 Model B: Moving Sentiment Analysis

A three-stream model that tracks sentiment over time in a video clip, with one stream per modality:

1️⃣ Transcript Model

  • Goal: Predict emotion per line of dialogue
  • Model: SBERT-powered one-vs-rest emotion classifier (see the sketch after this list)
  • Dataset: DailyDialog, CMU-MOSEI
  • Status: 79% accuracy on the current dataset
  • Input: Sentence-level dialogue
  • Output: One of six emotion labels per sentence
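
A minimal sketch of this setup, assuming the `all-MiniLM-L6-v2` SBERT checkpoint, scikit-learn's `OneVsRestClassifier` over logistic regression, and a DailyDialog-style six-emotion label set; none of these specific choices are confirmed by the repo.

```python
# Minimal sketch of the SBERT + one-vs-rest idea: embed each line of
# dialogue with sentence-transformers, then fit a one-vs-rest classifier
# over six emotion labels. Checkpoint name, classifier, label set, and
# example data are all assumptions for illustration.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Assumed DailyDialog-style label set.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

sbert = SentenceTransformer("all-MiniLM-L6-v2")

train_sentences = ["I can't believe you did that!", "What a wonderful day."]
train_labels = ["anger", "happiness"]

X = sbert.encode(train_sentences)  # one 384-dim embedding per sentence
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X, train_labels)

pred = clf.predict(sbert.encode(["Leave me alone."]))
print(pred)  # one emotion label per sentence
```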

2️⃣ Soundtrack Model

  • Goal: Predict valence/arousal (V/A) values every 0.5s across 30s clip segments
  • Model: Random-forest regression (see the sketch after this list)
  • Dataset: DEAM / EmoMusic
  • Input: the 15s–45s window of each clip (30s of audio)
  • Output: one V/A pair every 0.5s
  • Performance:
    • Average MSE: 0.0280
    • Status: Fully trained and functioning well
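
A minimal sketch of this pipeline, assuming mean MFCCs (via librosa) as the per-window feature; the actual feature set and forest size are not specified in the repo.

```python
# Minimal sketch of the soundtrack model: summarize audio features in
# 0.5 s windows, then regress a (valence, arousal) pair per window with
# a random forest. MFCCs are an assumed feature choice for illustration.
import librosa
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def window_features(path: str, win: float = 0.5) -> np.ndarray:
    """One feature vector (mean MFCCs) per 0.5 s window of the file."""
    y, sr = librosa.load(path, sr=22050)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)  # (20, frames)
    hop = 512  # librosa's default hop length
    frames_per_win = int(win * sr / hop)
    n_win = mfcc.shape[1] // frames_per_win
    return np.array([
        mfcc[:, i * frames_per_win:(i + 1) * frames_per_win].mean(axis=1)
        for i in range(n_win)
    ])

# X: stacked window features; Y: matching (valence, arousal) rows from
# the DEAM/EmoMusic annotations (alignment code omitted).
# rf = RandomForestRegressor(n_estimators=300).fit(X, Y)
# va_track = rf.predict(window_features("clip.wav"))  # one V/A pair per 0.5 s
```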

3️⃣ Image Model (Development Paused)

  • Components:
    • Bounding Box Detector
    • Facial Emotion Recognition
  • Datasets used: [FDDB](https://paperswithcode.com/dataset/fddb)
  • Issues:
    • Emotion labels do not align with the intended use
    • Labels are ambiguous and context-insensitive
    • No suitable replacement dataset is available
  • Status: Development paused; focus shifted to audio and text

🧪 Dataset Summary

| Modality | Dataset(s) Used | Notes |
| --- | --- | --- |
| Comments | Manually collected YouTube comments | 7,500 labeled, top-liked only, English-only |
| Transcript | DailyDialog, CMU-MOSEI | Sentence-level emotion labels |
| Audio | DEAM / EmoMusic | V/A labels every 0.5s |
| Image | LFPW, WIDERFace, FDDB | Face detection working; emotion labels unsuitable |

🌱 Next Steps

🔄 Model Development

We are initiating another round of manual data labeling to support further training and testing:

  • Labeling YouTube movie clips with overall emotions
  • Extracting and labeling soundtracks from these clips
  • Feeding labeled soundtracks into the existing Soundtrack Model for validation
  • Using the YouTube API to collect transcripts/dialogues for those clips
  • Testing the Transcript Model with new dialogue data
  • Working toward a combined Transcript + Soundtrack model that tracks clip-level emotion progression using both modalities (a rough fusion sketch follows this list)
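
One way such a combined model could work is late fusion on a shared 0.5s grid: map each transcript emotion to an approximate V/A point and blend it with the soundtrack's V/A track wherever dialogue is present. Everything below (the emotion-to-V/A anchors, the 50/50 weighting, the timing format) is an assumption, not a settled design.

```python
# Design sketch (assumptions throughout): put both streams on a common
# 0.5 s grid, map each transcript emotion to an approximate V/A point,
# and blend the two streams into one emotion trajectory.
import numpy as np

# Hypothetical V/A anchors for the six transcript emotions.
EMOTION_VA = {
    "anger":   (-0.5,  0.7), "disgust":   (-0.6, 0.3),
    "fear":    (-0.6,  0.6), "happiness": ( 0.8, 0.5),
    "sadness": (-0.7, -0.4), "surprise":  ( 0.3, 0.8),
}

def fuse(sound_va: np.ndarray, dialogue: list[tuple], w: float = 0.5):
    """sound_va: (n, 2) array, one V/A pair per 0.5 s window.
    dialogue: (start_s, end_s, emotion) for each spoken line."""
    fused = sound_va.copy()
    for start, end, emotion in dialogue:
        lo, hi = int(start / 0.5), int(np.ceil(end / 0.5))
        text_va = np.array(EMOTION_VA[emotion])
        # Weighted blend of soundtrack V/A and the text-derived anchor.
        fused[lo:hi] = w * fused[lo:hi] + (1 - w) * text_va
    return fused  # one fused V/A pair per 0.5 s
```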

🌐 Website Development

We plan to build a two-page website with distinct purposes:

  • Page 1 – Research Portal:

    • Display a curated bank of pre-labeled movie clips
    • Allow researchers to view or download model predictions and emotion trajectories
    • Serve as a transparent hub for ongoing model refinement
    • Include the source code of the final model
  • Page 2 – Crowdsourced Survey:

    • Present selected video clips and predicted emotions
    • Let users provide feedback or label their own interpretations
    • Serve as a mechanism for continuous data collection and model improvement
