This project analyzes the 2024 US Presidential Debate transcripts to uncover key themes and sentiments expressed by the candidates: President Joe Biden and former President Donald Trump.
The goal is to understand which topics and concerns the candidates addressed and the tone in which they discussed them. By counting how often specific words occur and generating word clouds, the analysis gives a visual representation of the candidates' focus areas and the overall tone of their discourse.
The debate transcripts were sourced from CNN, capturing the exchanges between the candidates moderated by anchors Jake Tapper and Dana Bash.
- Python: Programming language used for data processing and analysis.
- NLTK (Natural Language Toolkit): Library for natural language processing tasks.
- Matplotlib: Library for creating visualizations such as plots and for displaying the generated word clouds.
- WordCloud: Library for generating word clouds from text data.
The debate transcript was cleaned by removing stopwords, punctuation, and irrelevant words so that the analysis focuses on meaningful content. Statements were separated by speaker using the identifiers in the transcript: Biden, Trump, Tapper, and Bash.
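The speaker separation and cleaning step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes each turn starts with an upper-case label such as `BIDEN:` at the beginning of a line, and it uses a tiny hand-written stopword set so the example runs without downloads. The names `split_by_speaker`, `clean`, `SPEAKERS`, and `STOPWORDS` are hypothetical; NLTK's `stopwords.words("english")` could replace the small set.

```python
import re
import string
from collections import defaultdict

# Assumed speaker labels; the real transcript may format them differently.
SPEAKERS = ("BIDEN", "TRUMP", "TAPPER", "BASH")

# Tiny illustrative stopword set (assumption). NLTK's English stopword
# list could be substituted here.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "is", "in", "we", "i", "it", "that"}

# Matches lines such as "BIDEN: some text".
SPEAKER_RE = re.compile(r"^(%s):\s*(.*)$" % "|".join(SPEAKERS))


def split_by_speaker(transcript):
    """Group transcript lines under the most recent speaker label."""
    statements = defaultdict(list)
    current = None
    for line in transcript.splitlines():
        line = line.strip()
        if not line:
            continue
        match = SPEAKER_RE.match(line)
        if match:
            current, text = match.groups()
        else:
            text = line  # continuation of the current speaker's turn
        if current is not None and text:
            statements[current].append(text)
    return dict(statements)


def clean(text):
    """Lowercase, strip ASCII punctuation, and drop stopwords."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [word for word in text.split() if word not in STOPWORDS]
```

Calling `split_by_speaker` on the raw transcript and then `clean` on each speaker's joined statements yields one token list per speaker, which is the input the frequency analysis needs.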
Word frequency analysis was then performed on each speaker's cleaned statements to identify the words each speaker used most often.
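A minimal sketch of per-speaker frequency counting, using `collections.Counter` from the standard library (NLTK's `FreqDist` offers a similar interface). The function name `word_frequencies` and the token lists are made up for illustration and are not taken from the transcript.

```python
from collections import Counter


def word_frequencies(tokens_by_speaker):
    """Map each speaker to a Counter of their cleaned tokens."""
    return {
        speaker: Counter(tokens)
        for speaker, tokens in tokens_by_speaker.items()
    }


# Toy tokens for illustration only, not real debate data.
tokens_by_speaker = {
    "BIDEN": ["jobs", "economy", "jobs"],
    "TRUMP": ["border", "economy", "border", "border"],
}

freqs = word_frequencies(tokens_by_speaker)
for speaker, counter in freqs.items():
    # most_common(n) returns the n highest-count (word, count) pairs.
    print(speaker, counter.most_common(2))
```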
A word cloud was generated for each speaker from that speaker's word frequencies. A word cloud represents frequency visually: the larger a word appears, the more often it was used.
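One way to turn a frequency table into a word cloud is sketched below. It assumes the `wordcloud` and `matplotlib` packages are installed; the helper names `top_frequencies` and `show_word_cloud`, the image size, and the 100-word cutoff are illustrative choices rather than the project's settings.

```python
from collections import Counter


def top_frequencies(freqs, n=100):
    """Keep only the n most frequent words so the cloud stays readable."""
    return dict(Counter(freqs).most_common(n))


def show_word_cloud(freqs, title):
    """Render a word cloud from a {word: count} mapping and display it."""
    # Imported inside the function so top_frequencies can be used
    # without these optional dependencies.
    from wordcloud import WordCloud
    import matplotlib.pyplot as plt

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(top_frequencies(freqs))

    plt.figure(figsize=(10, 5))
    plt.imshow(cloud, interpolation="bilinear")
    plt.axis("off")
    plt.title(title)
    plt.show()
```

For example, `show_word_cloud({"jobs": 5, "economy": 3, "border": 2}, "Toy example")` would open a window with "jobs" drawn largest. Passing counts through `generate_from_frequencies` keeps the sizing tied directly to the frequency analysis instead of re-tokenizing raw text.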
The word clouds were then reviewed and interpreted to highlight the key topics discussed by each candidate and the overall emphasis of their communication.
This project provides insight into the 2024 US Presidential Debate, illustrating the prominent themes and sentiments expressed by President Joe Biden and former President Donald Trump, as well as the language used by co-anchors Jake Tapper and Dana Bash. The word clouds visually capture the focus areas and tone of the debate, encouraging viewers to reflect on the candidates' messages.