Workshop "Automatic Sampling and Analysis of YouTube Comments", GESIS 2023

Materials for the 2023 GESIS Training workshop "Automatic Sampling and Analysis of YouTube Comments"

Johannes Breuer (johannes.breuer@gesis.org, @MattEagle09); Annika Deubel (annika.deubel@cais-research.de, @anndeub); M. Rohangis Mohseni (Rohangis.Mohseni@tu-ilmenau.de, @romohseni)

Please link to the workshop GitHub repository

Workshop description

YouTube is the largest and most popular video platform on the internet. The producers and users of YouTube content generate huge amounts of data. These data are also of interest to researchers (in the social sciences as well as other disciplines) for studying different aspects of online media use and communication. Accessing and working with these data, however, can be challenging. In this workshop, we will first discuss the potential of YouTube data for research in the social sciences, and then introduce participants to different tools and methods for sampling and analyzing data from YouTube. We will then demonstrate and compare several tools for collecting YouTube data. Our focus for the main part of the workshop will be on using R to collect data via the YouTube API, process it, and analyze it. Regarding the type of data, we will focus on user comments but will also (briefly) look into other YouTube data, such as video statistics and subtitles. For the comments, we will show how to clean/process them in R, how to deal with emojis, and how to do some basic forms of automated text analysis (e.g., word frequencies, sentiment analysis). While we believe that YouTube data have great potential for research in the social sciences (and other disciplines), we will also discuss the unique challenges and limitations of using these data.

Target group

The workshop is aimed at people who are interested in using YouTube data for their research.

Learning objectives

Participants will learn how they can use YouTube data for their research. They will get to know tools and methods for collecting YouTube data. By the end of the workshop, participants should be able to...

automatically collect YouTube data
process/clean it
do some basic (exploratory) analyses of user comments
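
To give a rough idea of what the last objective can look like, the following is a minimal sketch of a word frequency count in base R. It is not taken from the workshop materials: the `comments` vector is made-up toy data, and the cleaning step (lowercasing, keeping only the letters a-z and spaces) is a simplifying assumption that would also discard emojis and non-English characters.

```r
# Toy comments (hypothetical example data, not real YouTube comments)
comments <- c(
  "Great video, great music!",
  "I love this video",
  "Music is too loud"
)

# Lowercase the text and drop everything except the letters a-z and spaces
clean <- tolower(comments)
clean <- gsub("[^a-z ]", "", clean)

# Split into words on runs of spaces and drop empty strings
tokens <- unlist(strsplit(clean, " +"))
tokens <- tokens[tokens != ""]

# Count how often each word occurs and sort from most to least frequent
word_freq <- sort(table(tokens), decreasing = TRUE)

# Show the most frequent words
print(head(word_freq, 3))
```

For real comment data, the workshop sessions on processing and basic text analysis cover more robust approaches, including how to deal with emojis.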

Prerequisites

Participants should have at least some basic knowledge of R and, ideally, also of the tidyverse. Basic R knowledge can, for example, be acquired through the swirl course "R Programming" (see https://swirlstats.com/) or the RStudio Primer "Programming basics", both of which are available for free. There are also many brief online introductions to the tidyverse, such as this blog post by Dominic Royé or this workshop by Olivier Gimenez.

For the exercises as well as for "coding along" with the slides, access to the YouTube API is required. Information on this can be found in the slides on the YouTube API Setup.

Timetable & content

Day 1

Time	Topic	Slides	Exercises	Solutions
09:00 - 10:00	Introduction	HTML, PDF	-	-
10:00 - 11:00	The YouTube API	HTML, PDF	HTML	HTML
11:00 - 11:15	Coffee Break	-	-	-
11:15 - 12:15	Tools for collecting YouTube data	HTML, PDF	-	-
12:15 - 13:15	Lunch Break	-	-	-
13:15 - 14:45	Collecting YouTube data with R	HTML, PDF	HTML	HTML
14:45 - 15:00	Coffee Break	-	-	-
15:00 - 16:30	Processing and cleaning user comments	HTML, PDF	HTML	HTML

Day 2

Time	Topic	Slides	Exercises	Solutions
09:00 - 10:30	Basic text analysis of user comments	HTML, PDF	HTML	HTML
10:30 - 10:45	Coffee Break	-	-	-
10:45 - 12:15	Sentiment analysis of user comments	HTML, PDF	HTML	HTML
12:15 - 13:15	Lunch Break	-	-	-
13:15 - 14:45	Excursus: Retrieving video subtitles	HTML, PDF	-	-
14:45 - 15:00	Coffee Break	-	-	-
15:00 - 16:30	Recap, outlook, practice	HTML, PDF	-	-

Acknowledgements

Parts of the content have been developed by Julian Kohne for a previous version of this workshop. The materials (slides, exercises, etc.) have been created using the R packages xaringan, unilur, and woRkshoptools.

jobreu/youtube-workshop-gesis-2023
