Evaluating the Impact of Number of Episodes on TV Show Ratings

Overview

This project investigates whether there is a correlation between the number of episodes in a TV show and its IMDb rating, providing insights into how show length might influence audience perception.

Contributors

Author	Contact
Kanaya Hendra	kanayalalityahendra@tilburguniversity.edu
Owen van Lith	o.m.vanlith@tilburguniversity.edu
Lam Nguyen	l.k.l.nguyen@tilburguniversity.edu
Pepijn Kars	p.kars@tilburguniversity.edu
Jason Ye	a.s.ye@tilburguniversity.edu

1. Introduction

1.1 Research Motivation

With this research, we aim to explore the factors influencing TV show ratings to provide valuable insights for content creators, streaming platforms, and marketers. Understanding these factors can enhance content strategies, optimize recommendations, and improve user experience. Specifically, this research seeks to determine whether viewers prefer binge-watching shorter series or are more engaged with longer ones, suggesting an ideal number of episodes for a TV show.

1.2 Relevance

This research is relevant because it allows filmmakers and marketers to understand the key factors influencing TV shows ratings, leading to a better benchmarking and decision-making in their marketing strategies. With this information, filmmakers can make better decisions in the number of episodes they are producing. Furthermore, IMDb can improve its recommendation system, offering more personalized movie recommendations, making it easier for users to find movies they will enjoy.

1.3 Research Question

Does the number of episodes significantly influence the ratings of TV shows?

2. Method and results

2.1 Data Source

To begin, we review all available datasets from IMDb's Non-Commercial Datasets to identify those that contain the necessary information for our research. Specifically, we focus on datasets that include TV show titles, identifiers, the number of episodes, and ratings.

2.2 Variables

Subsequently, we choose to work with the following variables:

Dataset	Variable	Description
title.episode	tconst	identifier of episode
	parentTconst	identifier of the parent TV Series
	seasonNumber	season number the episode belongs to
	episodeNumber	episode number of the tconst in the TV series
title.ratings	tconst	unique identifier of the title
	averageRating	weighted average of all the individual user ratings
	numVotes	number of votes the title has received
title.basics	tconst	unique identifier of the title
	titleType	the type/format of the title
	primaryTitle	the more popular title
	originalTitle	original title, in the original language
	isAdult	0: non-adult title; 1: adult title
	startYear	the release year of a title
	endYear	TV Series end year
	runtimeMinutes	primary runtime of the title, in minutes
	genres	up to three genres associated with the title

2.3 Research Method

To explore these relationships, regression analysis will be used as the primary research method. This approach is ideal for quantifying the relationship between a dependent variable (TV show ratings) and an independent variable (number of episodes). By applying this method, we can measure how changes in the number of episodes impact ratings and determine the strength of this effect. Regression is especially well-suited for this research question because it reveals both the strength and direction of the relationship between the number of episodes and TV show ratings. It also strengthens the analysis by considering other variables, ensuring that the link between episode count and ratings isn’t influenced by unrelated factors.

2.4 Results

In our basic model without any control variables, we can see that the number of episodes have a slightly negative effect on the average IMDb rating With a P-value smaller than significance level of 5%, we can conclude that the number of episodes has a negative effect. However this model is without any control variables, so we need to expand our model.

In our main model with control variables we can see that the coefficient of Number of episodes is negative. However this is not significant anymore, as the P-value for this variable is 0.772 which is larger than 0.05 If we look at our control variables we see that they all are significant: Popularity has significant positive effect on the average rating. A short runtime has a significant negative effect on the average rating. Old has a significant negative effect on the average rating. Many episodes has a negative effect on the average rating.

3. Repository overview

|-- LICENSE
|-- Metadata.md
|-- README.md
|-- SRC
|   |-- Analysis
|   |   |-- Data_Analysis.Rmd
|   |   |-- Data_analysis.R
|   |   `-- makefile
|   `-- Data preparation
|       |-- Data Prep.Rmd
|       |-- Data_Cleaning.R
|       |-- Data_Exploration.R
|       |-- Data_Loading.R
|       `-- makefile
|-- data
|   |-- episode.csv
|   |-- ratings.csv
|   `-- titles.csv
|-- output
`-- team-project-no-vs-code-team-9_dprep.Rproj

4. Running instructions

Download and install R and RStudio: https://tilburgsciencehub.com/topics/computer-setup/software-installation/rstudio/r/

Download and install Git:https://tilburgsciencehub.com/topics/automation/version-control/start-git/git/

Sign up on Github:https://github.com/

Install make:https://tilburgsciencehub.com/topics/automation/automation-tools/makefiles/make/

Access to the datasets at:

title episode:https://datasets.imdbws.com/title.episode.tsv.gz

title.ratings:https://datasets.imdbws.com/title.ratings.tsv.gz

title.basics:https://datasets.imdbws.com/title.basics.tsv.gz

Necessary libraries in R

library(readr)
library(dplyr)
library(ggplot2)
library(knitr)
library(stringr)
library(car)
library(data.table)
library(readxl)

Installing tinytex to knit the Rmarkdown

install.packages('tinytex')
tinytex::install_tinytex()

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Evaluating the Impact of Number of Episodes on TV Show Ratings

Overview

Contributors

1. Introduction

1.1 Research Motivation

1.2 Relevance

1.3 Research Question

2. Method and results

2.1 Data Source

2.2 Variables

2.3 Research Method

2.4 Results

3. Repository overview

4. Running instructions

Necessary libraries in R

Installing tinytex to knit the Rmarkdown

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 148 Commits
SRC		SRC
LICENSE		LICENSE
Metadata.md		Metadata.md
README.md		README.md
makefile		makefile
team-project-no-vs-code-team-9_dprep.Rproj		team-project-no-vs-code-team-9_dprep.Rproj

License

course-dprep/effect-of-episode-count-on-IMDB-ratings

Folders and files

Latest commit

History

Repository files navigation

Evaluating the Impact of Number of Episodes on TV Show Ratings

Overview

Contributors

1. Introduction

1.1 Research Motivation

1.2 Relevance

1.3 Research Question

2. Method and results

2.1 Data Source

2.2 Variables

2.3 Research Method

2.4 Results

3. Repository overview

4. Running instructions

Necessary libraries in R

Installing tinytex to knit the Rmarkdown

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages