TMDB-Analysis

This project explores the TMDB movies dataset and computes statistics on key attributes such as budget, revenue, profit and runtime. The script tries to answer the following questions.

1. First, some general questions about the dataset:

a. Which movies earned the most and the least profit?

b. Which movies had the longest and the shortest runtime?

c. Which movies had the largest and the smallest budget?

d. Which movies had the highest and the lowest revenue?

e. What is the average runtime of all movies?

f. Which year was the most profitable for movies? (based on the profits of movies in each year)
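
The general questions above can be sketched with pandas as follows. This is a minimal illustration, not the code in data_analysis.py: the column names (original_title, budget, revenue, runtime, release_year) are assumed to match the dataset, the sample rows are made up, and "most profitable year" is read here as the year with the highest total profit.

```python
import pandas as pd

# Hypothetical sample rows; the real script would read tmdb-movies.csv instead.
movies = pd.DataFrame({
    "original_title": ["A", "B", "C", "D"],
    "budget": [100, 50, 200, 10],
    "revenue": [300, 20, 250, 90],
    "runtime": [120, 95, 150, 88],
    "release_year": [2010, 2010, 2011, 2011],
})

# Profit is taken to be revenue minus budget (an assumption about the script).
movies["profit"] = movies["revenue"] - movies["budget"]


def extremes(df, column):
    """Return the titles with the largest and the smallest value in `column`."""
    highest = df.loc[df[column].idxmax(), "original_title"]
    lowest = df.loc[df[column].idxmin(), "original_title"]
    return highest, lowest


most_profit, least_profit = extremes(movies, "profit")
longest, shortest = extremes(movies, "runtime")
average_runtime = movies["runtime"].mean()

# Total profit per release year, and the year where that total is largest.
profit_by_year = movies.groupby("release_year")["profit"].sum()
best_year = profit_by_year.idxmax()

print(most_profit, least_profit, longest, shortest, average_runtime, best_year)
```

The same `extremes` helper covers the budget and revenue questions by passing "budget" or "revenue" as the column.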

2. Next, specific questions about the characteristics that the most profitable movies share:

a. Average duration of movies.

b. Average budget.

c. Average revenue.

d. Average profits.

e. Which director directed the most films?

f. Which cast members have appeared the most?

g. Which genres were the most successful?

h. In which month were the most movies released, across all years?

i. Which month made the most profit?
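
A rough sketch of how questions e to i above could be answered with pandas. It assumes that cast and genres are stored as "|"-separated strings and that release_date can be parsed as a date; the sample rows and the profit threshold of 50 are purely hypothetical and not taken from the script.

```python
import pandas as pd

# Hypothetical sample rows with "|"-separated cast and genres (assumed format).
movies = pd.DataFrame({
    "director": ["X", "Y", "X", "Z"],
    "cast": ["Ann|Bob", "Bob|Cy", "Ann|Bob", "Dee"],
    "genres": ["Action|Drama", "Drama", "Action|Drama", "Comedy"],
    "release_date": ["2010-06-01", "2010-12-15", "2011-06-20", "2011-06-05"],
    "profit": [200, 400, 120, -10],
})

# Keep only the "most profitable" movies; the cut-off is an illustrative choice.
profitable = movies[movies["profit"] >= 50]


def most_common(series, sep="|"):
    """Split each entry on `sep` and return the value that occurs most often."""
    return series.str.split(sep).explode().value_counts().idxmax()


top_director = profitable["director"].value_counts().idxmax()
top_actor = most_common(profitable["cast"])
top_genre = most_common(profitable["genres"])

# Month-based questions use all movies, grouped by the release month (1-12).
months = pd.to_datetime(movies["release_date"]).dt.month
busiest_month = months.value_counts().idxmax()
most_profitable_month = movies.groupby(months)["profit"].sum().idxmax()

print(top_director, top_actor, top_genre, busiest_month, most_profitable_month)
```

The averages in a to d follow the same pattern, for example `profitable["profit"].mean()`.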

3. Finally, we analyse some trends and relationships among attributes:

a. How have movie production trends varied over the years?

b. What are the top 20 highest-grossing movies?

c. What are the top 20 most expensive movies?

d. How do budgets correlate with revenues? Do higher-budget movies have higher revenue?

e. What runtimes are associated with each genre?
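
The trend questions above map onto a handful of pandas calls. The sketch below uses made-up rows and assumed column names, takes the top 3 instead of the top 20 to keep the sample small, and uses the Pearson correlation as one possible measure of the budget and revenue relationship.

```python
import pandas as pd

# Hypothetical sample rows; column names are assumed to match the dataset.
movies = pd.DataFrame({
    "original_title": ["A", "B", "C", "D", "E"],
    "budget": [10, 20, 30, 40, 50],
    "revenue": [25, 35, 70, 80, 120],
    "runtime": [90, 100, 110, 120, 130],
    "genres": ["Drama", "Action|Drama", "Action", "Comedy", "Action"],
    "release_year": [2010, 2010, 2011, 2011, 2011],
})

# a. Number of movies released per year, in chronological order.
movies_per_year = movies["release_year"].value_counts().sort_index()

# b, c. Top N by revenue and by budget (N = 3 here; the README asks for 20).
top_grossing = movies.nlargest(3, "revenue")["original_title"].tolist()
most_expensive = movies.nlargest(3, "budget")["original_title"].tolist()

# d. Pearson correlation between budget and revenue.
budget_revenue_corr = movies["budget"].corr(movies["revenue"])

# e. Mean runtime per genre, counting a movie once for each genre it lists.
by_genre = movies.assign(genre=movies["genres"].str.split("|")).explode("genre")
runtime_by_genre = by_genre.groupby("genre")["runtime"].mean()

print(movies_per_year, top_grossing, most_expensive, sep="\n")
print(budget_revenue_corr, runtime_by_genre, sep="\n")
```

A positive correlation only says that budget and revenue tend to rise together in the data; it does not show that a larger budget causes higher revenue.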

How to run

Python is required to run this script. You can download it from python.org or use Anaconda. Python 3.x is recommended.

After installing Python, install the required packages: open a command prompt in the project directory and run pip install -r requirements.txt. If you prefer, you can create a virtual environment first and install the packages into it.

To run the script, type python data_analysis.py in the command prompt. However, I highly recommend running it from a text editor or an IDE. I use PyCharm from JetBrains, but Visual Studio, Visual Studio Code, Spyder (installed with Anaconda) or any other editor or IDE you like will work as well.

Sources referred

I referred to many sources while working on this project. Some of them include:

Stack Overflow
Documentation of packages used (pandas, matplotlib, seaborn)
Udacity course lessons

Issues

If you find any problems with the script, or with the project in general, please file them under Issues.

Contributions

If you would like to contribute or want to make any changes, you can submit a Pull Request and I will make sure to follow up.

Licence

This repo is shared under the Apache License 2.0.
