Skip to content

A Python project that analyzes the correlation between various movie features, such as budget and gross earnings, using data visualization and statistical techniques.

License

Notifications You must be signed in to change notification settings

RafaelKarcz/PythonCorrelation

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Movie Data Analysis Project

This document provides instructions for setting up the Python environment on a Linux system using Anaconda and Conda, and it outlines the steps for running a detailed data analysis and visualization of a movie dataset, which is available on Kaggle.

Environment Setup

  1. Creating a New Conda Environment:

    To create a new isolated environment for the project:

    conda create --name myproject python=3.12

    This command creates a new environment named myproject with Python 3.12.

  2. Activating the Environment:

    Activate the created environment with the following command:

    conda activate myproject

    This ensures that any Python operations or package installations are confined to this environment.

  3. Installing Jupyter Lab:

    Install Jupyter Lab to work with notebooks and code interactively:

    conda install -c conda-forge jupyterlab

    Jupyter Lab is installed from the Conda-Forge channel.

  4. Starting Jupyter Lab:

    To launch Jupyter Lab:

    jupyter lab

    This command opens Jupyter Lab in the default web browser.

  5. Additional Package Installations:

    Install additional packages required for data manipulation and visualization:

    conda install pandas matplotlib seaborn

Project Execution

The core of this project involves analyzing and visualizing data from the movies.csv file, which can be downloaded from Kaggle at the following link:

The Python script processes this data to:

  • Load and clean data: Handles missing data and corrects data types.
  • Visualize relationships: Creates scatter plots, strip plots, and correlation matrices.
  • Analyze trends: Identifies trends and high correlations within the movie industry data.

To run the analysis, navigate to the script in Jupyter Lab and execute the cells sequentially. Ensure the movies.csv file is correctly placed in your project directory.

Contributing

Contributions to this project are welcome. Please ensure to maintain the environment specifications and follow the coding standards used in this project.

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

A Python project that analyzes the correlation between various movie features, such as budget and gross earnings, using data visualization and statistical techniques.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published