This project analyzes hospital costs by service code. The main objectives are to identify the cheapest and costliest hospitals for each service code and to calculate the mean and standard deviation of costs per service code.
- Installation
- Usage
- Main Processes
- Contact
- Clone the repository:
git clone https://github.com/Rkpani05/Hospital_Cost_Analysis.git
- Navigate to the project directory:
cd path_to_directory
- Install the required packages using the requirements.txt file:
pip install -r requirements.txt
- Ensure you have the necessary data file (path_to_your_file.xlsx) in the project directory.
- Run the Jupyter notebook to perform the analysis:
jupyter notebook Hospital_Cost_Analysis.ipynb
- Follow the instructions in the notebook for a step-by-step analysis.
Data Loading: The data is loaded from an Excel file into a Pandas dataframe.
Data Exploration and Cleaning: The dataset is explored to understand its structure, and any missing values or duplicates are handled.
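As a rough illustration of the exploration and cleaning step, the sketch below counts missing values and duplicates, then removes them. The column names (`hospital`, `service_code`, `cost`), the sample rows, and the helper `clean_costs` are assumptions made for this example; the notebook may handle missing values differently (for example, by imputing rather than dropping).

```python
import pandas as pd

# Hypothetical sample standing in for the Excel data; column names are assumptions.
raw = pd.DataFrame(
    {
        "hospital": ["A", "A", "B", "C", "D"],
        "service_code": ["S1", "S1", "S1", "S2", "S2"],
        "cost": [100.0, 100.0, 250.0, None, 80.0],
    }
)


def clean_costs(df: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows, then rows with a missing cost."""
    deduped = df.drop_duplicates()
    return deduped.dropna(subset=["cost"]).reset_index(drop=True)


print(raw.isna().sum())        # missing values per column
print(raw.duplicated().sum())  # number of exact duplicate rows

cleaned = clean_costs(raw)
print(cleaned)
```

Dropping rows is the simplest policy; whether it is appropriate depends on how much data is missing and why.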
Outlier Detection: Outliers in the cost column are detected using the IQR method.
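A minimal sketch of the IQR method follows, assuming the conventional multiplier of 1.5 and a cost column held in a pandas Series. The function name `iqr_outliers` and the sample values are hypothetical, and the notebook may apply the rule per service code rather than over the whole column.

```python
import pandas as pd


def iqr_outliers(costs: pd.Series, k: float = 1.5) -> pd.Series:
    """Return a boolean mask that is True where a cost lies outside
    [Q1 - k * IQR, Q3 + k * IQR]."""
    q1 = costs.quantile(0.25)
    q3 = costs.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return (costs < lower) | (costs > upper)


# Hypothetical costs with one obviously extreme value.
costs = pd.Series([100.0, 110.0, 120.0, 130.0, 1000.0], name="cost")
mask = iqr_outliers(costs)
print(costs[mask])
```

To apply the same rule within each service code, the function could be passed to `groupby(...)["cost"].transform(iqr_outliers)` on the full dataframe.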
Ranking Hospitals: Hospitals are ranked based on their costs for each service code to identify the cheapest and costliest ones.
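One way to find the cheapest and costliest hospital per service code is to group by service code and look up the rows holding the minimum and maximum cost. This is a sketch under assumed column names (`hospital`, `service_code`, `cost`) with made-up data; the helper `cheapest_and_costliest` is hypothetical and not taken from the notebook.

```python
import pandas as pd

# Hypothetical data; column names are assumptions for illustration.
df = pd.DataFrame(
    {
        "hospital": ["A", "B", "C", "A", "B"],
        "service_code": ["S1", "S1", "S1", "S2", "S2"],
        "cost": [100.0, 250.0, 175.0, 90.0, 60.0],
    }
)


def cheapest_and_costliest(data: pd.DataFrame) -> pd.DataFrame:
    """Return one row per service code with its cheapest and costliest hospital."""
    grouped = data.groupby("service_code")["cost"]
    cols = ["service_code", "hospital", "cost"]
    # idxmin/idxmax give the row label of the extreme cost within each group.
    cheapest = data.loc[grouped.idxmin(), cols]
    costliest = data.loc[grouped.idxmax(), cols]
    merged = cheapest.merge(
        costliest, on="service_code", suffixes=("_cheapest", "_costliest")
    )
    return merged.set_index("service_code")


summary = cheapest_and_costliest(df)
print(summary)
```

When several hospitals tie on cost, `idxmin` and `idxmax` return only the first match; a full ranking with `groupby(...)["cost"].rank()` would keep the ties visible.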
SQLite Database Operations: The data is stored in an SQLite database for further SQL-based analysis.
SQL Analysis: SQL queries are used to calculate the mean and standard deviation of the costs for each service code.
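The two SQLite steps above can be sketched with Python's built-in `sqlite3` module. SQLite has no built-in standard deviation aggregate, so this example computes the population variance in SQL as the mean of squares minus the square of the mean and takes the square root in Python. The table name `hospital_costs`, the column names, and the sample rows are assumptions; the notebook may use a different approach (for example, a user-defined aggregate or the sample standard deviation).

```python
import math
import sqlite3

# Hypothetical rows: (hospital, service_code, cost).
rows = [
    ("A", "S1", 100.0),
    ("B", "S1", 200.0),
    ("A", "S2", 50.0),
    ("B", "S2", 50.0),
]

conn = sqlite3.connect(":memory:")  # in-memory database for the example
conn.execute(
    "CREATE TABLE hospital_costs (hospital TEXT, service_code TEXT, cost REAL)"
)
conn.executemany("INSERT INTO hospital_costs VALUES (?, ?, ?)", rows)

query = """
SELECT service_code,
       AVG(cost) AS mean_cost,
       AVG(cost * cost) - AVG(cost) * AVG(cost) AS variance
FROM hospital_costs
GROUP BY service_code
ORDER BY service_code
"""

# max(..., 0.0) guards against tiny negative variances from rounding.
stats = {
    code: (mean, math.sqrt(max(variance, 0.0)))
    for code, mean, variance in conn.execute(query)
}
conn.close()

print(stats)
```

The mean-of-squares formula is compact but can lose precision when costs are large and tightly clustered; for such data, computing the deviation in pandas may be more reliable.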
For any queries or feedback, please reach out to:
Name: Rohit Kumar Pani
Email: rk.pani2002@gmail.com