This project demonstrates a complete data science workflow on historical data for France's CAC 40 index (ticker FCHI). The workflow covers data cleaning, exploratory data analysis (EDA), model building, and visualization.
project-name/
├── LICENSE
├── README.md
├── requirements.txt
├── data/
│   ├── processed/
│   │   ├── CAC 40 Historical Data.csv
│   │   └── CAC 40 History Data.xlsx
│   └── raw/
│       ├── CAC 40 Historical Data - Raw.csv
│       └── CAC 40 Historical Data.xslx
├── notebooks/
│   ├── 01_Data_Cleaning.ipynb
│   ├── 02_EDA.ipynb
│   ├── 03_Model_Building.ipynb
│   ├── 04_Visualisations.ipynb
│   ├── CAC 40 Historical Data.csv
│   └── CAC 40 Historical Data.xslx
├── results/
│   ├── FCHI_CAC_40_All_Plots.png
│   ├── FCHI_CAC_40_General_Plot.png
│   └── FCHI_CAC_40_Seaborn_Pairplot.png
└── scripts/
    ├── 01_Data_Cleaning_with_functions.py
    ├── 01_Data_Cleaning.py
    ├── 02_EDA_functions.py
    ├── 02_EDA.py
    ├── 03_Model_Building_functions.py
    ├── 03_Model_Building.py
    ├── 04_Visualisations_functions.py
    └── 04_Visualisations.py
- Create Virtual Environment:
python -m venv .venv
- Activate Virtual Environment:
- Windows:
.venv\Scripts\activate
- macOS and Linux:
source .venv/bin/activate
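To confirm that the environment is active before installing anything, a quick check can be run from Python. This is a minimal sketch; `in_virtualenv` is a hypothetical helper name and is not part of the project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```

If this prints `False` after activation, the shell is probably still using the system interpreter.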
- Install Dependencies:
pip install -r requirements.txt
- Navigate to Project Directory:
cd path/to/project-name/
- Run Scripts:
- Data Cleaning:
python scripts/01_Data_Cleaning.py
- EDA:
python scripts/02_EDA.py
- Model Building:
python scripts/03_Model_Building.py
- Visualization:
python scripts/04_Visualisations.py
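The four stages can also be chained from a single Python entry point. The sketch below assumes the scripts are meant to run in the same order as the notebooks and uses the file names as they appear under scripts/ in the project tree; `build_commands` and `run_all` are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys
from pathlib import Path

# File names copied from the scripts/ directory in the project tree.
SCRIPTS = [
    "01_Data_Cleaning.py",
    "02_EDA.py",
    "03_Model_Building.py",
    "04_Visualisations.py",
]


def build_commands(scripts_dir="scripts"):
    """Return one command (as an argument list) per stage, in order."""
    # sys.executable keeps the run inside the current interpreter,
    # which is the venv's Python when .venv is activated.
    return [[sys.executable, str(Path(scripts_dir) / name)] for name in SCRIPTS]


def run_all(scripts_dir="scripts"):
    """Run each stage in order and stop at the first failure."""
    for command in build_commands(scripts_dir):
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

Calling `run_all()` from the project root runs the stages one after another and aborts as soon as one of them fails.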
- Run Jupyter Notebooks:
jupyter notebook
Open the desired notebook (e.g., 01_Data_Cleaning.ipynb, 02_EDA.ipynb, etc.).
Run the notebooks in the following order:
01_Data_Cleaning.ipynb
02_EDA.ipynb
03_Model_Building.ipynb
04_Visualisations.ipynb
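The same order can be enforced without opening the browser by executing the notebooks headlessly. This is a sketch that assumes `nbconvert` is installed alongside Jupyter (it is not listed among the requirements) and that the notebooks live in notebooks/ as in the project tree; `nbconvert_commands` and `run_notebooks` are hypothetical helper names.

```python
import subprocess
from pathlib import Path

# Notebook names in the order given above.
NOTEBOOKS = [
    "01_Data_Cleaning.ipynb",
    "02_EDA.ipynb",
    "03_Model_Building.ipynb",
    "04_Visualisations.ipynb",
]


def nbconvert_commands(notebooks_dir="notebooks"):
    """Build one `jupyter nbconvert` command per notebook, in order."""
    commands = []
    for name in NOTEBOOKS:
        notebook = str(Path(notebooks_dir) / name)
        # --execute runs every cell; --inplace writes the outputs back
        # into the same .ipynb file instead of creating a copy.
        commands.append(
            ["jupyter", "nbconvert", "--to", "notebook",
             "--execute", "--inplace", notebook]
        )
    return commands


def run_notebooks(notebooks_dir="notebooks"):
    """Execute the notebooks in order, stopping at the first error."""
    for command in nbconvert_commands(notebooks_dir):
        subprocess.run(command, check=True)
```

Because `--inplace` overwrites the notebook files, it may be worth committing or backing them up before calling `run_notebooks()`.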
- Python 3.6+
- pandas
- seaborn
- matplotlib
- scikit-learn
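A quick way to verify these requirements after installation is to ask the import system whether each package can be found. This is a minimal sketch; `missing_modules` and `python_is_supported` are hypothetical helpers, and note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util
import sys

# Import names for the packages listed above
# (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["pandas", "seaborn", "matplotlib", "sklearn"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(minimum=(3, 6)):
    """Return True when the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


print("Python version OK:", python_is_supported())
print("Missing packages:", missing_modules() or "none")
```

An empty result from `missing_modules()` only shows that the packages are importable; it does not check their versions against requirements.txt.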