This Streamlit-based application is a data analysis tool for running classification (k-NN, Random Forest) and clustering (k-Means, DBSCAN) algorithms on tabular datasets. It accepts CSV and Excel uploads and reports visualizations and performance metrics for the resulting models.
File Upload
- Upload your dataset in CSV or Excel format.
- The app processes and displays the dataset for further analysis.
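The upload step could be sketched as follows. This is a minimal illustration, not the app's actual code: `load_dataset` is a hypothetical helper, and choosing the reader by file extension is an assumption.

```python
import io
from pathlib import Path

import pandas as pd


def load_dataset(filename: str, buffer) -> pd.DataFrame:
    """Read a CSV or Excel file-like object into a DataFrame (hypothetical helper)."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(buffer)
    if suffix in {".xls", ".xlsx"}:
        # Reading Excel files needs an Excel engine such as openpyxl installed.
        return pd.read_excel(buffer)
    raise ValueError(f"Unsupported file type: {suffix or 'none'}")


# An in-memory CSV stands in for an uploaded file.
sample = io.StringIO("height,weight,label\n1.7,65,a\n1.8,80,b\n")
df = load_dataset("sample.csv", sample)
print(df.head())
```

In a Streamlit page, the object returned by `st.file_uploader` is file-like and exposes a `name` attribute, so it could be passed as `load_dataset(uploaded.name, uploaded)`.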
Data Preprocessing
- Automatically handles both numeric and categorical columns.
- Standardizes the features so that distance-based algorithms such as k-NN and k-Means are not dominated by features with large value ranges.
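One way to implement this kind of preprocessing with scikit-learn is sketched below. The README does not say how the app encodes categorical columns, so the one-hot encoding here is an assumption, and `build_preprocessor` is a hypothetical name.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(features: pd.DataFrame) -> ColumnTransformer:
    """Scale numeric columns and one-hot encode the rest (assumed strategy)."""
    numeric_cols = features.select_dtypes(include="number").columns.tolist()
    categorical_cols = features.select_dtypes(exclude="number").columns.tolist()
    return ColumnTransformer(
        transformers=[
            ("num", StandardScaler(), numeric_cols),
            ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
        ],
        sparse_threshold=0.0,  # always return a dense array
    )


# Toy data: two numeric columns and one categorical column with three levels.
features = pd.DataFrame(
    {
        "age": [20, 30, 40, 50],
        "income": [30000, 45000, 60000, 52000],
        "city": ["a", "b", "a", "c"],
    }
)
X = build_preprocessor(features).fit_transform(features)
print(X.shape)
```

The output has one column per numeric feature plus one column per category level. Each scaled numeric column has zero mean and unit variance.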
Classification Algorithms
- k-NN (k-Nearest Neighbors) Classification
- Splits the dataset into training and testing sets.
- Standardizes features and reduces dimensions using PCA.
- Trains a k-NN classifier and provides accuracy, confusion matrix, and classification report.
- Visualizes the decision boundary in a 2D plot.
- Random Forest Classification
- Splits the dataset into training and testing sets.
- Standardizes features and reduces dimensions using PCA.
- Combines the output of multiple decision trees to reach a single result.
- Visualizes the decision boundary in a 2D plot.
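The classification steps listed above (split, standardize, reduce with PCA, train, evaluate) can be sketched as a scikit-learn pipeline. The Iris dataset, the 70/30 split, and the hyperparameters are illustrative assumptions, not values taken from the app.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Example data; the app would use the uploaded dataset instead.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Assumed hyperparameters, chosen only for illustration.
models = {
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, clf in models.items():
    # The scaler and PCA are fitted on the training split only.
    pipe = make_pipeline(StandardScaler(), PCA(n_components=2), clf)
    pipe.fit(X_train, y_train)
    y_pred = pipe.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "report": classification_report(y_test, y_pred),
    }
    print(f"{name}: accuracy={results[name]['accuracy']:.3f}")
```

Wrapping the scaler and PCA in the pipeline keeps test data out of the fitting step, which avoids leaking information from the test split into the model.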
Clustering Algorithms
- k-Means Clustering
- Clusters the data and visualizes the result.
- Computes and displays inertia and silhouette score.
- DBSCAN Clustering
- Generates a PCA plot of the clustering result.
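The clustering metrics and the PCA projection mentioned above might be computed as in this sketch. The synthetic blobs, the number of clusters, and the DBSCAN parameters (`eps`, `min_samples`) are assumptions for illustration only.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for an uploaded dataset.
X, _ = make_blobs(n_samples=300, centers=3, n_features=4, cluster_std=0.6, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# k-Means: inertia is the within-cluster sum of squared distances;
# the silhouette score ranges from -1 to 1 (higher is better).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_scaled)
inertia = kmeans.inertia_
silhouette = silhouette_score(X_scaled, kmeans.labels_)
print(f"inertia={inertia:.2f}, silhouette={silhouette:.3f}")

# DBSCAN: the label -1 marks noise points, so it is excluded from the count.
dbscan_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X_scaled)
n_clusters = len(set(dbscan_labels) - {-1})
print(f"DBSCAN found {n_clusters} cluster(s)")

# Project to two principal components for plotting.
coords = PCA(n_components=2).fit_transform(X_scaled)
```

The `coords` array can then be drawn with, for example, `plt.scatter(coords[:, 0], coords[:, 1], c=dbscan_labels)`.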
Results Visualization
- Confusion matrix and classification reports for classification models.
- 2D decision boundary visualization using PCA.
- Interactive and static plots using Matplotlib and Seaborn.
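A 2D decision boundary of the kind described above can be drawn by classifying every point of a grid in PCA space. The sketch below assumes a k-NN classifier, the Iris dataset, and a 200 x 200 grid; none of these details come from the app itself.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_2d = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))
clf = KNeighborsClassifier(n_neighbors=5).fit(X_2d, y)

# Build a grid covering the projected data with a margin of 1 on each side.
x_min, x_max = X_2d[:, 0].min() - 1, X_2d[:, 0].max() + 1
y_min, y_max = X_2d[:, 1].min() - 1, X_2d[:, 1].max() + 1
xx, yy = np.meshgrid(
    np.linspace(x_min, x_max, 200),
    np.linspace(y_min, y_max, 200),
)

# Predict a class for every grid point and reshape back to the grid.
zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

fig, ax = plt.subplots(figsize=(6, 4))
ax.contourf(xx, yy, zz, alpha=0.3, cmap="viridis")  # predicted regions
ax.scatter(X_2d[:, 0], X_2d[:, 1], c=y, cmap="viridis", edgecolor="k", s=20)
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
ax.set_title("k-NN decision boundary in PCA space")
```

In a Streamlit page the figure could be shown with `st.pyplot(fig)`.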
Requirements
- Python 3.12
- Streamlit for the web app framework.
- Pandas for data manipulation.
- NumPy for numerical computations.
- Matplotlib and Seaborn for plotting and visualizations.
- scikit-learn for machine learning algorithms.
Installation
- Clone the repository:
git clone https://github.com/yourusername/data-analysis-app.git
cd data-analysis-app
- Install the required packages using the following command:
pip install -r requirements.txt
- Run the Streamlit app:
streamlit run data_app.py
- Open your browser and navigate to http://localhost:8501 to access the app.
Running with Docker
- Make sure Docker is installed, then clone the repository:
git clone https://github.com/yourusername/data-analysis-app.git
cd data-analysis-app
- Run:
docker-compose up --build
- Open your browser and navigate to http://localhost:8501 to access the app.
Project Structure
data-analysis-app/
│
├── img/ # Ionian University logo and favicon
│
├── pages/
│ ├── Home_Page.py # Home page for file upload
│ ├── 2D Visualization # 2D Visualization page
│ ├── Clustering_Algorithms.py # Clustering algorithms page
│ ├── Classification_Algorithms.py # Classification algorithms page
│ ├── Comparison.py # Comparison page
│ └── Information.py # Information page
│
├── Dockerfile
├── docker-compose.yml
├── data_app.py # Main entry point for the Streamlit app
├── requirements.txt # List of required packages
└── README.md # This readme file
Acknowledgments
- Streamlit, for providing an easy-to-use web application framework.
- The developers of Pandas, NumPy, Matplotlib, Seaborn, and scikit-learn, for their invaluable libraries.