This section is dedicated to tutorials about linear algebra (the mathematical foundations of machine learning), machine learning algorithms (clustering, linear regression, classification, and so on), data science basics (data frames, data visualization, etc.), and principles of graph theory.
In this section, you will find the Jupyter notebooks for the tutorials I published on Medium. I suggest reading each tutorial together with its companion code, in the order given in the table below. For practical reasons, I have split some tutorials into several parts, so that one part can concentrate on the theory and the others on the programming. Tutorials dedicated only to theory have no linked Jupyter notebook containing the Python code used for the models and the graphs. I wrote and tested the code in Google Colab to make it reproducible.
I am also progressively adding some R tutorials, and I decided to upload the R scripts so you can test them. The table below lists the Colab notebooks, the R scripts, and the companion articles.
You may also find here some Colab notebooks without a theoretical tutorial (yet). I decided to upload the code before I have finished writing the theoretical part (this is indicated). I am convinced that the code alone is already useful. I will later publish the written article on Medium, with details and comments on the code.
You can open a GitHub issue for any request, comment, or problem you encounter.
- Tutorial List - The list of tutorials and corresponding code
- Utility - A list of functions and code you can use for your projects.
- Scripts - A list of scripts you can execute on your PC.
Tutorial | Notebook | Description |
---|---|---|
Data manipulation | notebook | Common data manipulation tasks and data issues - MEDIUM ARTICLE NOT YET PUBLISHED |
Pandas Cheatsheet | notebook | Introduction to Pandas library - MEDIUM ARTICLE NOT YET PUBLISHED |
Python Data Visualization | notebook | Introduction to data visualization with Python - MEDIUM ARTICLE NOT YET PUBLISHED |
Regular expression in Python | notebook | Regular expression in Python - MEDIUM ARTICLE NOT YET PUBLISHED |
Matrix operations for machine learning | notebook | Matrix operations for machine learning in Python - MEDIUM ARTICLE NOT YET PUBLISHED |
Matrix operations for machine learning - part 2 | notebook | Matrix operations for machine learning in Python, the second part - MEDIUM ARTICLE NOT YET PUBLISHED |
Tree classifiers | ---- | Introduction to tree classifiers, theory and math explained simply - MEDIUM ARTICLE NOT YET PUBLISHED |
Tree classifiers | notebook | Training of tree classifiers - MEDIUM ARTICLE NOT YET PUBLISHED |
Visualize decision tree | notebook | Visualization of decision tree - MEDIUM ARTICLE NOT YET PUBLISHED |
Train and visualize decision tree in R | R-script | Plot and visualize a decision tree in R - MEDIUM ARTICLE NOT YET PUBLISHED |
Evaluation metrics for classification - part I | notebook | How to calculate, code, and interpret evaluation metrics for classification - MEDIUM ARTICLE NOT YET PUBLISHED |
Evaluation metrics for classification - part II | --- | Part II, about imbalanced datasets and multiclass classification - MEDIUM ARTICLE NOT YET PUBLISHED |
Linear Regression - OLS | notebook | Introduction to linear regression and the least squares method - MEDIUM ARTICLE NOT YET PUBLISHED |
Evaluation metrics for regression | notebook | Evaluation metrics for regression - MEDIUM ARTICLE NOT YET PUBLISHED |
Train and visualize regression tree | notebook | Train and visualize a regression decision tree in Python - MEDIUM ARTICLE NOT YET PUBLISHED |
Linear regression in R | R-script | Train and visualize a linear regression model in R - MEDIUM ARTICLE NOT YET PUBLISHED |
Introduction to Python iGraph | Notebook | A notebook to refresh the use of Python iGraph |
Introduction to R iGraph | Notebook | A notebook to refresh the use of R iGraph |
Introduction to point processing | Jupyter Notebook | Whether you are doing medical image analysis or you use Photoshop, you are using point processing |
Introduction to Thresholding | Jupyter Notebook | A simple but powerful system for segmenting images |
A practical guide to neighborhood image processing | Jupyter Notebook | Love thy neighbors: how the neighbors influence a pixel |
A practical guide to morphological image processing | Jupyter Notebook | Simple but powerful operations to analyze images |
Dividi et Impera: A Practical Guide to BLOB Analysis and Extraction with Python | Jupyter Notebook | Simple yet powerful techniques to extract objects |
Harnessing the power of colors in Python | Jupyter Notebook | Color images have more hidden information than you think |
Image Segmentation with Simple and Elegant Methods | Jupyter Notebook | Why use a deep learning model with hundreds of layers? Sometimes there are simpler and faster models. |
A Guide to Geometric Transformation with Python | Jupyter Notebook | Why use Photoshop when you can have fun with Python? |
Graph ML: A Gentle Introduction to Graphs | -- | A deep introduction to these mysterious creatures. |
Graph ML: fantastic graphs and where to find them | -- | Why use a graph? For which applications? |
Graph ML: introduction to NetworkX | Jupyter Notebook | How to start handling graphs in Python using the most popular library |
Graph ML: Graph traversal algorithms in a nutshell | Jupyter Notebook | A quick glance at breadth-first and depth-first search algorithms for graph machine learning |
Graph ML: Introduction to Python iGraph | Jupyter Notebook | Python iGraph is a widely used library to handle graphs. How do you start using it, and why? |
Graph ML: How Do you Visualize a Large network? | Jupyter Notebook | Seeing is understanding: how to visualize large networks |
I provide some useful, ready-to-use functions and classes as Python files that you can import. You can find them in the utility folder.
Check the utility folder for usage examples and explanations. Each function is documented, and you can access the documentation provided.
For example, to use my regression_report function in Colab, you can import it this way:
import os
import sys
user = "SalvatoreRa"
repo = "tutorial"
src_dir = "machine%20learning/utility"  # no trailing slash: the URL below adds it
pyfile = "regression_report.py"  # name of the .py file to download
url = f"https://raw.githubusercontent.com/{user}/{repo}/main/{src_dir}/{pyfile}"
# download the raw file into the current working directory (Colab shell command)
!wget --no-cache --backups=1 {url}
# make sure the directory that holds the downloaded file is on the import path
sys.path.append(os.path.abspath("."))
# import the function
from regression_report import regression_report
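Outside Colab, where the `!wget` shell syntax is not available, the same download-and-import steps can be sketched with the Python standard library alone. This is only an illustration: the helper names `raw_github_url` and `download_and_import` are hypothetical and not part of the repository, and the branch name `main` is taken from the URL shown above.

```python
import importlib
import sys
import urllib.request
from pathlib import Path
from urllib.parse import quote


def raw_github_url(user, repo, path, branch="main"):
    """Build a raw.githubusercontent.com URL (hypothetical helper).

    quote() percent-encodes the space in "machine learning" and keeps "/".
    """
    return f"https://raw.githubusercontent.com/{user}/{repo}/{branch}/{quote(path)}"


def download_and_import(url, target_dir="."):
    """Download a single .py file and import it as a module (hypothetical helper)."""
    target = Path(target_dir).resolve()
    destination = target / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(destination))
    # The module can only be imported if its directory is on sys.path.
    if str(target) not in sys.path:
        sys.path.insert(0, str(target))
    importlib.invalidate_caches()
    return importlib.import_module(destination.stem)


url = raw_github_url(
    "SalvatoreRa", "tutorial", "machine learning/utility/regression_report.py"
)
print(url)
# Requires network access, so it is left commented out here:
# regression_report = download_and_import(url).regression_report
```

Importing downloaded code executes it, so it is worth reading the file before running this against any URL you do not control.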
Alternatively, you can download and use a utility file in Colab this way:
!pip install wget
import wget
# download the raw file; the space in "machine learning" must be encoded as %20
wget.download('https://raw.githubusercontent.com/SalvatoreRa/tutorial/main/machine%20learning/utility/utils_NA.py')
# the downloaded file is utils_NA.py, so the module to import is utils_NA
from utils_NA import *
import torch
import seaborn as sns
# generate different types of NA (df is assumed to be your own, already loaded dataset)
X_miss_mcar = produce_NA(df, p_miss=0.4, mecha="MCAR")
X_miss_mar = produce_NA(df, p_miss=0.4, mecha="MAR", p_obs=0.5)
X_miss_mnar = produce_NA(df, p_miss=0.4, mecha="MNAR", opt="logistic", p_obs=0.5)
X_miss_quant = produce_NA(df, p_miss=0.4, mecha="MNAR", opt="quantile", p_obs=0.5, q=0.3)
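To make the `mecha="MCAR"` option more concrete, here is a minimal, dependency-free sketch of what "missing completely at random" means: every cell is dropped independently with the same probability `p_miss`, regardless of its value. This is not the implementation behind `produce_NA`; the function `mask_mcar` and the toy data are assumptions made for illustration, and plain lists are used instead of a data frame to keep the example self-contained.

```python
import math
import random


def mask_mcar(rows, p_miss=0.4, seed=0):
    """Return a copy of rows where each cell becomes NaN with probability p_miss.

    Hypothetical helper: the decision for a cell never looks at the data,
    which is what makes the mechanism "completely at random".
    """
    rng = random.Random(seed)
    return [
        [math.nan if rng.random() < p_miss else value for value in row]
        for row in rows
    ]


# Toy dataset: 250 rows, 4 numeric columns.
rows = [[float(i + j) for j in range(4)] for i in range(250)]
masked = mask_mcar(rows, p_miss=0.4, seed=42)

n_cells = sum(len(row) for row in masked)
n_missing = sum(math.isnan(value) for row in masked for value in row)
print(f"fraction of missing cells: {n_missing / n_cells:.2f}")
```

The MAR and MNAR mechanisms differ in that the probability of a cell being missing depends on observed values (MAR) or on the missing values themselves (MNAR), so they cannot be reproduced with a single constant probability like this.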
File | Description |
---|---|
Regression report | Prints several regression metrics (similar to the classification report of scikit-learn) |
Upset plot | Plots an upset plot to visualize missing data and their distribution across the columns |
Random NA generation | Introduces random missing values into a dataset |
Utils NA | A set of utilities to generate and insert NA values in your dataset |
DR_utils | A set of utilities for dimensionality reduction techniques |
Correlation_utils | A set of utilities for correlation dimension |
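As a rough idea of what a regression report like the one listed above can contain, the sketch below computes four common regression metrics in plain Python. It is not the repository's `regression_report` function, whose exact output is not described here; the name `simple_regression_report` and the choice of metrics (MAE, MSE, RMSE, R²) are assumptions.

```python
import math


def simple_regression_report(y_true, y_pred):
    """Return common regression metrics as a dict (hypothetical helper)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mse = sum(e * e for e in errors) / n           # mean squared error
    rmse = math.sqrt(mse)                          # root mean squared error
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined when the target is constant.
    r2 = 1 - ss_res / ss_tot if ss_tot > 0 else math.nan
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


report = simple_regression_report(
    y_true=[3.0, -0.5, 2.0, 7.0],
    y_pred=[2.5, 0.0, 2.0, 8.0],
)
for name, value in report.items():
    print(f"{name:>5}: {value:.4f}")
```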
Here you can find a list of scripts that were used to generate images for the tutorials or that can be used to analyze data and models. You can easily adapt them to your needs.
For example, to use my MAR script on your PC, you can simply run it this way:
python3 MAR.py
Or alternatively:
python3.8 MAR.py
File | Description |
---|---|
MAR | Loop to test different algorithms for MAR missing value imputation. The script generates missing values, tests different imputation methods, and generates the plots |
MNAR | Loop to test different algorithms for MNAR missing value imputation. The script generates missing values, tests different imputation methods, and generates the plots |
MCAR | Loop to test different algorithms for MCAR missing value imputation. The script generates missing values, tests different imputation methods, and generates the plots |
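If you want to run several of these scripts one after the other, a small driver along the following lines may help. It is a sketch under assumptions: the file names `MNAR.py` and `MCAR.py` are inferred from the table (only `MAR.py` appears in the commands above), the scripts are assumed to sit in the current directory, and the helpers `build_command` and `run_scripts` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_command(script, interpreter=sys.executable):
    """Return the command list used to run one script (hypothetical helper)."""
    return [interpreter, str(script)]


def run_scripts(scripts, workdir="."):
    """Run each script in turn and collect its exit code.

    Scripts that are not found are recorded as None instead of raising.
    """
    results = {}
    for name in scripts:
        path = Path(workdir) / name
        if not path.exists():
            results[name] = None
            continue
        completed = subprocess.run(build_command(path), check=False)
        results[name] = completed.returncode
    return results


# File names other than MAR.py are assumed from the table above.
results = run_scripts(["MAR.py", "MNAR.py", "MCAR.py"])
for name, code in results.items():
    status = "not found" if code is None else f"exit code {code}"
    print(f"{name}: {status}")
```

Using `sys.executable` runs the scripts with the same interpreter (and installed packages) as the driver, which avoids having to choose between `python3` and `python3.8` by hand.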
This project is licensed under the MIT License
Comment or open an issue on GitHub