"Toward Talent Scientist: Sharing and Learning Together" --- Jingwei Too
- This toolbox contains 7 widely used machine learning algorithms for regression.
- `Demo_LR` and `Demo_LASSO` provide examples of how to use these methods on a benchmark dataset.

You may switch the algorithm by changing the `lr` in `from MLR.lr import jkfold` to another abbreviation:

- To use linear regression (LR), write `from MLR.lr import jkfold`
- To use decision tree (DT), write `from MLR.dt import jkfold`
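
Because each algorithm lives in a module named after its abbreviation, the switch can also be made programmatically. The sketch below is not part of the toolbox: `load_validator` is a hypothetical helper, and it assumes that every module listed in this README exposes `jho`, `jkfold`, and `jloo`, as the demos suggest.

```python
import importlib

# Abbreviations and validation function names as listed in this README.
ALGORITHMS = {"lr", "dt", "lasso", "ridge", "svr", "nn", "en"}
SCHEMES = {"jho", "jkfold", "jloo"}


def load_validator(algo, scheme):
    """Return the function `scheme` from the module `MLR.<algo>`.

    Hypothetical helper: assumes the MLR package is importable and that
    each algorithm module provides all three validation functions.
    """
    if algo not in ALGORITHMS:
        raise ValueError(f"unknown algorithm abbreviation: {algo!r}")
    if scheme not in SCHEMES:
        raise ValueError(f"unknown validation scheme: {scheme!r}")
    module = importlib.import_module(f"MLR.{algo}")
    return getattr(module, scheme)
```

With the `MLR` package on the import path, `load_validator('dt', 'jloo')` would play the same role as `from MLR.dt import jloo`.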

Inputs:

- `feat`: feature vector matrix (instances x features)
- `label`: label matrix (instances x 1)
- `opts`: parameter settings
  - `ho`: ratio of testing data in hold-out validation
  - `kfold`: number of folds in k-fold cross-validation

Outputs:

- `mdl`: machine learning model (it contains several results)
  - `mse`: mean squared error
  - `r2`: R-squared score
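
For reference, the two scores can be written out in plain Python. This is a minimal sketch of the standard definitions, not the toolbox's own code; the function names are illustrative, and how the toolbox aggregates scores across folds is not specified here.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot  # undefined when all targets are equal


y_true = [1.0, 2.0, 3.0]
print(mean_squared_error(y_true, [1.0, 2.0, 3.0]))  # 0.0 (perfect predictions)
print(r2_score(y_true, [2.0, 2.0, 2.0]))            # 0.0 (always predicting the mean)
```

A lower `mse` and an `r2` closer to 1 both indicate a better fit.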

Three validation strategies are available, listed below (LR is used as an example):

- Hold-out validation: `from MLR.lr import jho`
- K-fold cross-validation: `from MLR.lr import jkfold`
- Leave-one-out cross-validation: `from MLR.lr import jloo`
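
The three strategies differ only in how the instances are divided into training and testing sets. The sketch below illustrates this with index lists in plain Python; the helper names, the shuffling, and the rounding of the test-set size are assumptions made for illustration and may not match how `jho`, `jkfold`, and `jloo` split the data internally.

```python
import random


def holdout_split(n, ho, seed=0):
    """Return (train_idx, test_idx) with a fraction `ho` of the n samples held out."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(ho * n))
    return idx[n_test:], idx[:n_test]


def kfold_splits(n, kfold, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Strided slices give `kfold` folds whose sizes differ by at most one.
    folds = [idx[i::kfold] for i in range(kfold)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train_idx, test_idx


def loo_splits(n):
    """Leave-one-out: n splits, each testing on a single sample."""
    for i in range(n):
        yield [j for j in range(n) if j != i], [i]


train_idx, test_idx = holdout_split(10, ho=0.3)
print(len(train_idx), len(test_idx))  # 7 3
```

Hold-out trains one model, k-fold trains `kfold` models, and leave-one-out trains one model per instance, so the last option is the most expensive on large datasets.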

Example: linear regression (LR) with k-fold cross-validation

```python
import numpy as np
# change this to switch algorithm & types of validation (jho, jkfold, jloo)
from MLR.lr import jkfold
import matplotlib.pyplot as plt
from sklearn import datasets

# load data
X, Y = datasets.load_diabetes(return_X_y=True)
feat = X[:, np.newaxis, 2]
label = Y

# parameters
kfold = 10
opts = {'kfold': kfold}

# LR with k-fold
mdl = jkfold(feat, label, opts)

# overall mse
mse = mdl['mse']

# overall r2 score
r2 = mdl['r2']
```

Example: lasso regression (LASSO) with hold-out validation

```python
import numpy as np
# change this to switch algorithm & types of validation (jho, jkfold, jloo)
from MLR.lasso import jho
import matplotlib.pyplot as plt
from sklearn import datasets

# load data
X, Y = datasets.load_diabetes(return_X_y=True)
feat = X[:, np.newaxis, 2]
label = Y

# parameters
ho = 0.3  # ratio of testing data
alpha = 1
opts = {'alpha': alpha, 'ho': ho}

# LASSO with hold-out
mdl = jho(feat, label, opts)

# overall mse
mse = mdl['mse']

# overall r2 score
r2 = mdl['r2']
```

Example: decision tree (DT) with leave-one-out cross-validation

```python
import numpy as np
# change this to switch algorithm & types of validation (jho, jkfold, jloo)
from MLR.dt import jloo
import matplotlib.pyplot as plt
from sklearn import datasets

# load data
X, Y = datasets.load_diabetes(return_X_y=True)
feat = X[:, np.newaxis, 2]
label = Y

# parameters
maxDepth = 5  # maximum depth of tree
opts = {'maxDepth': maxDepth}

# DT with leave-one-out
mdl = jloo(feat, label, opts)

# overall mse
mse = mdl['mse']

# overall r2 score
r2 = mdl['r2']
```

Requirements:

- Python 3
- NumPy
- Pandas
- scikit-learn
- Matplotlib

- Click on the name of an algorithm to check its parameters.
- Use `opts` to set the specific parameters.
- If you do not set extra parameters, the algorithm will use its default settings.
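
A common way to implement this kind of fallback is to merge the user's `opts` over a dictionary of defaults. The sketch below shows the pattern only; `resolve_opts` and the default values are hypothetical and are not taken from the toolbox.

```python
# Hypothetical defaults, for illustration only; the toolbox's real values may differ.
DEFAULTS = {"ho": 0.3, "kfold": 10}


def resolve_opts(opts=None, defaults=DEFAULTS):
    """Return a new dict in which user-supplied options override the defaults."""
    merged = dict(defaults)    # copy, so the defaults are never mutated
    merged.update(opts or {})  # user values win on key collisions
    return merged


print(resolve_opts({"kfold": 5}))  # {'ho': 0.3, 'kfold': 5}
```

Any key the caller leaves out keeps its default, which mirrors the behaviour described in the notes above.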
No. | Abbreviation | Name |
---|---|---|
07 | en | Elastic Net |
06 | nn | Neural Network |
05 | svr | Support Vector Regression |
04 | ridge | Ridge Regression |
03 | lasso | Lasso Regression |
02 | dt | Decision Tree |
01 | lr | Linear Regression |