
Visualizing hyperparameter tuning results for arbitrary numbers of parameters #416

Closed · ablaom opened this issue Jan 3, 2020 · 11 comments


ablaom commented Jan 3, 2020

Suggestion of @baggepinnen, copied from #85:

I find it quite useful to visualize hyper-parameter tuning for more than two parameters as well. Simply plotting each parameter's sampled values against the objective function values is a reasonable way of presenting the information. You can't determine interactions between parameters from this, but you can see overall trends for individual parameters, and it quickly becomes apparent if one parameter is much more important than the others.
Example: https://github.com/baggepinnen/Hyperopt.jl/blob/master/figs/ho.svg
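
For concreteness, a minimal sketch of this kind of plot in Julia, assuming the tuning history is available as a vector of (params, loss) named tuples; the data layout and the plot_tuning_history function are hypothetical, not MLJ API:

using Plots

# One scatter panel per hyper-parameter: sampled values vs. measured loss.
function plot_tuning_history(history)
    pnames = keys(first(history).params)
    panels = [scatter([h.params[n] for h in history],
                      [h.loss for h in history];
                      xlabel=string(n), ylabel="loss", legend=false)
              for n in pnames]
    plot(panels...; layout=length(panels))
end

# Fake three-parameter tuning history with 50 evaluations:
history = [(params=(eta=rand(), depth=rand(1:10), lambda=10.0^(-3rand())),
            loss=rand()) for _ in 1:50]
plot_tuning_history(history)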


ablaom commented Jan 3, 2020

This sounds like a good suggestion to me.


azev77 commented Jan 6, 2020

@ablaom @tlienart
While we're discussing tuning, it would be awesome if MLJ borrowed the best features of the caret interface (and tried to avoid its mistakes).

  1. Each model in caret (it has 238 of them) has a default grid for each hyper-parameter. In some cases the default grid is a function of the number of features.
    Example: glmnet (= elastic net) has two hyper-parameters:
    alpha (default grid: 0.1, 0.55, 1)
    lambda (default grid of 3 values)

If you run:

m <- train(Species ~ ., method = "glmnet", data = trainSet)
m
plot(m)

you automatically get scores (Accuracy/Kappa) for all 9 combinations of alpha and lambda.
The plot shows 3 curves, one per value of lambda.

  2. Users in caret can, of course, supply their own grids.

  3. caret has a cool option called tuneLength, which lets users set the number of grid values tried for each hyper-parameter.

m <- train(Species ~ ., method = "glmnet", data = trainSet, tuneLength = 5)
m
plot(m)

This automatically generates a grid with 5 values of alpha and 5 values of lambda, giving a total grid of 25 points.

I'm not sure if this plot option from caret is what @baggepinnen had in mind, but I kinda like it.

I'd love to see an option like tuneLength in MLJ, but you could also include a smarter option.
tuneLength = 20 produces 20 values for each hyper-parameter, so for a model with H hyper-parameters the grid has 20^H points, which can be too big.

Perhaps you could also include something like
tuneLength2 = 20, which produces a grid with 20 points in total. (A sketch of both options follows below.)
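
For reference, here is a sketch of how both options might look with MLJ's grid tuning. The keyword names resolution (values per parameter) and goal (approximate total number of grid points) follow MLJTuning's Grid strategy, but treat the exact API as an assumption here:

using MLJ

Tree = @load DecisionTreeClassifier pkg=DecisionTree
tree = Tree()
r1 = range(tree, :max_depth, lower=1, upper=20)
r2 = range(tree, :min_samples_split, lower=2, upper=10)

# analogue of tuneLength: 20 values per parameter, hence 20^2 grid points
t1 = TunedModel(model=tree, ranges=[r1, r2], measure=cross_entropy,
                resampling=CV(nfolds=3), tuning=Grid(resolution=20))

# analogue of the proposed tuneLength2: roughly 20 grid points in total
t2 = TunedModel(model=tree, ranges=[r1, r2], measure=cross_entropy,
                resampling=CV(nfolds=3), tuning=Grid(goal=20))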


azev77 commented Jan 6, 2020

Here is the code:

library(caret); data("iris"); set.seed(123)

# 75/25 train/test split
my_index <- createDataPartition(iris$Sepal.Length, p = 0.75, list = FALSE)
trainSet <- iris[my_index, ]
testSet  <- iris[-my_index, ]

# default grids are defined here:
# https://github.com/topepo/caret/blob/master/models/files/glmnet.R
getModelInfo("glmnet")

# default tuning: alpha in {0.1, 0.55, 1}, 3 default values of lambda
set.seed(123)
m <- train(Species ~ ., method = "glmnet", data = trainSet)
m         # Accuracy/Kappa for each alpha/lambda combination
plot(m)

# tuneLength = 5: 5 values per hyper-parameter, 25 grid points in total
set.seed(123)
m <- train(Species ~ ., method = "glmnet", data = trainSet, tuneLength = 5)
m
plot(m)


ablaom commented Jan 9, 2020

@azev77 The key challenge for us would be setting up default grids for our existing models. Is it possible to scrape a list of default grids for each caret model? This could be quite useful for MLJ devs.

(Although, in the case of numeric parameters, I am proposing we specify default ranges (ParamRange objects), which (roughly) specify the search space without specifying the resolution. These are bounded intervals or, in the semi-bounded case, an upper/lower limit plus an "origin" and "unit". From these, either grids or pdfs can be constructed, depending on further parameters appropriate to the particular tuning strategy: random, Latin hypercube, Bayesian, and so forth.)
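
In range syntax, that proposal looks roughly like the following sketch; the semi-bounded form with origin and unit is the part being proposed, so treat the details as illustrative:

using MLJ

Ridge = @load RidgeRegressor pkg=MLJLinearModels
model = Ridge()

# Bounded: the search space is the interval [1e-4, 1.0], on a log scale.
r_bounded = range(model, :lambda, lower=1e-4, upper=1.0, scale=:log10)

# Semi-bounded: a lower limit only, plus an "origin" and "unit" from which
# a tuning strategy can later derive either a grid or a pdf.
r_semi = range(model, :lambda, lower=0, upper=Inf, origin=1.0, unit=1.0)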


azev77 commented Feb 4, 2020

@ablaom @tlienart
I forgot to mention: some other ML interfaces also have a time-limit option.
For example, suppose I want to train 45 regression models overnight; I can set timelimit = 60 minutes so that no more than 60 minutes is spent tuning hyper-parameters etc. for any single model.


ablaom commented Feb 4, 2020

@azev77 Thanks for that. Suggestion noted.

As tuning is iterative, its control is to be externalised: the plan is to implement any kind of control of any iterative model (including the TunedModel wrapper) through a common API. Other controls of this kind are stopping criteria and incremental serialisation of results.

See here: #139


azev77 commented Feb 4, 2020

Btw, just to be clear, I don't mean a time limit just for tuning; I mean a time limit for training in general.
Suppose I'm training 45 regression models overnight: I don't want the computer to spend more than timelimit = 60 minutes on any one of those models.
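
As far as I know there is no such option in MLJ today; purely to illustrate the idea, here is a hypothetical sketch that stops waiting for a fit after a wall-clock limit. The helper is not MLJ API, and the abandoned task keeps running in the background, so a real implementation would need cooperative cancellation inside fit!:

using MLJ

function fit_with_timeout!(mach; limit=60*60)
    task = @async fit!(mach)      # train asynchronously
    t0 = time()
    while !istaskdone(task) && time() - t0 < limit
        sleep(1)                  # poll once a second
    end
    istaskdone(task) || @warn "time limit of $(limit)s reached; abandoning fit"
    return mach
end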

baggepinnen commented

Facebook AI has a new take on hyper-parameter visualization:
https://ai.facebook.com/blog/hiplot-high-dimensional-interactive-plots-made-easy/


azev77 commented Jun 8, 2020

Hey guys, my conversation with @yalwan-iqvia about TreeParzen.jl got me thinking about HP optimization frameworks.
I wanted to tell you about Optuna (repo & paper), a new framework for HP optimization.

A nice comparison with Hyperopt shows what can be done for HP visualization:
https://neptune.ai/blog/optuna-vs-hyperopt

Here are a few snips: [two visualization screenshots from the blog post, not reproduced here]

A 3-minute clip: https://www.youtube.com/watch?v=-UeC4MR3PHM

It would be really amazing for MLJ to incorporate this!


tlienart commented Jun 8, 2020

Yeah, Optuna is cool and the team behind it is pretty solid.

This is a project in itself, though: building an Optuna.jl (with an interface to MLJ). Maybe it's worth announcing on Discourse to see if there are any takers.


ablaom commented Jun 9, 2020

@azev77 Could you please re-post this suggestion at MLJTuning.jl? Thanks
