An all-too-often ignored side of optimization is the initialization; there is a lot of research out there suggesting that for both convex (and even more so for non-convex) optimization problems, a large amount of work can be saved by initializing algorithms at clever starting values.
Currently we are initializing all algorithms with the 0 vector. Once the API (#11) is sorted out, we should have multiple options for how to initialize, including (but not limited to):
random Gaussian initializations
running some other, faster algorithm at a loose tolerance (e.g., initializing Newton's method with the output of a gradient-descent run stopped at a loose tolerance)
outputs of previous runs (will be built into a refit method, to be raised in a future issue)
more interesting but academically well-grounded ideas
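As a concrete illustration of the second option above, here is a minimal sketch of warm-starting Newton's method with a cheap gradient-descent pre-solve on a small regularized logistic-regression problem. All function names, data, and parameters here are hypothetical, not part of any existing API:

```python
import numpy as np

def grad_descent(grad, x0, lr=0.1, tol=1e-2, max_iter=1000):
    # Cheap pre-solver: stop early, at a loose gradient-norm tolerance.
    x = x0.copy()
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x -= lr * g
    return x

def newton(grad, hess, x0, tol=1e-10, max_iter=100):
    # Expensive solver: each step solves a linear system with the Hessian.
    x = x0.copy()
    iters = 0
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x -= np.linalg.solve(hess(x), g)
        iters += 1
    return x, iters

# Synthetic, non-separable binary-classification data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
w_star = np.array([1.0, -2.0])
y = (X @ w_star + rng.normal(scale=1.0, size=100) > 0).astype(float)
lam = 0.1  # ridge penalty keeps the Hessian positive definite

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad(w):
    p = sigmoid(X @ w)
    return X.T @ (p - y) / len(y) + lam * w

def hess(w):
    p = sigmoid(X @ w)
    return (X * (p * (1 - p))[:, None]).T @ X / len(y) + lam * np.eye(2)

# Cold start from the 0 vector vs. warm start from the pre-solve.
cold, cold_iters = newton(grad, hess, np.zeros(2))
warm, warm_iters = newton(grad, hess, grad_descent(grad, np.zeros(2)))
```

Both runs reach the same minimizer; the warm-started Newton run should need no more (and typically fewer) of the expensive Hessian solves.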
As we discussed earlier, I think this is a really cool idea, and I'm glad to be part of the discussion.
As a novice to this (and for the purposes of furthering a discussion), do you know any good surveys of what the academically well-grounded things look like and/or some higher-level discussions of the benefits of Smart Initialization™?
you can exploit the close connection between W-OLS and Logistic Regression to infer things about variable addition / dropping, which is related to multiple refits (see the Logistic Regression chapter in Elements of Statistical Learning)
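For context on that connection: Newton's method applied to the logistic-regression likelihood is exactly iteratively reweighted least squares (IRLS), i.e., each step solves a weighted OLS problem with weights p(1 - p). A minimal sketch (the data and names here are illustrative only):

```python
import numpy as np

def irls_logistic(X, y, n_iter=25):
    # Each Newton step for logistic loss is a weighted least-squares solve:
    #   (X^T W X) w = X^T W z,  with weights W_ii = p_i (1 - p_i)
    # and "working response" z = X w + (y - p) / W.
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        W = p * (1 - p) + 1e-12          # per-row weights (floored for stability)
        z = X @ w + (y - p) / W          # working response
        Xw = X * W[:, None]
        w = np.linalg.solve(Xw.T @ X, Xw.T @ z)
    return w

# Synthetic data drawn from a true logistic model.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
w_true = np.array([0.5, 1.5])
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)
w_hat = irls_logistic(X, y)
```

Because each iteration is just a weighted OLS fit, intuitions about adding or dropping variables in OLS carry over, which is what makes warm-started refits after a variable change attractive.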
Ultimately, I think the biggest bang will come from smart initializations when refitting a model, but I'd like to include at least a little thought on initializations from scratch as well.
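To make the refit case concrete: when solving the same problem across a sequence of regularization strengths, starting each solve from the previous solution typically cuts the iteration count substantially. A minimal sketch with ridge regression and plain gradient descent (hypothetical names, not any existing API):

```python
import numpy as np

def fit_ridge_gd(X, y, lam, w0, lr=0.1, tol=1e-8, max_iter=100_000):
    # Gradient descent on the ridge objective; returns (solution, iteration count).
    w = w0.copy()
    for it in range(max_iter):
        g = X.T @ (X @ w - y) / len(y) + lam * w
        if np.linalg.norm(g) < tol:
            return w, it
        w -= lr * g
    return w, max_iter

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=100)

# First fit at lambda = 1.0, cold-started from zero.
w1, _ = fit_ridge_gd(X, y, lam=1.0, w0=np.zeros(5))
# Refit at lambda = 0.5: cold start vs. warm start from the previous solution.
w2_cold, cold_iters = fit_ridge_gd(X, y, lam=0.5, w0=np.zeros(5))
w2_warm, warm_iters = fit_ridge_gd(X, y, lam=0.5, w0=w1)
```

Both refits reach the same solution, but the warm start begins much closer to it and so converges in fewer gradient steps; this is the pattern a refit method could build in.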
cc: @mpancia