
n3fit - fit in fit.py should be split into separate functions #519

Closed
wilsonmr opened this issue Jul 24, 2019 · 7 comments
Labels: n3fit (Issues and PRs related to n3fit)
@wilsonmr (Contributor)

As I said in #516 (comment), I think fit should actually be several functions:

I think we can all agree on some of these functions:

  • initialising seeds
  • loading data
  • loading positivity data
  • fitting

I personally also think that hyperoptimization and fitting should be two separate functions, given that they produce quite different things; the n3fit runcards could then call whichever of the two actions was necessary, since they seem like mutually exclusive tasks. But there is probably more discussion to be had on this point.

If all of these functions are in the same file, which is an n3fit module (as far as the App is concerned), then they will benefit from the reportengine wizardry of taking other provider functions as function arguments, which get automatically handled by the resource builder.
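A minimal sketch of what that wiring looks like, assuming hypothetical provider names (fit_seeds, posdatasets and the runcard sections below are illustrative, not the actual n3fit API):

# Providers are plain functions; reportengine's resource builder inspects
# each signature and, for any argument named after another provider,
# builds that provider first and passes its result in automatically.

def fit_seeds(fitting):
    """Hypothetical provider: derive the seeds from the 'fitting' runcard section."""
    return fitting.get("seed", 1)

def posdatasets(positivity):
    """Hypothetical provider: load the positivity datasets declared in the runcard."""
    return list(positivity)

def performfit(fit_seeds, posdatasets):
    """Action: receives the results of the two providers above by argument
    name, with no explicit wiring code in this module."""
    ...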

@wilsonmr (Contributor, Author)

Copying a comment here from #516

@scarlehoff:

everything inside the if hyperopt branch basically defines what I'd put in the hyperopt function. If we split out the other things, then all that would be left in the fit function, if we kept the hyperopt in there, would be:
def fit(...):
    if hyperopt:
        ...  # do something which scans parameters
    else:
        ...  # produce a replica or set of replicas and save them

which I would say goes against the point of actions.

You can have a performfit action which takes a ModelTrainer instance as input, and a hyperopt function which also takes a ModelTrainer instance as input. At that point the difference between performfit and hyperopt is simply whether storefit is called at the end or not.

If you want to break hyperopt out of performfit, it should be done before, i.e., hyperopt provides a parameters dictionary, ModelTrainer is provided by somebody else, and then performfit is called by hyperopt's fitting function, receiving the ModelTrainer and the parameters as input.*

Yeah, I will try to summarise this in an issue; the only thing I would say is that my view of how hyperoptimization should become a separate action could happen here.

I would rather have a second PR with "make hyperoptimization into an action", because there are several things that should be thought about carefully in order not to lose generality, and because that change should not affect the functionality.

*Edit: it is more complicated than this if you want to avoid rerunning things as much as possible. As with the parallel replicas, I would need to actually sit down for a while in order to have a clear idea of what would make me happy.

I guess I see what you mean. Just by looking at the code, especially at where the break currently sits, it seems like the main point of hyperopt would be to return params, whereas performfit should take params as an input. Either way, this change probably falls into the class of a rather big restructure of the code and should be separate from #516, as you say.

@scarlehoff (Member)

Yes, I am not happy about the break either :__

it seems like the main point of hyperopt would be to return params

I should then modify that part. The only relevant thing should be the json file with the trials. The params at the end are just there to print out whatever best model hyperopt found, but in general the user and hyperopt will have different opinions.

@wilsonmr (Contributor, Author)

Sure, but whether the relevant thing is the best result or the trials, I don't think it's a PDF replica or replicas.

Maybe then we could have a provider which instantiates the ModelTrainer(s), which I didn't list before; this would then be taken as input by the hyperopt provider, which writes a json file, or by the provider which produces replicas?

@scarlehoff (Member)

No, but the output of performfit is also not a PDF; it is just the fact of having trained. And the training must be the same for hyperopt.

Yes, the way to go would be to have a provider for ModelTrainer (let it be singular for now) and another provider for the parameters.
The hyperopt provider can sit in the middle, catch the output parameters and transform them.

Then performfit can maybe get a flag and call hyperopt.fmin() only in some circumstances? (I don't know, and this is the part I don't have a clear idea about.)

Then at the end the storefit part will also take the ModelTrainer and save the PDF. In the case of hyper-optimization we won't have any storefit because the trials.json file has already been written by hyperopt.
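A sketch of that chain, with hypothetical signatures (only ModelTrainer, performfit, storefit and trials.json come from the discussion itself):

from n3fit.model_trainer import ModelTrainer  # import path assumed

def model_trainer(data, fit_seeds):
    """Hypothetical provider: build the (for now singular) ModelTrainer."""
    return ModelTrainer(data, fit_seeds)

def parameters(fitting):
    """Hypothetical provider: the parameters dictionary from the runcard."""
    return dict(fitting["parameters"])

def hyperopt_parameters(model_trainer, parameters):
    """Hypothetical provider sitting in the middle: scan the space,
    write trials.json, and return the transformed parameters."""
    ...

def performfit(model_trainer, parameters, hyperopt=False):
    """Action: train; the flag would decide whether the hyperopt scan
    drives the training loop."""
    ...

def storefit(model_trainer, output_path):
    """Action: save the PDF; skipped for hyper-optimization, where
    trials.json has already been written."""
    ...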

@wilsonmr (Contributor, Author) commented Jul 24, 2019

No, but the output of performfit is also not a PDF; it is just the fact of having trained. And the training must be the same for hyperopt.

Ah, but then perhaps performfit shouldn't really be a provider. The training part is just ModelTrainer.hyperparametizable, right? So really the providers (storefit and hyperoptimizer) would both take a (ModelTrainer instance, params) pair and either call hyperparametizable directly or loop over it with hyperopt.fmin(). Or am I still not understanding something?
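A sketch of that alternative, assuming hyperparametizable returns a dict with a 'loss' entry; the search space and the save_replicas helper below are purely illustrative:

from hyperopt import fmin, hp, tpe  # hyperopt's minimizer is fmin (lowercase)

def storefit(model_trainer, params):
    """Action: a single training pass, then save the resulting replica(s)."""
    result = model_trainer.hyperparametizable(params)
    save_replicas(result)  # hypothetical helper

def hyperoptimizer(model_trainer, params, max_evals=100):
    """Action: loop the same training routine through hyperopt's fmin,
    producing trials rather than a saved PDF."""
    space = {**params, "learning_rate": hp.loguniform("learning_rate", -10, -5)}
    return fmin(
        fn=lambda p: model_trainer.hyperparametizable(p)["loss"],
        space=space,
        algo=tpe.suggest,
        max_evals=max_evals,
    )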

@Zaharid (Contributor) commented Mar 26, 2021

Is this still relevant?

@wilsonmr (Contributor, Author)

no
