
n3fit - Refactor of hyperopt #516

Merged: 41 commits from n3fit-refactor-hyperopt into master, Apr 1, 2020

Conversation

@scarlehoff (Member) commented Jul 23, 2019

This is basically a refactoring of the hyperopt code so that it includes more documentation and is easier to follow.

How to

The first step is to prepare a runcard with a hyperscan dictionary in it. As an example, see the one in the runcards folder of n3fit: Basic_runcard.yml.

In its current form the hyperoptimization is activated with the --hyperopt flag at run time. It accepts a number as an argument, which is the number of trials to run. So, once you have your runcard prepared, all you have to do is:

n3fit my_hyper_runcard.yml [replica_number] --hyperopt [number of trials]

Each trial consists of a different configuration of the parameters, following the scan defined in hyperscan.

Once the hyperopt scan has finished (it will take slightly less than time_of_a_single_fit * number_of_trials) you will find a file in the nnfit directory:

my_hyper_runcard/nnfit/replica_1/trials.json
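Since it is a plain json file, the trials can also be inspected by hand (a minimal sketch; the exact structure of each trial entry is not specified here):

import json

# assumption: the file contains the list of trials produced by the scan
with open("my_hyper_runcard/nnfit/replica_1/trials.json") as f:
    trials = json.load(f)
print(f"loaded {len(trials)} trials")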

With this you can use the n3Hyperplot script to generate the plots:

n3Hyperplot my_hyper_runcard

This will generate a plot per replica. If you want to join all replicas as if they were one single longer run, you can do

n3Hyperplot my_hyper_runcard --combine

If you want to play a bit with the results, for example removing anything that uses Adadelta as the optimizer and has fewer than 500 epochs:

n3Hyperplot my_hyper_runcard --combine --filter "optimizer=Adadelta" "epochs>500"

The filter option accepts the operators !=, =, < and >.
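For illustration, a hypothetical sketch of how such a filter string could be parsed (this is not the actual n3Hyperplot implementation; all names are made up):

import re

OPERATORS = {
    "!=": lambda a, b: a != b,
    "=": lambda a, b: a == b,
    "<": lambda a, b: a < b,
    ">": lambda a, b: a > b,
}

def passes_filter(trial, filter_string):
    # split "epochs>500" into key, operator and value
    key, op, value = re.match(r"(\w+)(!=|=|<|>)(.+)", filter_string).groups()
    trial_value = trial[key]
    try:
        # compare numerically when both sides parse as numbers
        trial_value, value = float(trial_value), float(value)
    except ValueError:
        pass  # fall back to string comparison
    return OPERATORS[op](trial_value, value)

print(passes_filter({"epochs": 900}, "epochs>500"))  # True
print(passes_filter({"optimizer": "Adadelta"}, "optimizer=Adadelta"))  # True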

Note: I generally do the hyperscan with the replicas turned off, but send several runs with different replica numbers; this way I get a few .json files with statistically different runs (all the seeding in n3fit depends on the replica number).

@scarlehoff (Member Author):

Right now the part regarding the generation of the trials should be (is) complete. The missing part is making the script that generates the plots and plays with the trials a bit better (right now it is a functional mess).

@wilsonmr in order to implement new things in hyperopt, this PR (at this point, plus changes regarding the code structure but not the functionality) should be the way to go.

As I say in one of the TODOs, eventually it might be good to migrate both this and the reading of the n3fit runcard to Report Engine, but until that happens I think this is how the hyperopt implementation will look for a while...

@wilsonmr (Contributor):

I really need to get out of the habit of writing such long comments

ok this is also relevant for #514

at the moment it seems to me that the fit function has several stages, which could possibly be separated into separate providers:

  • initialising seeds
  • loading and instancing of data <- relevant to the closure part
  • hyper optimizer
  • fit:
    1. run fit
    2. store fit

I think it would be good if things were split up a bit like this:

def initialised_seeds(replica, fitting):
    # uses fitting["trvlseed"], fitting["nnseed"], fitting["mcseed"], fitting["genrep"]
    ...  # -> seeds and genrep bool

def loaded_data(experiments, t0set):
    ...  # -> all_exp_infos

def loaded_positivity_data(posdatasets):
    ...  # -> pos_info

# might need to make this for a single replica
def performfit(
    initialised_seeds,
    loaded_data,
    loaded_positivity_data,
    replica,
    replica_path,
    fitting,  # uses "basis", "load", "loadfile", "parameters"
):
    ...  # -> result or results

# and this for a single replica
def storefit(performfit):
    ...  # -> stored fit or fits, and compute arc lengths etc.

# and then do this
storefits = collect(storefit, replicas)

def hyperopt_function(*args):
    ...

It makes it clear that the experiments key, which is parsed from the runcard, is used in just one small section of the code, so if I want to make some change, like adding the possibility to use pseudodata instead, it's way easier to know where to do this and see exactly what it impacts. In the case of closure data, I know that regardless of using real or closure data, loaded_data must return loaded data/pseudodata.

I was also wondering if fitting is a bit redundant and just there for historical reasons; perhaps these keys could just go in the top level of the runcard. I don't see any reason why the nnfit runcard should influence the n3fit runcard, but perhaps I'm missing something.

Right now this isn't a major concern, but the only issue with splitting this up would be working out what to do about the backend imports within fit: whether to duplicate the imports in the hyperopt/fit functions, or perhaps have another provider which somehow wrapped the imports within a class, schematically:

class BackendImports:
    def __init__(self, *args):
        # some logic based on args:
        #     import whatever
        #     self.ModelTrainer = whatever
        ...

def import_provider(backend):
    return BackendImports(backend)

def fit(import_provider, *args):
    ...

AFAICT though there really is no point in having these imports inside the function anyway, unless you have another backend in development? In practice I wonder if, at the level of fit.py, you could just put the imports at the top and leave a comment saying that if and when another backend appears this needs to be changed. Just a thought...
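For concreteness, a minimal sketch of that alternative; the import path is an assumption here, not the actual n3fit layout:

# imports at module level instead of inside fit()
from n3fit.backends import MetaModel  # NOTE: revisit if another backend ever appears

def fit(*args):
    # use MetaModel directly instead of importing it inside the function
    ...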

@scarlehoff (Member Author) commented Jul 24, 2019

def initialised_seeds(replica, fitting):
    # uses fitting["trvlseed"], fitting["nnseed"], fitting["mcseed"], fitting["genrep"]
    ...  # -> seeds and genrep bool

I agree with this, although I don't think genrep needs to exist here.

def loaded_data(experiments, t0set):
    ...  # -> all_exp_infos

def loaded_positivity_data(posdatasets):
    ...  # -> pos_info

Also agreed with this. The only thing to remember is that the mcseed and trvlseed need to enter this provider.

# and this for a single replica
def storefit(performfit):
    ...  # -> stored fit or fits, and compute arc lengths etc.

Well, I would say storefit should get the results from performfit and not return anything.
But in this part I would prefer to have a clear image in mind of how to do parallel replicas efficiently (and I still don't have it) before really saying anything.

def hyperopt_function(*args): ...

Here I disagree. hyperopt_function should be part of performfit, as a flag that will force the fit to run many times.

I was also wondering if fitting is a bit redundant and just there for historical reasons; perhaps these keys could just go in the top level of the runcard. I don't see any reason why the nnfit runcard should influence the n3fit runcard, but perhaps I'm missing something.

100% historical reasons.
Or, more than historical reasons, because we wanted the runcard we were using with n3fit to be able to run with nnfit as well. But there is no reason to keep it once we are past that "debugging" stage, to be fair.

Right now this isn't a major concern, but the only issue with splitting this up would be working out what to do about the backend imports within fit: whether to duplicate the imports in the hyperopt/fit functions, or perhaps have another provider which somehow wrapped the imports within a class, schematically:

The imports in fit.py are only two:

  • set_initial_state: only used for debugging. It can even go up to n3fit.py.
  • MetaModel: used to generate the PDF. This will only be used by storefit and it is not even necessary; performfit can return an already formed model.

Now, I would say all of this should be made into an issue, because I believe it is relevant and should be done (sooner rather than later even if it is not a top priority, since it would help with the modularization of the code), but it should go in a separate PR. All of this should be 100% independent of the hyperoptimization procedure.

@wilsonmr (Contributor):

Well, I would say storefit should get the results from performfit and not return anything.

yeah sorry, I didn't mean for it to return anything, I was just saying what the function was doing

def hyperopt_function(*args): ...

Here I disagree. hyperopt_function should be part of performfit, as a flag that will force the fit to run many times.

But the point is that in n3fit at the moment you add actions_: performfit to the runcard behind the scenes. Now if hyperopt and performfit were two different actions, as I am suggesting here, then the runcard would explicitly be either actions_: performfit or actions_: hyperopt, and so the flag in the runcard would instead just turn into which action to run. I think your own comment even supports this:

https://github.com/NNPDF/nnpdf/blob/e68c1bedd3285091a7462cf0fd92b2327ba32baa/n3fit/src/n3fit/fit.py#L224

Everything inside the if hyperopt basically defines what I'd put in the hyperopt function. If we split up the other things, then all that would be left in the fit function, if we kept the hyperopt in there, would be

def fit(*args):
    if hyperopt:
        ...  # do something which scans parameters
    else:
        ...  # produce a replica or set of replicas and save them

which I would say goes against the point of actions.

Now, all of this I would say it should be made into an issue because I believe it is relevant and should be done (and sooner rather than later even if it is not a top priority, but it would help with the modularization of the code) but it should go in a separate PR. All of this should be 100% independent of the hyperoptimization procedure.

Yeah, I will try to summarise it in an issue. The only thing I would say is that my view of how hyperoptimization should be a separate action could happen here.

@scarlehoff (Member Author) commented Jul 24, 2019

Everything inside the if hyperopt basically defines what I'd put in the hyperopt function. If we split up the other things, then all that would be left in the fit function, if we kept the hyperopt in there, would be

def fit(*args):
    if hyperopt:
        ...  # do something which scans parameters
    else:
        ...  # produce a replica or set of replicas and save them

which I would say goes against the point of actions.

You can have a performfit action which takes as input a ModelTrainer instance, and a hyperopt action that also takes as input a ModelTrainer instance. At that point the difference between performfit and hyperopt is simply whether storefit is called at the end or not.

If you want to break hyperopt out of performfit, it should be done before, i.e., hyperopt provides a parameters dictionary, ModelTrainer is provided by somebody else, and then performfit is called by hyperopt's fitting function, receiving ModelTrainer and the parameters as input.*
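Schematically, and with every name hypothetical (a toy stand-in, not the real n3fit code), that split could look like:

def performfit(model_trainer, parameters):
    # stand-in for the real fit: return a loss-like number
    return sum(v for v in parameters.values() if isinstance(v, (int, float)))

def hyperopt_fit(model_trainer, hyperscan, n_trials=3):
    # hyperopt owns the scan and calls performfit once per sampled parameter set
    best = None
    for trial in range(n_trials):
        parameters = {key: choices[trial % len(choices)]
                      for key, choices in hyperscan.items()}
        loss = performfit(model_trainer, parameters)
        if best is None or loss < best[0]:
            best = (loss, parameters)
    return best

print(hyperopt_fit(model_trainer=None, hyperscan={"epochs": [100, 500, 900]}))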

Yeah I will try to summarise in an issue - the only thing I would say is my view of how hyperoptimization should be a separate action could happen here

I would rather have a second PR for "make hyperoptimization into an action", because there are several things that should be carefully thought about in order not to lose generality, and because that should not affect the functionality.

*Edit: it is more complicated than this if you want to avoid rerunning things as much as possible. As with the parallel replicas, I would need to actually sit down for a while in order to have a clear idea of what would make me happy.

@wilsonmr (Contributor) commented Aug 1, 2019

Hmm, I think it wouldn't be so difficult to add a validphys argcheck that makes sure that if hyperopt is not None then hyperscan isn't either.

Approximately (the error raised here looks weird when I tested it, but at least it stops pretty early with a nicer error message, rather than getting all the way to fit.py line 193 and then hyper_scan.py L126 and finding that it has a NoneType):

from reportengine.checks import make_argcheck, CheckError

@make_argcheck
def check_consistent_hyperscan_options(hyperopt, hyperscan):
    if hyperopt is not None and hyperscan is None:
        raise CheckError("hyperscan needs to be defined if performing hyperopt")

@check_consistent_hyperscan_options
def fit(*args):
    ...

It just means that if the runcard doesn't have hyperscan and it should, then it'll raise an error much sooner. Quality of life.

@wilsonmr (Contributor) commented Aug 1, 2019

Now, having said that, I think you could add to the ConfigParser of n3fit something like:

def parse_hyperscan(self, hyperscan_dict: (dict, type(None)), *, hyperopt=None):
    if hyperopt is None:
        return None
    if hyperscan_dict is None:
        raise ConfigError(...)
    # here check for compulsory entries in the dictionary (are there any?)
    return hyperscan_dict

I think that hyper_scan.py L126-140 could basically go in a parser, and if it did you'd catch issues earlier, including trivial things like the dictionary not existing, and let the user know it's an issue with their runcard.

@wilsonmr (Contributor) commented Aug 1, 2019

Should we add seaborn and hyperopt to the conda package meta with this PR, so it can be used 'out of the box'?

dictionaries = filter(filter_function, dictionaries)

# Now fill a pandas dataframe with the survivors
dataframe_raw = pd.DataFrame(dictionaries)
Contributor:

For the version of pandas I tested this on, dictionaries cannot be an iterator but needs to be explicitly cast:
pd.DataFrame(list(dictionaries))

Member Author:

But the output of filter should always be an iterator, which is an iterable... which version of pandas were you using?
Maybe the error was coming from somewhere else and that broke the filter?

Contributor:

I am getting something like:


In [13]: pd.DataFrame(iter(l))                                                                                                                                                                                                                
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-13-b81493853882> in <module>
----> 1 pd.DataFrame(iter(l))

~/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    405                 mgr = self._init_dict({}, index, columns, dtype=dtype)
    406         elif isinstance(data, collections.Iterator):
--> 407             raise TypeError("data argument can't be an iterator")
    408         else:
    409             try:

TypeError: data argument can't be an iterator

In [14]: pd.__version__                                                                                                                                                                                                                       
Out[14]: '0.23.4'

Member Author:

Interesting. They fixed that for 0.24

pandas-dev/pandas#21987

But it doesn't hurt to cast the filter to a list. I just wanted to make sure the error was not coming from something else.
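A minimal reproduction of the workaround (the toy trial dictionaries are made up):

import pandas as pd

# pandas < 0.24 rejects iterators, so the filter output is cast to a list
# first; this is harmless on newer versions too
trials = [{"optimizer": "Adam", "epochs": 900},
          {"optimizer": "Adadelta", "epochs": 400}]
survivors = filter(lambda d: d["epochs"] > 500, trials)
dataframe_raw = pd.DataFrame(list(survivors))
print(dataframe_raw)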

Contributor:

Pandas is in my bad books for breaking its interface way too often with things like this. But yeah, I see that it works again in 0.25.

@scarlehoff (Member Author):

Should we add seaborn and hyperopt to the conda package meta with this PR, so it can be used 'out of the box'?

It was already added in the previous PR. They should both be installed with the nnpdf package.

Hmm, I think it wouldn't be so difficult to add a validphys argcheck that makes sure that if hyperopt is not None then hyperscan isn't either.

Approximately (the error raised here looks weird when I tested it, but at least it stops pretty early with a nicer error message, rather than getting all the way to fit.py line 193 and then hyper_scan.py L126 and finding that it has a NoneType):

from reportengine.checks import make_argcheck, CheckError

@make_argcheck
def check_consistent_hyperscan_options(hyperopt, hyperscan):
    if hyperopt is not None and hyperscan is None:
        raise CheckError("hyperscan needs to be defined if performing hyperopt")

@check_consistent_hyperscan_options
def fit(*args):
    ...

I've added this error. I think it looks ok.


# Now filter out the ones we don't want
for filter_function in filter_functions:
    dictionaries = filter(filter_function, dictionaries)
Contributor:

This maybe works, but it is potentially a headache (I had to run a couple of tests to make sure I understood what it's actually doing). Note that filter is a lazy iterator that doesn't consume its argument, so it = filter(func, it) saves a reference to the old iterable inside the return value of filter and reassigns it to the filtered iterator. This is a bit inefficient in that you have to save all the intermediate iterators inside the recursive filter, but more importantly it is also more subtle than it looks, in that if you change something so that next(dictionaries) gets called it will cause hard-to-understand bugs. Instead, you should do something like [item for item in dictionaries if all(f(item) for f in filter_functions)] (filter with the corresponding lambda would be fine, but I bet it is slower).

Member Author:

But it looked so nice :(

But I see the problem.

@scarlehoff (Member Author) commented Aug 12, 2019

Does casting it to a list (per iteration) solve the problem, i.e., like it is in the latest version?
Or would you rather have something along the lines of:

for filter_function in filter_functions:
    new_dicts = []
    for i in dictionaries:
        if filter_function(i):
            new_dicts.append(i)
    dictionaries = new_dicts

?

(or the line you wrote; I am a bit dense today, I think I lost the ability to read past the 6th line of any comment)

Contributor:

I like the line I wrote better, because it makes it very clear that it requires all filters. Instead, I was going to write that what you did is wrong, because it looked like any at first sight (but then I realized you reassign dictionaries). It is a lot of intermediate lists again, though.
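For reference, the suggested single pass in runnable form (toy data, hypothetical filters):

# keep a trial only if it satisfies every filter
dictionaries = [{"epochs": 900, "optimizer": "Adam"},
                {"epochs": 400, "optimizer": "Adadelta"}]
filter_functions = [lambda d: d["epochs"] > 500,
                    lambda d: d["optimizer"] != "Adadelta"]
dictionaries = [item for item in dictionaries
                if all(f(item) for f in filter_functions)]
print(dictionaries)  # only the Adam trial survives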

@scarlehoff (Member Author) commented Aug 16, 2019

These last few commits (adding the --autofilter option) deal with the last* of the to-dos I wrote for this PR, motivated by #527. This trimming algorithm is mostly empirical (I wanted something that would automatically do the same job I would do by looking at the hyperopt plot and selecting which options to filter). I called it --autofilter.

I'm happy with it in the sense that it does what I want. I am unhappy in the sense that the parameters are set 100% manually (that's why I was planning not to port it to this repository, but it was actually useful when choosing the best models, so it can stay...)

*last in numerical order

@Zaharid (Contributor) commented Aug 17, 2019

Overall I think that this indeed needs to be a set of validphys actions (so it's good that we all agree). This is currently inventing an incompatible pipeline that would preclude us from doing fancy things (e.g. plots that correlate the hyperparameters with various estimators computed by vp, such as the closure test ones), or even standard things like getting these plots into the comparefits reports. That said, this code looks to be in a form such that it shouldn't be too complicated to do just that.

Then we could have a small wrapper like comparefits or setupfit that just calls validphys under the hood with various command line options.

@Zaharid (Contributor) commented Aug 17, 2019

Also, a lot of the processing code feels very similar to fitdata.py and looks like it could go there.

"loss": "loss",
}

NODES = (keywords["nodes"], 4)
# 0 = normal scatter plot, 1 = violin, 2 = log
Contributor:

Could this be 'normal', 'scatter', 'violin' without the extra indirection? Or even an enum.Enum?
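A minimal sketch of the enum.Enum alternative (the class and member names are made up):

from enum import Enum

class PlotMode(Enum):
    NORMAL = 0  # normal scatter plot
    VIOLIN = 1
    LOG = 2

mode = PlotMode.VIOLIN
print(mode, mode.value)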


operator = regex_op.findall(filter_string)[0]

if operator == "=":
Contributor:

Do we really require this? Can we not have only ==?

Member Author:

I don't know; to me it is clear that when the user writes "optimizer=adam" they want to check for equality, so there is no need to make their life harder.

raise NotImplementedError("Filter string not valid, operator not recognized {0}".format(filter_string))

# This I know it is not ok:
if isinstance(val_check, str) and isinstance(filter_val, str):
Contributor:

Maybe we could use sympy.parse_expr for these things? @siranipour has been meaning to use it for the filters for a while...

Member Author:

I looked at sympy's sympify but I saw it was using eval under the hood anyway, so I didn't look further.
I also saw pyparsing as an option; having our own expression-parser module might be useful.
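For what it's worth, a minimal sketch of what the sympy.parse_expr route could look like (with the caveat, noted above, that it relies on eval internally):

from sympy import parse_expr

# parse a filter-like expression and evaluate it against a trial value
expr = parse_expr("epochs > 500")
print(bool(expr.subs({"epochs": 900})))  # True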

"""
Receives a data_dict (a parsed trial) and a filter string, returns True if the trial passes the filter

filter string must have the format: key[]string
Contributor:

I find this would be clearer if it said:

filter string must have the format: key <operator> string

and also the regex knew how to discard whitespace (i.e. adding \s* to the patterns).

Member Author:

A space means an extra argument on the command line, and these expressions are passed from the command line.

@@ -0,0 +1,332 @@
"""
This module contains functions dedicated to process the json dictionaries
Contributor:

This kind of algorithmic code could probably use a few tests. Of course, we might minimize the amount of it by using pandas wherever possible.

"""
# If we don't have enough keys to produce n combinations, return empty
if len(key_info) < ncomb:
return []
Contributor:

Should this be an error instead?

Member Author:

No (or not in its current form), since we might want to loop over the number of combinations. If we only have 2 keys but ask for more than 2 to combine, it simply does nothing, and that's fine.

Eventually, when there is a robust form of this algorithm, we might want to be restrictive about what it does, of course.
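A small illustration of why the empty return is harmless (using itertools directly; the key names are made up):

from itertools import combinations

# asking for 3-key combinations out of only 2 keys simply yields nothing
key_info = ["optimizer", "epochs"]
print(list(combinations(key_info, 3)))  # -> []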

@Zaharid (Contributor) commented Aug 17, 2019 via email

@Zaharid (Contributor) commented Aug 17, 2019 via email

@scarlehoff (Member Author):

Overall I think that this indeed needs to be a set of validphys actions (so it's good that we all agree). This is currently inventing an incompatible pipeline that would preclude us from doing fancy things (e.g. plots that correlate the hyperparameters with various estimators computed by vp, such as the closure test ones), or even standard things like getting these plots into the comparefits reports. That said, this code looks to be in a form such that it shouldn't be too complicated to do just that.

Then we could have a small wrapper like comparefits or setupfit that just calls validphys under the hood with various command line options.

Yeah, we can have a go at this in Milan in two weeks.

Not sympify, parse_expr. That does what it should.

That also uses eval. But I have nothing against sympy, I just didn't spend more time with it because my goal was just to see whether I could avoid eval. If we go the sympy route we can then change whatever needs to be changed.

@Zaharid (Contributor) commented Aug 17, 2019 via email

@scarlehoff (Member Author):

Ok, so there are two things missing from this PR that I believe should be done before merging:

  • Justify the algorithm a bit better, maybe with a full discussion in Varenna about it
  • Instead of just plotting things, create a validphys hyperopt-report kind of thing

For the second point I would welcome a bit of help, maybe at the code meeting here in Milano in two weeks, because I have very little experience with plotting things with validphys, and I am sure building a few plots is just a five-minute job for some of you.

Besides that, the hyperoptimization should be a different action as per #519. But I think that would be outside the scope of this PR.

@Zaharid (Contributor) commented Aug 17, 2019

I had a look at the hyperopt library itself and I have to say I am not a huge fan. Not of the documentation, not of the way the code looks, not of the deeply nested dictionaries, not of the dependency on mongodb, not of the number of open issues on github. In the end it seems to me that we are getting a fancy for loop and a bunch of things we have to work around or that we could do better ourselves.

With that in mind, maybe it would make sense to try to isolate it as much as possible in one module, say hyperoptio, and make it easy to switch to a different library, e.g. the keras thing when it becomes a bit more tested.

@ldd69 (Contributor) commented Aug 18, 2019

Do we have a document with the details of the hyper-opt algorithm that is currently implemented in the code?

@scarlehoff (Member Author) commented Aug 18, 2019

Do we have a document with the details of the hyper-opt algorithm that is currently implemented in the code?

@ldd69
The hyperoptimization algorithm implemented in hyperopt is described here: https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization

edit: to a first approximation it is just a random search over thousands of combinations.

Then there is the code for removing the unstable configurations. This code is WIP (I called it hyper_algorithm.py), and even what an "unstable configuration" is needs discussion before it becomes a true algorithm. As I said, right now it basically does what I do by eye, only automatically.

With that in mind, maybe it would make sense to try to isolate it as much as possible in one module, say hyperoptio, and make it easy to switch to a different library, e.g. the keras thing when it becomes a bit more tested.

@Zaharid I agree with your feelings about hyperopt. This is the reasoning behind hyper_scan.py: most of the code is not hyperopt-dependent, and just changing a few lines would render it compatible with other libraries.
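For reference, a minimal, self-contained example of the hyperopt loop that hyper_scan.py wraps (a toy loss, not the n3fit one):

from hyperopt import fmin, hp, tpe

space = {"learning_rate": hp.loguniform("learning_rate", -10, 0)}
best = fmin(
    fn=lambda params: params["learning_rate"] ** 2,  # toy loss to minimize
    space=space,
    algo=tpe.suggest,  # the TPE algorithm from the paper linked above
    max_evals=50,
)
print(best)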

@ldd69 (Contributor) commented Aug 18, 2019

Excellent, thank you! I'll have a look. I'm trying to summarise my thinking in a set of notes.

@scarlehoff (Member Author):

Excellent, thank you! I'll have a look. I'm trying to summarise my thinking in a set of notes.

I'll try to summarize how the code does the whole hyperoptimization thing (from a more pragmatic point of view) in my presentation for Varenna. I'll link it here as well once it's done.

@scarlehoff force-pushed the n3fit-refactor-hyperopt branch from a448fd2 to dc851c5 on April 1, 2020
@scarlehoff merged commit b8aabcd into master on April 1, 2020
@scarrazza deleted the n3fit-refactor-hyperopt branch on April 22, 2020
Labels: n3fit (Issues and PRs related to n3fit)