
n3fit - Rationalized stopping-validation module #512

Merged
merged 25 commits into master from n3fit-refactor-stopping on Oct 10, 2019

Conversation

scarlehoff
Member

@scarlehoff scarlehoff commented Jul 15, 2019

As promised, this PR basically breaks down the huge mess that was Stat_info into several classes for stopping, validation and positivity, making things a bit (don't want to oversell it) clearer.

There are a few things to do yet, when I finish them I'll ask for reviews.

  • Documentation
  • Change the way the validation is checked at the end of every epoch

The point of this PR:

The main point of these changes was to make the previously-named Statistics.py (now called stopping.py) readable,
so anything that still looks convoluted/unreadable after these commits should be changed.

Example code for animation.py: animation.tar.gz

@scarlehoff
Member Author

scarlehoff commented Jul 19, 2019

This is basically finished.
The only thing I am not 100% happy about is the animation.py code inside io/, which has two big problems and an issue.

Problems:

  • I am really bad at making images look nice, so the output is terrible.
  • It is a bit of a mess, related to the previous point, my lack of experience with doing animations, and the fact that it is basically a refactoring of a student's code.

Issue:

  • The fact that it is a standalone script with many calls to matplotlib. Maybe it makes more sense to rewrite it as a vp action (but I would rather remove the animation.py file from this PR and leave that for a future relaxing side project, since I think having a nice moving picture is probably very low priority)

Note: I'll rebase master as soon as n3fit-ProductionVersion is merged.

@scarlehoff scarlehoff changed the base branch from n3fit-ProductionVersion to master July 19, 2019 15:23
@scarlehoff scarlehoff changed the base branch from master to n3fit-ProductionVersion July 19, 2019 15:26
@scarlehoff scarlehoff changed the base branch from n3fit-ProductionVersion to master July 19, 2019 15:26
@scarlehoff scarlehoff changed the title [WIP n3fit] Rationalized stopping-validation module [WIP] n3fit - Rationalized stopping-validation module Jul 25, 2019
@scarlehoff scarlehoff changed the title [WIP] n3fit - Rationalized stopping-validation module n3fit - Rationalized stopping-validation module Jul 29, 2019
@scarlehoff
Member Author

@Zaharid have a look at this PR before #516 please, as this is a PR I actually want to merge.

#516 I also want to merge, but it still needs a lot of discussion!

@scarlehoff scarlehoff added the n3fit Issues and PRs related to n3fit label Sep 2, 2019
@Zaharid
Contributor

Zaharid commented Sep 18, 2019 via email

@scarlehoff
Member Author

scarlehoff commented Sep 19, 2019

As said elsewhere, I would design this differently [...]
Instead I would write all relevant information in the history object.

I apologize: it is so obvious this was the right way to do it that I feel bad for having just hacked in the history object thing. Although, for some reason, I was very happy with the code the first time.

Now, even if it is the right way, how to do it might be open to opinion. I have an idea of how to restructure the code in order to make both of us happy, but first I want to clarify several points so that nobody has to re-review the code several times.

Similarly I'd model thing like Positivity as pure functions that operate on the history object.

What do you mean exactly by this? check_positivity is basically a pure function now.
Positivity as a class makes sense because we might want to have different thresholds (or even different conditions) and we might actually change the conditions on the fly. The flexibility is there and should not go. Maybe I misunderstood the idea.
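The class-vs-pure-function point could look like this; a minimal sketch under assumed names (the threshold value and the list-of-losses interface are illustrative, not the actual n3fit API):

```python
class Positivity:
    """Sketch (hypothetical interface): configurable positivity check.

    Keeping this a class lets the threshold (or even the condition itself)
    be changed on the fly, while check_positivity remains a pure function
    of the losses it is given.
    """

    def __init__(self, threshold=1e-6):
        self.threshold = threshold

    def check_positivity(self, positivity_losses):
        # Pure function of its input: nothing is mutated here
        return all(loss < self.threshold for loss in positivity_losses)


pos = Positivity(threshold=1e-4)
assert pos.check_positivity([1e-5, 1e-6])
pos.threshold = 1e-7  # condition tightened on the fly
assert not pos.check_positivity([1e-5, 1e-6])
```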

There is nothing preventing us from having datasets starting with pos or total. I think it would be better to have some kind of structure that separates these things explicitly rather than interpreting names.

This will have to wait until there is a python object I can pass down that knows whether it is a positivity set or not. As far as I can see I cannot do that in a clean manner (let me know if I am missing some is_positivity function), so I don't know how to solve this problem cleanly without adding things way outside the scope of this PR.

Then saving a snapshot is spelled pickle.dump(fitstate, "file.pkl") and this method would not mutate the history but yield frames instead.

Saving snapshots will need to be outside of the scope of this PR because it requires quite a few changes at the backend level. This is something I have on mind because currently the save/load mechanism is crap.

There is FitState which contains all the things that change iteration by iteration

Only we cannot (and do not want to) save everything iteration by iteration. This is more a clarification than a comment.
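As a side note on the snapshot spelling quoted above: pickle.dump takes an open file object rather than a filename. A minimal sketch of the idea, with FitState as a hypothetical stand-in dataclass rather than the real n3fit class:

```python
import os
import pickle
import tempfile
from dataclasses import dataclass, field


# Hypothetical stand-in for the real FitState: just per-iteration quantities
@dataclass
class FitState:
    epoch: int = 0
    tr_chi2: float = 0.0
    vl_chi2: float = 0.0
    weights: list = field(default_factory=list)


state = FitState(epoch=100, tr_chi2=1.2, vl_chi2=1.3, weights=[0.1, 0.2])

# pickle.dump wants an open binary file object, not a filename string
path = os.path.join(tempfile.gettempdir(), "snapshot.pkl")
with open(path, "wb") as f:
    pickle.dump(state, f)
with open(path, "rb") as f:
    restored = pickle.load(f)

assert restored == state  # dataclass equality compares all fields
```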

@Zaharid
Contributor

Zaharid commented Sep 19, 2019 via email

@scarlehoff
Member Author

I agree with everything

Not sure I understand. Positivity is a different thing in vp already

The example of is_positivity was a very bad one. I really don't want isinstance(thing, PositivitySpec) because I don't feel it is an improvement.
In the following sense:
if we want only the loss of the experiments to go in the chi2 as per #543, then the logical thing is to select those objects which are of type ExperimentSpec, but the problem is that if we suddenly start fitting other objects (like datasets or other types of experiments) the code has the potential to fail.
Similarly, if we check for the types we don't want (i.e., PositivitySpec), then if at some point we add something with a loss (say, a RegularizationSpec) the code will fail for sure.*

Instead, the cleanest way to do it (in my mind) is to have all specs carry a count_me_in_the_total_chi2 attribute which can default to False and only be true for experiments and datasets (for instance).

*You might say "this is no worse than what we currently have", but I don't see the point of changing it to something that is obviously flawed as well, even if the flaw is somewhat smaller.
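The flag-based alternative sketched above could look like this (the class bodies here are minimal illustrative stand-ins, not the actual validphys spec classes):

```python
# Sketch of the flag-based approach: every spec inherits a safe default,
# so adding e.g. a new spec type with a loss later cannot silently break
# the chi2 selection the way an isinstance() whitelist or blacklist could.
class Spec:
    count_me_in_the_total_chi2 = False  # safe default for any new spec type


class ExperimentSpec(Spec):
    count_me_in_the_total_chi2 = True


class PositivitySpec(Spec):
    pass  # inherits False


class RegularizationSpec(Spec):
    pass  # hypothetical future spec with a loss: excluded by default


specs = [ExperimentSpec(), PositivitySpec(), RegularizationSpec()]
chi2_specs = [s for s in specs if s.count_me_in_the_total_chi2]
assert len(chi2_specs) == 1
```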

@Zaharid
Contributor

Zaharid commented Sep 20, 2019 via email

@scarlehoff
Member Author

To me it looks like the "correct" way is too easy for me to leave a hard-coded, flawed approach in just because the one example we have now is unambiguous.
That would inevitably lead to less flexibility, which I'd rather avoid (when I am lucky enough to catch it, of course).

If you think it is not a good idea to do so right now at the level of vp, I'll do it at the level of reader.py, which for my purposes is the same thing.

@Zaharid
Contributor

Zaharid commented Sep 20, 2019 via email

@scarlehoff
Member Author

We agree on that.

I'm just saying I don't want to do a type check (or duck typing) four levels down when they are different things already at the top level (be it vp or reader.py), so that what Stopping receives can already know whether it will be counted or not.

Of course it has to be written manually, I was just wondering whether this information should already come from vp. If not I'll put it manually in the reader.

other (imagined) possibilities

For instance, if I want to fit a positivity set. Being real or not is beside the point: the design principle is full flexibility whenever possible, so from two options* we choose the one that offers more flexibility, even if the use case is not there yet.

*comparable options in the required amount of work

@Zaharid
Contributor

Zaharid commented Sep 20, 2019

We agree on that.

So I guess I am not sure what we are discussing anymore :D.

I'm just saying I don't want to do a type check (or ducktyping) four levels down when they are different things already at the top-level (be it vp or reader.py) and so what the Stopping receives can already know whether it will be counted or not.

Indeed vp should provide you with a set of data and another set of positivity constraints, but then again it already does that, no? They are not mixed up anywhere AFAICT.

@scarlehoff
Member Author

So I guess I am not sure what we are discussing anymore :D

I'm just trying (and failing) to get my point across. I'll first make this change (the way I have in my head) and then I'll do the fitstate.

The way I have it in my head is not incompatible with yours, and I think it is an improvement. If, once it is implemented, you like it, it can be propagated up to vp in whatever way is convenient.

@scarlehoff
Member Author

I have two different keys in the dictionaries:


'positivity' : bool - is this a positivity set?
'count_chi2' : bool - should this be counted towards the chi2?

I use positivity to decide which datasets (in the broad sense of the word) should be counted towards the positivity loss and thus affect the veto. Independently, I use count_chi2 to decide which datasets' loss should enter the chi2.

At the moment, effectively count_chi2 = !positivity, but done this way we have extra flexibility and in the future we can do more complicated things (for instance, penalties that affect the training or validation loss but are not positivity sets and are not part of the chi2).

Furthermore, adding an extra boolean flag to a dictionary is the least harmful of things.

I hope the code makes clearer what I failed to explain in several previous comments (have a look just at parse_ndata at the top of stopping.py if you want a peek).
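A minimal sketch of how the two independent flags could drive the bookkeeping (the dict layout and dataset names here are illustrative, not the actual parse_ndata structure):

```python
# Hypothetical layout: each dataset (in the broad sense) carries two
# independent boolean flags, so positivity veto and chi2 selection are
# decided separately rather than inferred from one another.
datasets = {
    "NMC":     {"ndata": 204, "positivity": False, "count_chi2": True},
    "posDYU":  {"ndata": 30,  "positivity": True,  "count_chi2": False},
    # a future penalty: affects the loss but is neither positivity nor chi2
    "penalty": {"ndata": 1,   "positivity": False, "count_chi2": False},
}

# Which losses enter the chi2, and which feed the positivity veto
chi2_sets = [name for name, d in datasets.items() if d["count_chi2"]]
pos_sets = [name for name, d in datasets.items() if d["positivity"]]

assert chi2_sets == ["NMC"]
assert pos_sets == ["posDYU"]
```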

@scarlehoff
Member Author

scarlehoff commented Sep 24, 2019

Ok, @Zaharid, have a look now.

There are a few things which can be improved, but either they depend on #570 (i.e., the horrible _compute_validation_loss) or they are at the moment outside the scope of this PR (this includes the only comment I haven't marked as resolved, about the pickle).

@Zaharid
Contributor

Zaharid commented Sep 24, 2019

From a quick look, seems a lot better now! Will try to go over it ASAP.

Contributor

@Zaharid Zaharid left a comment


Might make some more comments (particularly around the handling of the various dictionaries, which I need to study a bit more), but I definitely think this is an improvement.

return tr_ndata_dict, vl_ndata_dict, pos_set


class FitState:
Contributor


Could these go in a different file? I'd not search for them in something called stopping.

Member Author


But they are only used in this file so if you are looking for them you already have stopping.py opened.

Member Author


They are only used inside this class.
Don't really care much, but I think it is cleaner in the same file under the rationale "if you are looking for these classes, you have already opened this file"

Contributor


I guess if we had started from this, the stopping functionality would be inside or around it, i.e. it would belong to the fit rather than stopping. But fine.

# a terrible run, mark it as such
self.terrible = True

def __iter__(self):
Contributor

@Zaharid Zaharid Sep 27, 2019


I think this should work differently. Most importantly, I know of no python object, either in the stdlib or in some famous library, where next(iter(object)) mutates object, and I'd find that confusing if I encountered it. There is little point in yielding i because you can always call enumerate on the thing. Finally, is this used anywhere at all (in the proper way, e.g. as a for loop)?

I'd either remove this or have it simply yield the states. A method called something like rewind(self, frame_index) -> new history might be useful though.

Member Author

@scarlehoff scarlehoff Sep 27, 2019


But you do nothing with the states, that's why I am just returning the step index.
Maybe it makes more sense to have a two-step thing where you would just do


for i in range(history_size):
    history.rewind(step=i)
    # do whatever with your newly reloaded state

?

Member Author


It is a bit tricky because there is really no need to return anything, so the options are:

for i in history:
    # do thing

where the loop mutates history and you just receive a counter, or

for step in range(len(history)):
    history.rewind(step)
    # do thing

so that the mutation of history is done explicitly. I'm changing to the second version.
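The explicit-rewind version could be sketched like this (History here is a hypothetical minimal class, not the actual n3fit implementation):

```python
# Minimal sketch: iterating over the steps never mutates the history;
# rewind() is the single, explicitly-called mutation point.
class History:
    def __init__(self):
        self._states = []
        self.current = None

    def save(self, state):
        self._states.append(state)

    def __len__(self):
        return len(self._states)

    def rewind(self, step):
        # The only mutation, and it happens in plain sight at the call site
        self.current = self._states[step]


history = History()
for epoch_state in ("s0", "s1", "s2"):
    history.save(epoch_state)

for step in range(len(history)):
    history.rewind(step)
    # do whatever with the reloaded state, e.g. render an animation frame

assert history.current == "s2"
```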

Member Author


Damn it, I thought the previous comment was never sent. I blame unimi's eduroam. I've committed this change now. It is uglier than before (to me) but leaves little to the imagination.

n3fit/src/n3fit/stopping.py (several outdated review threads, resolved)
# Arguments:
- `validation_model`: a reference to a Validation object
this is necessary at the moment in order to save the weights
- `save_each`: if given it will save a snapshot of the fit every `save_each` epochs
Contributor


Also save_each is confusing: save_weights_each or something like that.

Furthermore, is there a case for even having two lists? Couldn't we save the chi2s less frequently but at the same time as the weights?

Member Author


The weights are heavy.
You can save the chi2 less frequently, but you still need to save the chi2 every X epochs if you want to write logs, and the logs are not necessarily in sync with the best weights.
You could set it so that they are only saved when debug: true, but you would still be forced to keep the two-lists implementation in...
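The two-cadence argument can be sketched as follows (the function, parameter names, and loop structure are hypothetical, not the actual n3fit training loop):

```python
# Sketch: the chi2 log and the weight snapshots run on independent
# cadences, hence the two separate lists. The chi2 entries are cheap and
# frequent; the weight snapshots are heavy and rare.
def run_fit(epochs, log_each=100, save_weights_each=1000):
    chi2_log = []          # cheap: stored often, feeds the fit logs
    weight_snapshots = []  # heavy: stored rarely

    for epoch in range(1, epochs + 1):
        chi2 = 1.0 / epoch  # stand-in for the real training loss
        if epoch % log_each == 0:
            chi2_log.append((epoch, chi2))
        if epoch % save_weights_each == 0:
            weight_snapshots.append((epoch, "weights-at-%d" % epoch))
    return chi2_log, weight_snapshots


log, snaps = run_fit(2000)
assert len(log) == 20 and len(snaps) == 2  # out of sync by construction
```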

@Zaharid
Contributor

Zaharid commented Oct 2, 2019

Hello, this is @Zaharid's automated QA script. Please note it is highly experimental. I ran pylint on your changes and found some new issues.

On n3fit/src/n3fit/ModelTrainer.py, pylint has reported the following new issues:

  • Line 26: Too many instance attributes (23/7)
  • Line 134: Attribute '_model_file' defined outside init

On n3fit/src/n3fit/backends/keras_backend/MetaModel.py, pylint has reported the following new issues:

  • Line 72: Parameters differ from overridden 'fit' method
  • Line 91: Parameters differ from overridden 'evaluate' method
  • Line 103: TODO: make it into a dictionary of {'output_layer_name' : loss}
  • Line 114: Parameters differ from overridden 'compile' method

On n3fit/src/n3fit/fit.py, pylint has reported the following new issues:

  • Line 15: Too many statements (102/50)
  • Line 284: Cell variable layer_pdf defined in loop
  • Line 284: Cell variable integrator_input defined in loop

On n3fit/src/n3fit/io/writer.py, pylint has reported the following new issues:

  • Line 49: TODO decide how to fill up this in a sensible way
  • Line 177: Line too long (113/100)

On n3fit/src/n3fit/model_gen.py, pylint has reported the following new issues:

  • Line 21: Too many local variables (32/22)

On n3fit/src/n3fit/msr.py, pylint has reported the following new issues:

  • Line 181: Use % formatting in logging functions and pass the % parameters as arguments

On n3fit/src/n3fit/stopping.py, pylint has reported the following new issues:

  • Line 16: TODO for TF 2.0
  • Line 112: Too many instance attributes (8/7)
  • Line 222: Too many instance attributes (10/7)
  • Line 485: TODO: most of the functionality of this function is equal to that of parse_training


@Zaharid
Contributor

Zaharid commented Oct 9, 2019

Thanks!

I went over this and I am happy with it.

Do we have tests that show that it works fine, especially in things that are tricky such as rewinding the history?

@scarlehoff
Member Author

Nothing systematic yet.
Next, I'll try adding the regression test I mentioned last week as a starting point.

@scarrazza scarrazza merged commit d86fa32 into master Oct 10, 2019
@scarrazza scarrazza deleted the n3fit-refactor-stopping branch April 22, 2020 15:34