Add "final_extra_valid_opt_filepath" to train_model.py #3883
Conversation
Also adds a way to configure saving outputs to JSON. Tested with a new test. (Not super attached to the names or to the default values I've got set. I tried using 'Optional[str]' for the cmdline arg, but the test was not happy with that...)
Just want to finish discussion
parlai/scripts/train_model.py
Outdated
help="A '.opt' file that is used for final eval. Useful for setting skip-generation to false. 'datatype' must be included as part of the opt.", | ||
) | ||
train.add_argument( | ||
'--write-log-as-json', |
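The diff above is cut off mid-declaration, so here is a hedged reconstruction of roughly what the two new flags look like, written as a self-contained sketch with plain argparse rather than ParlAI's ParlaiParser. The first flag's name (--final-extra-valid-opt-filepath) is inferred from the PR title, and the defaults are assumptions, not the merged code.

```python
# Hedged sketch only: plain argparse stand-in for ParlAI's ParlaiParser.
# Flag names and defaults are assumptions inferred from the PR title and
# the diff above, not the merged implementation.
import argparse

parser = argparse.ArgumentParser()
train = parser.add_argument_group('Training Loop Arguments')
train.add_argument(
    '--final-extra-valid-opt-filepath',
    type=str,
    default='',
    help=(
        "A '.opt' file that is used for final eval. Useful for setting "
        "skip-generation to false. 'datatype' must be included as part of the opt."
    ),
)
train.add_argument(
    '--write-log-as-json',
    action='store_true',  # ParlAI itself would register this via its own 'bool' type
    help='Also dump the final reports to .json files.',
)

opt = vars(parser.parse_args([]))
print(opt['final_extra_valid_opt_filepath'], opt['write_log_as_json'])
```

If the default really is an empty string, that would sidestep the Optional[str] issue mentioned in the description: the option is always a plain str, and falsiness signals "not set".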
Aren't the final evals stored in the trainstats file already? If not, my suggestion would be we include them in the trainstats.
And not even make it an option. Just do it.
agree with all of @stephenroller's suggestions
parlai/scripts/train_model.py
Outdated
    with PathManager.open(
        opt['model_file'] + log_suffix + '.' + datatype + ".json", 'a'
    ) as f:
        json.dump(dict_report(report), f)
I think we'd prefer keeping it with the rest of the .trainstats
?
The other valid/reports aren't saved to .trainstats right now.
Every validation run, except the final one, is stored in trainstats.
ParlAI/parlai/scripts/train_model.py
Lines 474 to 481 in 9b5f8c6
    json.dump(
        {
            'parleys': self.parleys,
            'train_time': self.train_time.time(),
            'train_steps': self._train_steps,
            'total_epochs': self._total_epochs,
            'train_reports': self.train_reports,
            'valid_reports': self.valid_reports,
I'm humbly requesting you expand it to also include the final valid/test reports. (They can just be None during the intermediate stages)
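For illustration, a minimal, self-contained sketch of that request: carry the final valid/test reports inside the same .trainstats payload, with None placeholders until the final evals have actually run. The field names final_valid_report and final_test_report are assumptions, not necessarily what the PR ends up using.

```python
# Hedged sketch: the 'final_valid_report'/'final_test_report' field names
# are assumptions, not necessarily what the merged PR uses.
import json

def write_trainstats(
    path,
    parleys,
    train_time,
    train_steps,
    total_epochs,
    train_reports,
    valid_reports,
    final_valid_report=None,  # stays None during intermediate saves
    final_test_report=None,   # filled in once the final evals have run
):
    with open(path, 'w') as f:
        json.dump(
            {
                'parleys': parleys,
                'train_time': train_time,
                'train_steps': train_steps,
                'total_epochs': total_epochs,
                'train_reports': train_reports,
                'valid_reports': valid_reports,
                'final_valid_report': final_valid_report,
                'final_test_report': final_test_report,
            },
            f,
        )

# intermediate checkpoint: final reports are still None
write_trainstats('/tmp/model.trainstats', 1000, 123.4, 500, 1.0, [], [])
```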
Alright, the wording wasn't quite clear there before. Anyway, done.
Seems like you have a bug when model file isn't set (in that case we should just be throwing trainstats into the void). Otherwise lgtm, thanks for making the changes.
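For what it's worth, a hedged sketch of the guard being asked for: skip the JSON dump entirely when no model_file is configured, so the report is dropped instead of crashing on a missing path. The import paths are my best guess at ParlAI's layout, and log_suffix, datatype, and report are assumed to mean the same thing as in the diff earlier in this thread.

```python
# Hedged sketch of the missing guard. Imports are a best guess at ParlAI's
# layout; the arguments are assumed to match the diff above.
import json

from parlai.utils.io import PathManager
from parlai.core.metrics import dict_report

def maybe_write_final_report(opt, log_suffix, datatype, report):
    model_file = opt.get('model_file')
    if not model_file:
        # no model_file set: silently discard the report instead of crashing
        return
    with PathManager.open(model_file + log_suffix + '.' + datatype + '.json', 'w') as f:
        json.dump(dict_report(report), f)
```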
I've been annoyed at having to kick off separate eval loops at the ends of trains, so here we are.
Also added a way to configure saving outputs to JSON (because it's something I've wanted for a while, and I didn't want to add another return value in case scripts elsewhere were expecting only v_report and t_report).
Not super attached to the names or to the default values I've got set. I tried using 'Optional[str]' for the cmdline arg, but the test was not happy with that...
Testing steps
Wrote a new test that exercises all of the new functionality.
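For context, a hedged usage sketch of what the feature might look like from ParlAI's Python API, assuming the option is exposed under the name in the PR title (final_extra_valid_opt_filepath); the exact merged name, and the task/model/paths below, are illustrative assumptions only.

```python
# Hedged usage sketch; the flag name is taken from the PR title, and the
# task, model, and file paths are purely illustrative.
from parlai.scripts.train_model import TrainModel

TrainModel.main(
    task='convai2',
    model='transformer/generator',
    model_file='/tmp/demo_model',
    num_epochs=0.01,
    # run one extra final eval driven by an .opt file, e.g. one that sets
    # skip_generation=False and includes a 'datatype' entry
    final_extra_valid_opt_filepath='/tmp/final_eval.opt',
)
```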