Refactoring of Evaluation and adding of evaluate command #264
Conversation
`fast_llm/engine/training/config.py` (outdated)
```python
hint=FieldHint.feature,
valid=skip_valid_if_none(check_field(Assert.gt, 0)),
)
class TrainingEvaluatorConfig(EvaluatorConfigBase):
```
**Reviewer:** It should still inherit from `IntervalConfig`, so it's simpler and we don't have the redundant `run_interval`.
**Author:** I prefer composition over multiple inheritance here: `IntervalConfig` is a property of the evaluator wrapper (which `TrainingEvaluatorConfig` is), not of the new entity derived from `EvaluatorConfigBase`. In my opinion, this makes the code much more readable. However, multiple inheritance is already used in many places, so if you still prefer that approach after my explanation, I'm happy to refactor accordingly.
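To make the trade-off concrete, here is a minimal sketch of the two shapes under discussion. The stub base classes and field layout are illustrative assumptions, not the PR's actual code (the real classes carry `@config_class()` decorators and `Field` definitions, elided here).

```python
# Illustrative stubs standing in for fast_llm's config machinery.
class Config:
    pass

class IntervalConfig(Config):
    interval: int = 1  # hypothetical scheduling fields
    offset: int = 0

class EvaluatorConfig(Config):
    def get_evaluator(self):
        ...

class EvaluatorConfigBase(Config):
    def get_evaluator(self):
        raise NotImplementedError

# Composition (this PR): the interval is a field of the wrapper,
# kept separate from the wrapped evaluator config.
class TrainingEvaluatorConfigComposed(EvaluatorConfigBase):
    evaluator: EvaluatorConfig
    run_interval: IntervalConfig

# Multiple inheritance (review suggestion): the interval fields are
# inherited directly, removing the redundant run_interval wrapper.
class TrainingEvaluatorConfigMixed(EvaluatorConfigBase, IntervalConfig):
    evaluator: EvaluatorConfig
```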
**Reviewer:** Multiple inheritance is only needed because of the `EvaluatorConfigBase` mixin, which is arguably not even necessary. I'd rather prioritize better usage (configs) over marginally simpler code.
**Author:** OK, I will use multiple inheritance but will retain `EvaluatorConfigBase` so I don't need to rewrite `EvaluatorRunner`. Then we will create a separate config for the `evaluate` command, in addition to it accepting the training config. We need to discuss what it should look like, so I've created issue #285 for it.
```python
else 0
)

def get_evaluator(
```
**Reviewer:** Seems unnecessary; we can just call `evaluator.get_evaluator()`.
**Author:** I explained in detail in #222 (comment) and #222 (comment), but in short, it is to maintain proper encapsulation.
**Reviewer:** There is no encapsulation needed though. `TrainingEvaluatorConfig` is a fixed class with an `evaluator: EvaluatorConfig` field, which is dynamic but has a well-defined `get_evaluator` method. Encapsulation would be needed if we allowed a more generic scenario where `evaluator.get_evaluator` doesn't exist or has a different signature, e.g. a more generic `evaluator` or a generalized `TrainingEvaluatorConfig` that doesn't have an `evaluator` field. I don't really see this happening anytime soon...
**Author:** Another thing I can't really work around here is that I need to return a `TrainingEvaluator`, not `evaluator.get_evaluator()`. This is because `TrainingEvaluator` is responsible for deciding whether evaluators should or should not run during training. Neither the concrete evaluators nor the `EvaluatorRunner` are aware of this; they simply execute.
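A minimal sketch of that separation of concerns, assuming a simple iteration-based interval; the method and attribute names below are illustrative, not the PR's actual API.

```python
class TrainingEvaluator:
    # Wraps an inner evaluator and decides *when* it runs during training.
    # The inner evaluator and the EvaluatorRunner know nothing about the
    # schedule; they simply execute when asked.
    def __init__(self, evaluator, run_interval: int):
        self._evaluator = evaluator
        self._run_interval = run_interval

    def should_run(self, iteration: int) -> bool:
        return self._run_interval > 0 and iteration % self._run_interval == 0

    def run(self, iteration: int):
        if self.should_run(iteration):
            return self._evaluator.run()
        return None

# Inside TrainingEvaluatorConfig.get_evaluator, the wrapper is returned
# instead of self.evaluator.get_evaluator() itself (sketch):
#     return TrainingEvaluator(self.evaluator.get_evaluator(), self.run_interval)
```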
```python
@config_class()
class EvaluatorConfigBase(Config):
```
**Reviewer:** I don't think that's necessary.
**Author:** I explained in detail in #222 (comment) and #222 (comment), but in short, it is to maintain proper encapsulation. I will move it in #282, so we can close this faster.
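For context, a sketch of what such a base buys: everything `EvaluatorRunner` consumes exposes one well-known constructor method, whether it is a concrete evaluator config or a training-time wrapper around one. The stub `Config` and the exact signature are assumptions.

```python
import abc

class Config:  # stand-in for fast_llm's Config base class
    pass

class EvaluatorConfigBase(Config, abc.ABC):
    # The single interface EvaluatorRunner relies on; concrete evaluator
    # configs and wrappers such as TrainingEvaluatorConfig both provide it.
    @abc.abstractmethod
    def get_evaluator(self):
        raise NotImplementedError
```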
```python
# @pytest.mark.extra_slow
@requires_cuda
def test_loss_validation_vs_inference(model_and_tokenizer):
```
**Reviewer:** I don't think this test is worth it; it's kind of trivial.
✨ Description
Creates an `Evaluator` abstraction so that additional evaluators beyond loss can be added.
Adds an `evaluate` command that accepts the same training config and enables evaluation on the last checkpoint. Includes some fixes.
Example: specifying multiple `LossEvaluator`s (see the sketch below).
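The snippet below is a hedged sketch in plain Python of what such a config might look like; the actual config would typically be written in YAML, and the section and field names (`evaluations`, `type`, `dataset`, `interval`) are assumptions rather than the PR's exact schema.

```python
# Hypothetical training-config fragment showing two loss evaluators that
# run on different datasets at different intervals. Names are illustrative.
training_config = {
    "evaluations": {
        "validation_loss": {
            "type": "loss",
            "dataset": "validation",
            "interval": 100,  # evaluate every 100 training iterations
        },
        "test_loss": {
            "type": "loss",
            "dataset": "test",
            "interval": 1000,
        },
    },
}
```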