MTBenchEvaluator and MMLUEvaluator should be/have static methods #26

JamesKunstle · 2024-06-26T21:12:39Z

Evaluator objects shouldn't be reused: once we've evaluated a checkpoint or model, we want to save the score and move on to the next. This motivates a reasonable design change, implementing something like:

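class MMLUEvaluator(Evaluator):

    def __init__(self):
        # optional empty initialization
        ...

    def run(self):
        # instance-level entry point
        ...

    @staticmethod
    def run_static(model, tasks, few_shot, batch):
        # A static method takes no `self`; it receives only its explicit arguments.
        # `run_static` is a placeholder name: a second `run` would shadow the one above.
        ...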
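To make the proposal concrete, here is a minimal sketch of a fully static evaluator. The `Evaluator` base class is only a stand-in for the project's real one. The placeholder scores, task names, and checkpoint names are all hypothetical, and a real `run` would call the benchmark harness instead of returning zeros.

```python
class Evaluator:
    """Hypothetical stand-in for the project's Evaluator base class."""


class MMLUEvaluator(Evaluator):
    @staticmethod
    def run(model, tasks, few_shot, batch):
        # Placeholder scoring (assumption): a real implementation would
        # evaluate `model` on each task and return the measured scores.
        return {task: 0.0 for task in tasks}


# No instance is created, so no state carries over between checkpoints.
scores = {
    checkpoint: MMLUEvaluator.run(checkpoint, ["task_a", "task_b"], few_shot=5, batch=8)
    for checkpoint in ["checkpoint_1", "checkpoint_2"]
}
```

Because each call receives everything it needs as arguments, the caller stores the returned score and moves on to the next checkpoint without constructing or resetting an evaluator object.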

nathan-weinberg · 2024-06-30T20:15:25Z

@JamesKunstle if we're outputting the evaluation results and saving them to a file or a variable in memory, why not reuse the evaluator class?

alinaryan · 2024-09-27T16:00:00Z

@JamesKunstle do you have a PR for this already?

nathan-weinberg added the mmlu (Pertains to MMLU) and mtbench (Pertains to MTBench) labels on Jun 30, 2024

This was referenced Jul 1, 2024

[Epic] MT Bench #31

Closed

[Epic] MMLU #29

Closed
