# Testing

## Installation

Create the conda environments `deepseekenv`, `mammoth`, and `lm-eval` by following the installation instructions in DeepSeek-Math, MAmmoTH, and lm-evaluation-harness, respectively.
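
For reference, setting up the environments might look like the sketch below. The Python version and install commands are assumptions, so defer to each repository's own instructions.

```bash
# Hedged sketch: the Python version and commands here are assumptions;
# follow the DeepSeek-Math, MAmmoTH, and lm-evaluation-harness READMEs
# for the authoritative setup steps.
conda create -n deepseekenv python=3.10 -y
conda create -n mammoth python=3.10 -y
conda create -n lm-eval python=3.10 -y

# Example: install lm-evaluation-harness from source into its environment.
conda activate lm-eval
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
cd lm-evaluation-harness
pip install -e .
```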

## Evaluation

To evaluate the models, run `test/scripts/eval.sh`:

```bash
bash test/scripts/eval.sh $MODEL_PATH $OUTPUT_DIR
```

Replace `$MODEL_PATH` with the path to the directory containing the model weights, and `$OUTPUT_DIR` with the path to the directory where you want the inference results to be saved. The script also creates a result directory under `test/scripts`, where markdown files containing a table of all the results are saved.
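
For example, with hypothetical paths (substitute your own):

```bash
# Both paths below are placeholders for illustration.
bash test/scripts/eval.sh /data/checkpoints/my-model /data/eval-outputs/my-model
```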