Install the conda environments deepseekenv, mammoth, and lm-eval by following the instructions in DeepSeek-Math, MAmmoTH, and lm-evaluation-harness.
To evaluate the models, run scripts/eval.sh:
bash test/scripts/eval.sh $MODEL_PATH $OUTPUT_DIR
Replace $MODEL_PATH with the path to the directory where the model weights are stored, and $OUTPUT_DIR with the path to the directory where the inference results should be saved. The script also creates a result directory under test/scripts, where markdown files containing a table of all the results are saved.
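As a rough sketch of a typical invocation, the wrapper below checks the two arguments before calling the script. The placeholder paths and the helper name check_eval_args are hypothetical and not part of the repository. The sketch also assumes that the command is run from the repository root and that creating the output directory in advance is harmless.

```shell
# Hypothetical placeholder paths; substitute your own.
MODEL_PATH="${MODEL_PATH:-/path/to/model_weights}"
OUTPUT_DIR="${OUTPUT_DIR:-/path/to/inference_outputs}"

# Hypothetical helper (not part of the repository): validate the arguments.
check_eval_args() {
    # Fail early if the model weights directory does not exist.
    if [ ! -d "$1" ]; then
        echo "model directory not found: $1" >&2
        return 1
    fi
    # Create the output directory if it is missing (assumed to be harmless).
    mkdir -p "$2"
}

# Run the evaluation only when the arguments look valid.
if check_eval_args "$MODEL_PATH" "$OUTPUT_DIR"; then
    bash test/scripts/eval.sh "$MODEL_PATH" "$OUTPUT_DIR"
fi
```

Once the run finishes, the markdown result tables should appear in the result directory under test/scripts, as described above.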