Cannot replicate the performance of distilled 1.5B model #194
Comments
Same here, I'm also looking into the code. It could be due to the temperature setting difference: the sampling temperature now defaults to 1.0, I think. I don't know if this causes much of a problem.
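One way to pin the temperature explicitly is via the `generation_parameters` field in the vLLM model args; this is a sketch that assumes a recent lighteval version supporting that field, and the 0.6 / 0.95 values are the ones DeepSeek's model card recommends for the R1 distills, not values confirmed in this thread:

```shell
# Pin sampling instead of relying on the default temperature of 1.0
MODEL_ARGS="pretrained=deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B,dtype=bfloat16,max_model_length=32768,generation_parameters={temperature:0.6,top_p:0.95}"
```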
Hello @huangyuxiang03, we've found a regression in our LaTeX parser, and bumping to the new version should fix the discrepancy. Please let me know if that works!
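For reference, the upgrade itself is a one-liner; the 1.0.6 version pin below is taken from the follow-up comment, so adjust if a newer release exists:

```shell
pip install --upgrade "latex2sympy2_extended>=1.0.6"
```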
Hello @lewtun, after successfully installing latex2sympy2_extended-1.0.6, my result for deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B on math-500 and aime_24 is as follows:
Hmm, that is odd: after merging #196 I am able to get
Can you please update to |
Hi, could you share your training parameters? I used the official script and my score was relatively low. |
@lewtun
@lewtun Could you please also update the evaluation for code generation?
After installing
Thank you all for the help. Since my problem is solved, I'm closing this issue. |
Hi,
Thanks for your effort!
When I evaluate deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B using the provided code in this repo on math-500, I cannot reproduce the reported performance: I'm only getting 0.756, while open-r1's reported score is 0.816 and DeepSeek reports 0.839 in their technical report. The script I'm using is provided below. Thanks for looking into this issue. Appreciate your work again!
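(The original script did not survive; below is a minimal sketch of the kind of lighteval command open-r1 documents for math_500. The model args, task spec, and output path are assumptions, not the reporter's actual script.)

```shell
MODEL=deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
MODEL_ARGS="pretrained=$MODEL,dtype=bfloat16,max_model_length=32768,gpu_memory_utilization=0.8"
OUTPUT_DIR=data/evals/$MODEL

lighteval vllm "$MODEL_ARGS" "custom|math_500|0|0" \
    --custom-tasks src/open_r1/evaluate.py \
    --use-chat-template \
    --output-dir "$OUTPUT_DIR"
```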