Distributed mlx_lm.evaluate
#1174
base: main
Conversation
This is great! I'm testing it with an M2 Ultra + 2 M4 Max. WOW! Great job @barronalex

Any news on this PR? It would be great to speed up some distributed evals on DeepSeek R1 😜

That’s awesome! I’ll get it in later today.

@awni thanks for the comments! I think this is good to merge now.
@@ -346,11 +361,8 @@ def main():
    )
    parser.add_argument(
        "--apply-chat-template",
It was impossible to disable this before, so I've changed it to be off by default (which mirrors the `lm_eval` behavior).
I'm not sure about defaulting it to off for instruct models... it seems like you would almost always want this on for the models people use regularly. Does it make sense to change this to `--ignore-chat-template` instead, so it can be shut off when needed?
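For reference, here is a minimal argparse sketch of the two flag designs being discussed; this is an illustration, not the PR's actual code:

```python
import argparse

parser = argparse.ArgumentParser()

# Option A (this PR): off by default, opt in explicitly.
# This mirrors lm_eval, where the chat template is not applied unless requested.
parser.add_argument(
    "--apply-chat-template",
    action="store_true",
    help="Apply the tokenizer's chat template before evaluation.",
)

# Option B (suggested above): on by default, with an explicit opt-out.
# parser.add_argument(
#     "--ignore-chat-template",
#     action="store_true",
#     help="Do not apply the tokenizer's chat template.",
# )
```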
Looks great! Just one comment. Let me know what you think. Otherwise LGTM!
Add a distributed version of `mlx_lm.evaluate` that runs on multiple nodes and produces identical outputs.

Also fix a few bugs:
- `batch_size` no longer affects the output
- `loglikelihood_rolling` tasks (e.g. WikiText) now run correctly

On 1 M2 Ultra:

On 4 M2 Ultra:
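For context on how a distributed evaluate can produce identical outputs on every node, here is a minimal sketch of the usual sharding pattern with `mlx.core.distributed`; `score_request` is a hypothetical per-request scoring function, and this is not the PR's actual implementation:

```python
import mlx.core as mx

# One process per node; init() discovers the group from the launch environment.
group = mx.distributed.init()
rank, size = group.rank(), group.size()

def evaluate_distributed(requests, score_request):
    # Each rank scores a strided shard: rank, rank + size, rank + 2*size, ...
    scores = mx.zeros(len(requests))
    for i in range(rank, len(requests), size):
        scores[i] = score_request(requests[i])  # hypothetical scorer
    # Summing the zero-padded per-rank vectors leaves every rank holding the
    # full result in a fixed order, independent of the number of nodes.
    return mx.distributed.all_sum(scores, group=group)
```

A run like the "4 M2 Ultra" benchmark above would then be launched with MLX's `mlx.launch` helper pointed at the participating hosts (exact flags depend on the MLX version).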