Support expert parallel load balancing in Transformers backend #26287
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Code Review
This pull request enables expert parallel load balancing (EPLB) for Mixture-of-Experts models using the Transformers backend. This is achieved by implementing the MixtureOfExperts interface in TransformersMoEBase and adding the necessary state and methods for EPLB to function. The changes also include some nice refactoring in transformers.py to consistently use process group objects.
I've found one critical issue in the implementation of update_physical_experts_metadata that would prevent dynamic load balancing from working correctly. Please see my specific comment for details.
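To illustrate why a method like update_physical_experts_metadata matters for dynamic load balancing, here is a minimal, self-contained sketch. It does not use vLLM, and it is not the code from this pull request: ToyMoELayer, ToyMoEModel, and all field names are hypothetical stand-ins. The sketch assumes that expert counts are cached both on the model and on each MoE layer, so a rebalancing step that changes the number of physical experts has to update every copy.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMoELayer:
    """Hypothetical per-layer bookkeeping (not vLLM's actual layer class)."""
    n_logical_experts: int
    n_physical_experts: int
    n_local_physical_experts: int

    @property
    def n_redundant_experts(self) -> int:
        # Physical experts beyond the logical ones are replicas.
        return self.n_physical_experts - self.n_logical_experts


@dataclass
class ToyMoEModel:
    """Hypothetical model-level view of the same counts."""
    num_logical_experts: int
    num_physical_experts: int
    num_local_physical_experts: int
    layers: list = field(default_factory=list)

    def update_physical_experts_metadata(
        self, num_physical_experts: int, num_local_physical_experts: int
    ) -> None:
        # Update the model-level counts first...
        self.num_physical_experts = num_physical_experts
        self.num_local_physical_experts = num_local_physical_experts
        # ...then propagate to every layer so no stale copy survives.
        for layer in self.layers:
            layer.n_physical_experts = num_physical_experts
            layer.n_local_physical_experts = num_local_physical_experts


# Usage: 8 logical experts on 4 ranks, then 4 replicas are added.
model = ToyMoEModel(
    num_logical_experts=8,
    num_physical_experts=8,
    num_local_physical_experts=2,
    layers=[ToyMoELayer(8, 8, 2) for _ in range(3)],
)
model.update_physical_experts_metadata(
    num_physical_experts=12, num_local_physical_experts=3
)
```

In this toy setup, skipping the loop over layers would leave each layer reporting the old expert counts after a rebalance, which is the general kind of inconsistency such a method has to prevent.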
Isotr0py left a comment
LGTM
…project#26287) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Karan Goel <3261985+karan@users.noreply.github.com>
…project#26287) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…project#26287) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
MixtureOfExperts mixin to TransformersMoEBase