Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify Tensor Parallel implementation with PyTorch TP #34184
Simplify Tensor Parallel implementation with PyTorch TP #34184
Changes from 15 commits
e60fb87
fd7f7c7
9224cab
79cc524
a2934b3
a8fc418
e84a388
396d158
7b346b5
d60679b
dda058a
12fbbe7
02c8c39
073c521
db6e5ee
5bb294e
290a7f1
bd2e89c
4892cef
9648f31
93ba283
73524c9
f312e55
ca93bdb
dc2672f
1e27d6f
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
feels simpler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, the reason for
self.base_model is self
is that we are attaching the base_model_tp_plan to the base model only. If we attach it toLlamaForCausalLM
, the FQNs won't match, because thebase_model_tp_plan
FQNs start with "layers", not "model.layers".There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but
self._tp_plan
should be None only for the base model no?We could maybe add something like
if not self.base_model is self and self._tp_plan is None and self.supports_tp_plan
raise an error in the futur, to force people to add the TP plan.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not always the case. For example,
LlamaForSequenceClassification
andLlamaForQuestionAnswering
have_tp_plan=None
(at their top level), whileLlamaForCausalLM
has a_tp_plan = {"lm_head": ...}
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, indeed. I thiught we should enforce TP plan definition for all classes to avoid user errors but its fine like this!