-
Notifications
You must be signed in to change notification settings - Fork 31.1k
Add more missing arguments #40354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more missing arguments #40354
Conversation
0243a3f to
486ed34
Compare
3377862 to
778911a
Compare
978c4a6 to
ca07631
Compare
ca07631 to
0c8add6
Compare
|
|
||
| def __init__(self, config: Cohere2Config, layer_idx: Optional[int] = None): | ||
| super().__init__() | ||
| nn.Module.__init__(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question! In some cases, you're replacing super().__init__() with direct calls to the parent class, like here, but in other cases you're doing the opposite, like DeepseekVLImageProcessorFast().__init__(**kwargs) being replaced by super().__init__(**kwargs). I think we generally prefer the super() form unless there's a strong reason not to use it. Why replace super() with nn.Module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because some modulars don't want to call the parent constructor because the members of the parent constructor are immediately rewritten below. That is why some modulars use multiple inheritance with an additional nn.Module to do just the initialization of nn.Module. However, nn.Module is already the gradparent, so why not directly call it?
In other cases, the constructor wants to call the parent constructor but the syntax is wrong.
So I considered two possibilities in each case and decided how to fix...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flagging this as a potential modular issue, or something we should give guidance on when writing modular files. cc @ArthurZucker @Cyrilvallez
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would to add the missing PyLint check after fixing all these to prevent future errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are special syntaxes - super() in modular unravels the parent's code, while Parent.__init__ should be replaced by super. I see this is not the case, as modular does not correctly replace Parent.__init__
I will fix it, but until then please hold on to this PR!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some changes are in modular files. The changes of converter code may not apply to them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merged the PR that should fix it, feel free to rebase on it! Ping me once it's done so that I can check!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, you should be able to apply the changes to the modulars (i.e. add the missing self arg and remove redundant parent classes), but the modeling files should not change after applying the converter. If they do and it's not intended, something is still wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cyrilvallez There remain some changes. I believe these are missing corner cases in your recent PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I don't see any in your PR... Only the modular are changed
5e59de6 to
5f935bd
Compare
|
@Cyrilvallez Ping |
6ed821c to
611c8b8
Compare
Signed-off-by: cyy <cyyever@outlook.com>
611c8b8 to
b6b59d0
Compare
|
[For maintainers] Suggested jobs to run (before merge) run-slow: bamba, cohere2, d_fine, data2vec, dia, dots1, ernie4_5, evolla, falcon_h1, falcon_mamba, florence2, gemma, gemma2, gemma3, gemma3n, glm4_moe |
Cyrilvallez
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
What does this PR do?
Follows #40068