@cyyever
Contributor

@cyyever cyyever commented Aug 21, 2025

What does this PR do?

Follows #40068

@cyyever cyyever marked this pull request as draft August 21, 2025 16:18
@cyyever cyyever force-pushed the missing_parameter2 branch 2 times, most recently from 0243a3f to 486ed34 Compare August 21, 2025 16:26
@cyyever cyyever changed the title Add missing arguments Add more missing arguments Aug 21, 2025
@cyyever cyyever force-pushed the missing_parameter2 branch 18 times, most recently from 3377862 to 778911a Compare August 22, 2025 06:59
@cyyever cyyever marked this pull request as ready for review August 22, 2025 07:01
@cyyever cyyever force-pushed the missing_parameter2 branch 5 times, most recently from 978c4a6 to ca07631 Compare August 22, 2025 10:25
@cyyever cyyever force-pushed the missing_parameter2 branch from ca07631 to 0c8add6 Compare August 22, 2025 12:48

  def __init__(self, config: Cohere2Config, layer_idx: Optional[int] = None):
-     super().__init__()
+     nn.Module.__init__(self)
Member

@Rocketknight1 Rocketknight1 Aug 22, 2025

One question! In some cases, you're replacing super().__init__() with direct calls to the parent class, like here, but in other cases you're doing the opposite, like DeepseekVLImageProcessorFast().__init__(**kwargs) being replaced by super().__init__(**kwargs). I think we generally prefer the super() form unless there's a strong reason not to use it. Why replace super() with nn.Module?
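
To illustrate why the `Class().__init__(**kwargs)` form is a bug rather than a style choice, here is a minimal sketch in plain Python. `BaseProcessor`, `BrokenProcessor`, and `FixedProcessor` are hypothetical stand-ins (not the real image processor classes), and the sketch assumes the class being instantiated is the parent of the class whose constructor is running.

```python
class BaseProcessor:
    """Hypothetical stand-in for a parent image processor class."""

    def __init__(self, **kwargs):
        self.size = kwargs.get("size", 224)


class BrokenProcessor(BaseProcessor):
    def __init__(self, **kwargs):
        # Bug: this builds a throwaway BaseProcessor and re-initializes *that*
        # object, so `self` never receives the `size` attribute.
        BaseProcessor().__init__(**kwargs)


class FixedProcessor(BaseProcessor):
    def __init__(self, **kwargs):
        # Initializes `self` through the normal method resolution order.
        super().__init__(**kwargs)


broken = BrokenProcessor(size=384)
fixed = FixedProcessor(size=384)

broken_has_size = hasattr(broken, "size")  # the attribute was set on the throwaway object
fixed_size = fixed.size
```

The broken variant raises no error at construction time, which is probably why this kind of mistake survives until something reads the missing attribute.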

Contributor Author

@cyyever cyyever Aug 22, 2025

Some modulars don't want to call the parent constructor, because the members it sets are immediately overwritten below. That is why some modulars use multiple inheritance with an additional nn.Module base, just to run the nn.Module initialization. However, nn.Module is already the grandparent, so why not call it directly?

In other cases, the constructor wants to call the parent constructor but the syntax is wrong.

So I considered two possibilities in each case and decided how to fix...
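
A minimal sketch of the first case described above, in plain Python. `Module`, `ParentAttention`, and `ChildAttention` are hypothetical names, with `Module` standing in for `nn.Module` so the snippet runs without torch. It shows plain Python semantics only and does not model how the modular converter rewrites these calls.

```python
init_log = []  # records which constructors actually ran


class Module:
    """Hypothetical stand-in for torch.nn.Module."""

    def __init__(self):
        init_log.append("Module")
        self._parameters = {}


class ParentAttention(Module):
    def __init__(self, hidden_size):
        super().__init__()
        init_log.append("ParentAttention")
        self.proj = ("linear", hidden_size, hidden_size)


class ChildAttention(ParentAttention):
    def __init__(self, hidden_size):
        # Deliberately skip ParentAttention.__init__: every member it sets
        # would be overwritten below anyway. Only the grandparent's
        # bookkeeping is initialized, by calling it directly with `self`.
        Module.__init__(self)
        self.proj = ("linear", hidden_size, 2 * hidden_size)


child = ChildAttention(8)
```

After this runs, `init_log` contains only `"Module"`, and `child` is still an instance of `ParentAttention`, so no extra `Module` base class is needed in the class definition to make the direct call work.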

Member

Flagging this as a potential modular issue, or something we should give guidance on when writing modular files. cc @ArthurZucker @Cyrilvallez

Contributor Author

I would like to add the missing PyLint check after fixing all of these, to prevent future errors.

Member

@Cyrilvallez Cyrilvallez Aug 25, 2025

Those are special syntaxes: in modular files, super() unravels the parent's code, while Parent.__init__ should be replaced by super(). I see this is not the case, as modular does not correctly replace Parent.__init__.
I will fix it, but until then please hold off on this PR!

Contributor Author

Some changes are in modular files. The changes to the converter code may not apply to them.

Member

Merged the PR that should fix it, feel free to rebase on it! Ping me once it's done so that I can check!

Member

Basically, you should be able to apply the changes to the modulars (i.e. add the missing self arg and remove redundant parent classes), but the modeling files should not change after applying the converter. If they do and it's not intended, something is still wrong.
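
For readers unfamiliar with the "missing self arg" mentioned here, a small sketch in plain Python follows. `Parent`, `MissingSelf`, and `WithSelf` are hypothetical names. Calling `__init__` through the class rather than through `super()` means the instance has to be passed explicitly.

```python
class Parent:
    def __init__(self):
        self.ready = True


class MissingSelf(Parent):
    def __init__(self):
        # Bug: a call through the class is not bound to any instance,
        # so Python has nothing to pass as `self`.
        Parent.__init__()


class WithSelf(Parent):
    def __init__(self):
        # Correct explicit form: hand the instance over as `self`.
        Parent.__init__(self)


try:
    MissingSelf()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

ok = WithSelf()
```

In plain Python the broken form fails with a `TypeError` as soon as the class is instantiated. The thread suggests the modular files escaped this because the converter rewrites such calls, though that is an inference from the discussion rather than something the sketch demonstrates.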

Contributor Author

@cyyever cyyever Aug 27, 2025

@Cyrilvallez There remain some changes. I believe these are missing corner cases in your recent PRs.

Member

Well, I don't see any in your PR... Only the modular files are changed.

@cyyever cyyever force-pushed the missing_parameter2 branch 6 times, most recently from 5e59de6 to 5f935bd Compare August 27, 2025 14:14
@cyyever
Contributor Author

cyyever commented Aug 27, 2025

@Cyrilvallez Ping

@cyyever cyyever force-pushed the missing_parameter2 branch 3 times, most recently from 6ed821c to 611c8b8 Compare August 28, 2025 00:28
Signed-off-by: cyy <cyyever@outlook.com>
@cyyever cyyever force-pushed the missing_parameter2 branch from 611c8b8 to b6b59d0 Compare August 28, 2025 00:35
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: bamba, cohere2, d_fine, data2vec, dia, dots1, ernie4_5, evolla, falcon_h1, falcon_mamba, florence2, gemma, gemma2, gemma3, gemma3n, glm4_moe

Member

@Cyrilvallez Cyrilvallez left a comment

LGTM, thanks!

@Cyrilvallez Cyrilvallez merged commit 10ddfb0 into huggingface:main Aug 28, 2025
16 checks passed
@cyyever cyyever deleted the missing_parameter2 branch September 1, 2025 05:13