Conversation

Contributor

@zRzRzRzRzRzRzR zRzRzRzRzRzRzR commented Jul 31, 2025

This PR aims to upgrade the implementation of GLM-4.1V to be compatible with GLM-V variants that use different base models but share the same ViT.
The current PR is incorrect, as discussed on Slack, because in the transformers library the `architectures` field does not appear in the text config.

@mergify mergify bot added the new-model Requests to new models label Jul 31, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request aims to make the GLM-4.1V implementation more flexible by supporting different base language models. However, the current changes hardcode a specific language model architecture, which contradicts the stated goal. Additionally, there's an incorrect entry in the model registry that could cause issues. My review focuses on these critical issues and provides suggestions to address them, ensuring the implementation is robust and achieves its intended purpose.

  "GlmForCausalLM": ("glm", "GlmForCausalLM"),
  "Glm4ForCausalLM": ("glm4", "Glm4ForCausalLM"),
  "Glm4MoeForCausalLM": ("glm4_moe", "Glm4MoeForCausalLM"),
+ "Glm4v_moeForConditionalGeneration": ("glm4v_moe_text", "Glm4MoeForCausalLM"),
Contributor


critical

This change incorrectly registers a multimodal architecture Glm4v_moeForConditionalGeneration as a text-generation model. This architecture is already correctly registered as a multimodal model in _MULTIMODAL_MODELS. This entry is likely a mistake and could cause conflicts or unexpected behavior. The module glm4v_moe_text also does not appear to exist in the codebase. This line should be removed.

Comment on lines 1272 to 1276
         self.language_model = init_vllm_registered_model(
             vllm_config=vllm_config,
-            prefix=maybe_prefix(prefix, ""),
-            architectures=["Glm4ForCausalLM"],
-            hf_config=self.config.get_text_config(),
-        )
+            hf_config=config.text_config,
+            prefix=maybe_prefix(prefix, "language_model"),
+            architectures=["Glm4MoeForCausalLM"])
Contributor


critical

This block introduces two main issues:

  1. Incorrect hf_config type: The hf_config argument is set to config.text_config. If config is a transformers.Glm4vConfig object, config.text_config is a dictionary, but init_vllm_registered_model expects a transformers.PretrainedConfig object. This will likely cause a TypeError. The previous implementation correctly used config.get_text_config(), which returns a PretrainedConfig object.

  2. Hardcoded Architecture: The architectures parameter is hardcoded to ["Glm4MoeForCausalLM"]. This contradicts the PR's goal of supporting different base models. The architecture should be determined dynamically from the model's configuration, for example from config.text_config.architectures.

To address these issues, you should ensure hf_config is a PretrainedConfig object and avoid hardcoding the architecture. This will make the implementation more robust and aligned with the PR's objective.

        text_config = (config.get_text_config()
                       if hasattr(config, "get_text_config") else
                       config.text_config)

        # The `text_config` from a multi-modal model's config is often a
        # dictionary. We need to convert it to a PretrainedConfig object.
        if isinstance(text_config, dict):
            # AutoConfig has no `from_dict`; build the model-specific config
            # class from the dict's `model_type` entry instead.
            from transformers import AutoConfig
            text_config = AutoConfig.for_model(**text_config)

        self.language_model = init_vllm_registered_model(
            vllm_config=vllm_config,
            hf_config=text_config,
            prefix=maybe_prefix(prefix, "language_model"))
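The dynamic-architecture point from the review can be sketched without any transformers dependency. This is an illustrative helper, not vLLM code; `text_config` stands in for the model's text sub-config, and the fallback default exists because, as the PR description notes, the `architectures` field may not appear in the text config at all:

```python
# Illustrative sketch of resolving the text-backbone architecture dynamically
# instead of hardcoding it. The default tuple is a hypothetical fallback.
def resolve_architectures(text_config, default=("Glm4MoeForCausalLM",)):
    """Prefer architectures declared in the config; otherwise fall back."""
    if isinstance(text_config, dict):
        declared = text_config.get("architectures")
    else:
        declared = getattr(text_config, "architectures", None)
    return list(declared) if declared else list(default)

print(resolve_architectures({"architectures": ["Glm4MoeForCausalLM"]}))
print(resolve_architectures({}))  # no `architectures` field: uses the fallback
```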

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, covering a small and essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@Isotr0py Isotr0py self-assigned this Jul 31, 2025
Member

@Isotr0py Isotr0py left a comment


Some initial comments. Looking into the text backbone initialization issue.

Isotr0py and others added 6 commits July 31, 2025 23:02
@Isotr0py Isotr0py enabled auto-merge (squash) August 1, 2025 12:21
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Aug 1, 2025
@Isotr0py
Member

Isotr0py commented Aug 1, 2025

BTW, you can enable automatic sign-off in PyCharm: https://docs.vllm.ai/en/latest/contributing/index.html?h=pycharm#dco-and-signed-off-by

  "Gemma3ForConditionalGeneration": ("gemma3_mm", "Gemma3ForConditionalGeneration"), # noqa: E501
  "GLM4VForCausalLM": ("glm4v", "GLM4VForCausalLM"),
  "Glm4vForConditionalGeneration": ("glm4_1v", "Glm4vForConditionalGeneration"), # noqa: E501
+ "Glm4v_moeForConditionalGeneration": ("glm4_1v", "Glm4vForConditionalGeneration"), # noqa: E501
Member


Nearly forgot: can you also update tests/models/registry.py and docs/models/supported_models.md for Glm4v_moeForConditionalGeneration?

Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
@DarkLight1337
Member

Please also merge from main to fix the CI

auto-merge was automatically disabled August 2, 2025 03:19

Head branch was pushed to by a user without write access

@mergify mergify bot added the documentation Improvements or additions to documentation label Aug 2, 2025
@Isotr0py Isotr0py enabled auto-merge (squash) August 2, 2025 03:32
@Isotr0py
Member

Isotr0py commented Aug 2, 2025

Seems the failing multimodal tests are related to the registry renaming: https://buildkite.com/vllm/ci/builds/25800#019868cb-5bbe-41fc-be3b-f86a7d8edfc9

Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
auto-merge was automatically disabled August 2, 2025 06:22

Head branch was pushed to by a user without write access

@mergify mergify bot added the tool-calling label Aug 2, 2025
@zRzRzRzRzRzRzR
Contributor Author

zRzRzRzRzRzRzR commented Aug 2, 2025

Have the GLM model-related tests now passed?

@DarkLight1337
Member

Yes, merging

@vllm-bot vllm-bot merged commit 25373b6 into vllm-project:main Aug 2, 2025
38 of 44 checks passed
npanpaliya pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Aug 6, 2025
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
wenbinc-Bin pushed a commit to wenbinc-Bin/vllm-fork that referenced this pull request Aug 7, 2025
Cherry-pick: vllm-project@25373b6

jinzhen-lin pushed a commit to jinzhen-lin/vllm that referenced this pull request Aug 9, 2025
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
noamgat pushed a commit to noamgat/vllm that referenced this pull request Aug 9, 2025
Signed-off-by: Noam Gat <noamgat@gmail.com>
paulpak58 pushed a commit to paulpak58/vllm that referenced this pull request Aug 13, 2025
Signed-off-by: Paul Pak <paulpak58@gmail.com>
wenbinc-Bin pushed a commit to wenbinc-Bin/vllm-fork that referenced this pull request Aug 14, 2025
Cherry-pick: vllm-project@25373b6

diegocastanibm pushed a commit to diegocastanibm/vllm that referenced this pull request Aug 15, 2025
Signed-off-by: Diego-Castan <diego.castan@ibm.com>
HeJunyan added a commit to HeJunyan/vllm-fork that referenced this pull request Aug 20, 2025
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025
HeJunyan added a commit to HeJunyan/vllm-fork that referenced this pull request Sep 22, 2025
HeJunyan added a commit to HeJunyan/vllm-fork that referenced this pull request Oct 30, 2025

Labels

documentation: Improvements or additions to documentation
new-model: Requests to new models
ready: ONLY add when PR is ready to merge/full CI is needed
tool-calling

Projects

Status: Done


4 participants