for glm-4.1V update #22000
Conversation
Code Review
This pull request aims to make the GLM-4.1V implementation more flexible by supporting different base language models. However, the current changes hardcode a specific language model architecture, which contradicts the stated goal. Additionally, there's an incorrect entry in the model registry that could cause issues. My review focuses on these critical issues and provides suggestions to address them, ensuring the implementation is robust and achieves its intended purpose.
| "GlmForCausalLM": ("glm", "GlmForCausalLM"), | ||
| "Glm4ForCausalLM": ("glm4", "Glm4ForCausalLM"), | ||
| "Glm4MoeForCausalLM": ("glm4_moe", "Glm4MoeForCausalLM"), | ||
| "Glm4v_moeForConditionalGeneration": ("glm4v_moe_text", "Glm4MoeForCausalLM"), |
This change incorrectly registers the multimodal architecture `Glm4v_moeForConditionalGeneration` as a text-generation model. The architecture is already correctly registered as a multimodal model in `_MULTIMODAL_MODELS`, so this entry is likely a mistake and could cause conflicts or unexpected behavior. The module `glm4v_moe_text` also does not appear to exist in the codebase. This line should be removed.
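A minimal sketch of the suggested fix, assuming the neighboring entries stay unchanged: simply drop the added line so the text-generation registry keeps only text backbones.

```python
# Sketch: Glm4v_moeForConditionalGeneration stays registered only in
# _MULTIMODAL_MODELS; the text-generation registry is left as it was.
"GlmForCausalLM": ("glm", "GlmForCausalLM"),
"Glm4ForCausalLM": ("glm4", "Glm4ForCausalLM"),
"Glm4MoeForCausalLM": ("glm4_moe", "Glm4MoeForCausalLM"),
```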
```diff
         self.language_model = init_vllm_registered_model(
             vllm_config=vllm_config,
-            prefix=maybe_prefix(prefix, ""),
-            architectures=["Glm4ForCausalLM"],
-            hf_config=self.config.get_text_config(),
-        )
+            hf_config=config.text_config,
+            prefix=maybe_prefix(prefix, "language_model"),
+            architectures=["Glm4MoeForCausalLM"])
```
This block introduces two main issues:

1. **Incorrect `hf_config` type**: The `hf_config` argument is set to `config.text_config`. If `config` is a `transformers.Glm4vConfig` object, `config.text_config` is a dictionary, but `init_vllm_registered_model` expects a `transformers.PretrainedConfig` object. This will likely cause a `TypeError`. The previous implementation correctly used `config.get_text_config()`, which returns a `PretrainedConfig` object.

2. **Hardcoded architecture**: The `architectures` parameter is hardcoded to `["Glm4MoeForCausalLM"]`. This contradicts the PR's goal of supporting different base models. The architecture should be determined dynamically from the model's configuration, for example from `config.text_config.architectures`.

To address these issues, you should ensure `hf_config` is a `PretrainedConfig` object and avoid hardcoding the architecture. This will make the implementation more robust and aligned with the PR's objective.
```python
text_config = (config.get_text_config()
               if hasattr(config, "get_text_config") else
               config.text_config)
# The `text_config` from a multi-modal model's config is often a
# dictionary. We need to convert it to a PretrainedConfig object.
if isinstance(text_config, dict):
    from transformers import AutoConfig
    text_config = AutoConfig.from_dict(text_config,
                                       trust_remote_code=True)
self.language_model = init_vllm_registered_model(
    vllm_config=vllm_config,
    hf_config=text_config,
    prefix=maybe_prefix(prefix, "language_model"))
```
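Two notes on this suggestion. First, transformers' `AutoConfig` has no `from_dict` classmethod (`PretrainedConfig.from_dict` is the one that exists), so the dict branch above would need that call swapped. Second, for the hardcoded-architecture point, a minimal sketch of deriving the backbone dynamically; the `getattr` fallback and the assumption that the text config exposes an `architectures` list are additions here, not part of the PR (and note the PR description below: that field may in fact be absent from the text config):

```python
# Sketch (assumption: text_config may expose an `architectures` list,
# as top-level HF configs do). Fall back to None and let vLLM's model
# registry resolve the backbone when the field is missing.
architectures = getattr(text_config, "architectures", None)
self.language_model = init_vllm_registered_model(
    vllm_config=vllm_config,
    hf_config=text_config,
    prefix=maybe_prefix(prefix, "language_model"),
    architectures=architectures,
)
```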
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger a full CI run by default; instead, they only run a small and essential subset of CI tests to quickly catch errors. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge. 🚀
Isotr0py left a comment
Some initial comments. Looking into the text backbone initialization issue.
Signed-off-by: Isotr0py <2037008807@qq.com>
BTW, you can enable automatic sign-off in PyCharm: https://docs.vllm.ai/en/latest/contributing/index.html?h=pycharm#dco-and-signed-off-by
| "Gemma3ForConditionalGeneration": ("gemma3_mm", "Gemma3ForConditionalGeneration"), # noqa: E501 | ||
| "GLM4VForCausalLM": ("glm4v", "GLM4VForCausalLM"), | ||
| "Glm4vForConditionalGeneration": ("glm4_1v", "Glm4vForConditionalGeneration"), # noqa: E501 | ||
| "Glm4v_moeForConditionalGeneration": ("glm4_1v", "Glm4vForConditionalGeneration"), # noqa: E501 |
Almost forgot: can you also update tests/models/registry.py and docs/models/supported_models.md for Glm4v_moeForConditionalGeneration?
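For reference, a sketch of what the corresponding tests/models/registry.py entry might look like, assuming the file's existing `_HfExamplesInfo` pattern; the checkpoint id is a placeholder, not taken from the repo:

```python
# Hypothetical entry for the multimodal section of
# tests/models/registry.py; the checkpoint id below is a placeholder.
"Glm4v_moeForConditionalGeneration": _HfExamplesInfo(
    "<org>/<glm-4v-moe-checkpoint>"),
```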
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Please also merge from main to fix the CI.
Head branch was pushed to by a user without write access
Seems the failing multimodal tests are related to the registry renaming: https://buildkite.com/vllm/ci/builds/25800#019868cb-5bbe-41fc-be3b-f86a7d8edfc9
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Head branch was pushed to by a user without write access
Have the GLM model-related tests now passed?
Yes, merging
Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com>
Cherry-pick: vllm-project@25373b6 Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com> Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com> Signed-off-by: Noam Gat <noamgat@gmail.com>
Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com> Signed-off-by: Paul Pak <paulpak58@gmail.com>
Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com> Signed-off-by: Diego-Castan <diego.castan@ibm.com>
This PR aims to upgrade the GLM-4.1V implementation to be compatible with GLM-V variants that use different base language models but share the same ViT.
The current PR is incorrect, as discussed on Slack, because in the transformers library the `architectures` field does not appear in the text config.
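A quick way to reproduce that observation (a sketch; the checkpoint id is only an example and downloading the config is assumed to work):

```python
# Sketch: in transformers, `architectures` is recorded on the top-level
# config, not on the nested text config, so deriving the backbone from
# the text config comes up empty.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("zai-org/GLM-4.1V-9B-Thinking")  # example id
print(getattr(config, "architectures", None))
# expected: ['Glm4vForConditionalGeneration']
print(getattr(config.get_text_config(), "architectures", None))
# expected: None, since the text config carries no `architectures` field
```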