for glm-4.1V update #22000
Conversation
Code Review
This pull request aims to make the GLM-4.1V implementation more flexible by supporting different base language models. However, the current changes hardcode a specific language model architecture, which contradicts the stated goal. Additionally, there's an incorrect entry in the model registry that could cause issues. My review focuses on these critical issues and provides suggestions to address them, ensuring the implementation is robust and achieves its intended purpose.
| "GlmForCausalLM": ("glm", "GlmForCausalLM"), | ||
| "Glm4ForCausalLM": ("glm4", "Glm4ForCausalLM"), | ||
| "Glm4MoeForCausalLM": ("glm4_moe", "Glm4MoeForCausalLM"), | ||
| "Glm4v_moeForConditionalGeneration": ("glm4v_moe_text", "Glm4MoeForCausalLM"), |
This change incorrectly registers the multimodal architecture `Glm4v_moeForConditionalGeneration` as a text-generation model. The architecture is already correctly registered as a multimodal model in `_MULTIMODAL_MODELS`, so this entry is likely a mistake and could cause conflicts or unexpected behavior. The module `glm4v_moe_text` also does not appear to exist in the codebase. This line should be removed.
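A minimal sketch of the suggested fix, assuming the neighboring entries stay unchanged: simply drop the added line so the text-generation registry keeps only text backbones.

```python
# Sketch: Glm4v_moeForConditionalGeneration stays registered only in
# _MULTIMODAL_MODELS; the text-generation registry is left as it was.
"GlmForCausalLM": ("glm", "GlmForCausalLM"),
"Glm4ForCausalLM": ("glm4", "Glm4ForCausalLM"),
"Glm4MoeForCausalLM": ("glm4_moe", "Glm4MoeForCausalLM"),
```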
```diff
         self.language_model = init_vllm_registered_model(
             vllm_config=vllm_config,
-            prefix=maybe_prefix(prefix, ""),
-            architectures=["Glm4ForCausalLM"],
-            hf_config=self.config.get_text_config(),
-        )
+            hf_config=config.text_config,
+            prefix=maybe_prefix(prefix, "language_model"),
+            architectures=["Glm4MoeForCausalLM"])
```
This block introduces two main issues:

1. **Incorrect `hf_config` type**: The `hf_config` argument is set to `config.text_config`. If `config` is a `transformers.Glm4vConfig` object, `config.text_config` is a dictionary, but `init_vllm_registered_model` expects a `transformers.PretrainedConfig` object. This will likely cause a `TypeError`. The previous implementation correctly used `config.get_text_config()`, which returns a `PretrainedConfig` object.

2. **Hardcoded architecture**: The `architectures` parameter is hardcoded to `["Glm4MoeForCausalLM"]`. This contradicts the PR's goal of supporting different base models. The architecture should be determined dynamically from the model's configuration, for example from `config.text_config.architectures`.

To address these issues, you should ensure `hf_config` is a `PretrainedConfig` object and avoid hardcoding the architecture. This will make the implementation more robust and aligned with the PR's objective.
```python
text_config = (config.get_text_config()
               if hasattr(config, "get_text_config") else
               config.text_config)
# The `text_config` from a multi-modal model's config is often a
# dictionary. We need to convert it to a PretrainedConfig object.
if isinstance(text_config, dict):
    from transformers import AutoConfig
    text_config = AutoConfig.from_dict(text_config,
                                       trust_remote_code=True)
self.language_model = init_vllm_registered_model(
    vllm_config=vllm_config,
    hf_config=text_config,
    prefix=maybe_prefix(prefix, "language_model"))
```
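Two notes on this suggestion. First, transformers' `AutoConfig` has no `from_dict` classmethod (`PretrainedConfig.from_dict` is the one that exists), so the dict branch above would need that call swapped. Second, for the hardcoded-architecture point, a minimal sketch of deriving the backbone dynamically; the `getattr` fallback and the assumption that the text config exposes an `architectures` list are additions here, not part of the PR (and note the PR description below: that field may in fact be absent from the text config):

```python
# Sketch (assumption: text_config may expose an `architectures` list,
# as top-level HF configs do). Fall back to None and let vLLM's model
# registry resolve the backbone when the field is missing.
architectures = getattr(text_config, "architectures", None)
self.language_model = init_vllm_registered_model(
    vllm_config=vllm_config,
    hf_config=text_config,
    prefix=maybe_prefix(prefix, "language_model"),
    architectures=architectures,
)
```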
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger a full CI run by default; instead, they only run a small and essential subset of CI tests to quickly catch errors. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge. 🚀
Isotr0py left a comment
Some initial comments. Looking into the text backbone initialization issue.
Signed-off-by: Isotr0py <2037008807@qq.com>
BTW, you can enable automatic sign-off in PyCharm: https://docs.vllm.ai/en/latest/contributing/index.html?h=pycharm#dco-and-signed-off-by
| "Gemma3ForConditionalGeneration": ("gemma3_mm", "Gemma3ForConditionalGeneration"), # noqa: E501 | ||
| "GLM4VForCausalLM": ("glm4v", "GLM4VForCausalLM"), | ||
| "Glm4vForConditionalGeneration": ("glm4_1v", "Glm4vForConditionalGeneration"), # noqa: E501 | ||
| "Glm4v_moeForConditionalGeneration": ("glm4_1v", "Glm4vForConditionalGeneration"), # noqa: E501 |
Almost forgot: can you also update tests/models/registry.py and docs/models/supported_models.md for Glm4v_moeForConditionalGeneration?
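For reference, a sketch of what the corresponding tests/models/registry.py entry might look like, assuming the file's existing `_HfExamplesInfo` pattern; the checkpoint id is a placeholder, not taken from the repo:

```python
# Hypothetical entry for the multimodal section of
# tests/models/registry.py; the checkpoint id below is a placeholder.
"Glm4v_moeForConditionalGeneration": _HfExamplesInfo(
    "<org>/<glm-4v-moe-checkpoint>"),
```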
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Please also merge from main to fix the CI.
Head branch was pushed to by a user without write access
Seems the failing multimodal tests are related to the registry renaming: https://buildkite.com/vllm/ci/builds/25800#019868cb-5bbe-41fc-be3b-f86a7d8edfc9
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Head branch was pushed to by a user without write access
Have the GLM model-related tests now passed?
Yes, merging
Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com>
Cherry-pick: vllm-project@25373b6 Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com> Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com> Signed-off-by: Noam Gat <noamgat@gmail.com>
Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com> Signed-off-by: Paul Pak <paulpak58@gmail.com>
Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com> Signed-off-by: Diego-Castan <diego.castan@ibm.com>
This PR aims to upgrade the GLM-4.1V implementation to be compatible with GLM-V variants that use different base language models but share the same ViT.
The current PR is incorrect, as discussed on Slack, because in the transformers library the `architectures` field does not appear in the text config.
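A quick way to reproduce that observation (a sketch; the checkpoint id is only an example and downloading the config is assumed to work):

```python
# Sketch: in transformers, `architectures` is recorded on the top-level
# config, not on the nested text config, so deriving the backbone from
# the text config comes up empty.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("zai-org/GLM-4.1V-9B-Thinking")  # example id
print(getattr(config, "architectures", None))
# expected: ['Glm4vForConditionalGeneration']
print(getattr(config.get_text_config(), "architectures", None))
# expected: None, since the text config carries no `architectures` field
```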