[LNL][Cogagent] RuntimeError: The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 0 #12646

Open
juan-OY opened this issue Jan 3, 2025 · 3 comments
juan-OY commented Jan 3, 2025

Model: https://huggingface.co/THUDM/cogagent-9b-20241220
The CogAgent-9B-20241220 model is based on GLM-4V-9B, but I fail to run it.

Setup guide follows: https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HuggingFace/Multimodal/glm-4v
env:
ipex-llm 2.2.0b20250102
transformers tried both 4.42.4 & 4.47.1

Failure as below:
```
Traceback (most recent call last):
  File "D:\cogagent\generate.py", line 75, in <module>
    inputs = tokenizer.apply_chat_template([{"role": "user", "image": image, "content": query}],
  File "C:\Users\test\.cache\huggingface\modules\transformers_modules\cogagent-9b-20241220\tokenization_chatglm.py", line 232, in apply_chat_template
    result = handle_single_conversation(conversation)
  File "C:\Users\test\.cache\huggingface\modules\transformers_modules\cogagent-9b-20241220\tokenization_chatglm.py", line 200, in handle_single_conversation
    input_image = transform(item["image"])
  File "C:\Users\test\miniforge3\envs\cogagent\Lib\site-packages\torchvision\transforms\transforms.py", line 95, in __call__
    img = t(img)
  File "C:\Users\test\miniforge3\envs\cogagent\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\test\miniforge3\envs\cogagent\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\test\miniforge3\envs\cogagent\Lib\site-packages\torchvision\transforms\transforms.py", line 277, in forward
    return F.normalize(tensor, self.mean, self.std, self.inplace)
  File "C:\Users\test\miniforge3\envs\cogagent\Lib\site-packages\torchvision\transforms\functional.py", line 350, in normalize
    return F_t.normalize(tensor, mean=mean, std=std, inplace=inplace)
  File "C:\Users\test\miniforge3\envs\cogagent\Lib\site-packages\torchvision\transforms\_functional_tensor.py", line 926, in normalize
    return tensor.sub_(mean).div_(std)
RuntimeError: The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 0
```

@qiuxin2012
Contributor

CogAgent's prompt concatenation has strict requirements, and our glm-4v-9b example does not meet them. Have you changed generate.py to follow CogAgent's requirements?
You can also refer to their example: https://github.com/THUDM/CogAgent/blob/main/inference/cli_demo.py

@juan-OY
Author

juan-OY commented Jan 6, 2025

It is not a problem with the prompt format; it also fails when running https://github.com/THUDM/CogAgent/blob/main/inference/cli_demo.py or web_demo.py. It fails in the vision part.

@juan-OY
Author

juan-OY commented Jan 6, 2025

It reports this error (web_demo.py):

```
  File "C:\Users\test\.cache\huggingface\modules\transformers_modules\cogagent-9b-20241220\visual.py", line 193, in forward
    x = x.view(b, grid_size, grid_size, h).permute(0, 3, 1, 2)
RuntimeError: shape '[1, 80, 80, 1792]' is invalid for input of size 11470592
```
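A back-of-the-envelope check of the numbers in this message (an assumption, not taken from the repo code) suggests the flattened vision sequence carries exactly one token more than the expected 80×80 grid, which would be consistent with a leading special token (e.g. a CLS token) not being stripped before the reshape:

```python
numel = 11470592   # total element count from the error message
h = 1792           # last dimension of the attempted view [1, 80, 80, 1792]
grid = 80          # expected grid_size

tokens = numel // h
print(tokens, grid * grid)   # 6401 6400 -> one extra token in the sequence
```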
