RuntimeError: Error(s) in loading state_dict for InternViTModel/GPTVLModel #8
Comments
Hi @Vintage-Echo
The vision model parts are the same in the Megatron and Modellink training frameworks.
Yes, I have used this script before, but the problem still occurs. Is the environment configuration incorrect? When I used the docker
Where should I start to locate the problem? Looking forward to your reply, thanks a lot.
I have verified this script and it works well.
Yes, the problem might be caused by the loading of Qwen2.5-14B-Instruct_tp8pp1_te. The log file is as follows:
I have temporarily bypassed this problem by setting it to null, but I still do not know why the error is reported.
It seems the model is not built with Transformer Engine, but the weights were converted for Transformer Engine.
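One way to check for this kind of mismatch is to compare the parameter names the model expects with the names stored in the checkpoint. The following is a minimal sketch in plain Python, not code from this repository: the helper name `diff_state_dict_keys` and the example key names are hypothetical and only illustrate the idea.

```python
from typing import Dict, Iterable, List


def diff_state_dict_keys(
    model_keys: Iterable[str], ckpt_keys: Iterable[str]
) -> Dict[str, List[str]]:
    """Report keys the model expects but the checkpoint lacks, and vice versa."""
    model_set, ckpt_set = set(model_keys), set(ckpt_keys)
    return {
        "missing": sorted(model_set - ckpt_set),      # expected by the model, absent from the checkpoint
        "unexpected": sorted(ckpt_set - model_set),   # present in the checkpoint, unknown to the model
    }


# Hypothetical key names, chosen only to illustrate a layout mismatch.
model_keys = [
    "decoder.layers.0.input_layernorm.weight",
    "decoder.layers.0.self_attention.linear_qkv.weight",
]
ckpt_keys = [
    "decoder.layers.0.self_attention.linear_qkv.layer_norm_weight",
    "decoder.layers.0.self_attention.linear_qkv.weight",
]

report = diff_state_dict_keys(model_keys, ckpt_keys)
for kind, keys in report.items():
    print(kind, keys)
```

With a real PyTorch model, the first argument would be `model.state_dict().keys()` and the second the keys of the loaded checkpoint dictionary. If the two lists differ systematically in their naming, the checkpoint was probably converted for a different layer implementation than the one the model was built with.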
Great job!
When I try to perform ViT weight conversion on the GPU-Megatron framework, I do not find the corresponding script.
> bash scripts/megatron/convert_model_intern_vit.sh
There seem to be some problems when I use other scripts. Could you provide the verified script for InternViT conversion? Thanks!
> RuntimeError: Error(s) in loading state_dict for InternViTModel