Error when evaluating Qwen2-VL after full fine-tuning #6582

Open
gsfsdv opened this issue Jan 9, 2025 · 0 comments
Labels
pending This problem is yet to be addressed

Comments


gsfsdv commented Jan 9, 2025

Description

Hello, after full-parameter fine-tuning of Qwen2-VL, I ran evaluation with llamafactory-cli train examples/extras/nlg_eval/qwen.yaml, using the following config:

model_name_or_path: /mnt/dolphinfs/checkpoint-846

### method
do_predict: true
stage: sft
finetuning_type: full

### dataset
eval_dataset: text2textsftEval
dataset_dir: /mnt/dolphinfs/results
template: qwen2_vl
cutoff_len: 8192
max_samples: 2
max_new_tokens: 8192
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/qwen/
overwrite_output_dir: true

### eval
per_device_eval_batch_size: 1
predict_with_generate: true
ddp_timeout: 180000000

But it fails with the following error:
File "/home/hadoop-aipnlp/.local/lib/python3.9/site-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 582, in forward
attn_weights = attn_weights + causal_mask
RuntimeError: The size of tensor a (792) must match the size of tensor b (397) at non-singleton dimension 3
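For readers unfamiliar with this message: it is a broadcasting failure in the addition on the line above it, where tensor a is the left operand (attn_weights) and tensor b is the right operand (causal_mask). The sketch below is a pure-Python illustration of the right-aligned broadcasting rule, not code from transformers or LLaMA-Factory. The helper name first_broadcast_mismatch and the 4-D shapes are hypothetical; only the sizes 792 and 397 at dimension 3 come from the traceback, and the head count and query length are placeholders. It shows how a shape check reproduces the reported dimension, and it does not diagnose why the mask ended up shorter.

```python
def first_broadcast_mismatch(shape_a, shape_b):
    """Return (dim, size_a, size_b) for the first incompatible dimension, or None.

    Shapes are aligned from the right. Two sizes are compatible when they
    are equal or when either of them is 1 (a "singleton" dimension).
    """
    ndim = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so both have the same rank.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    # Check from the trailing dimension backwards.
    for dim in range(ndim - 1, -1, -1):
        if a[dim] != b[dim] and 1 not in (a[dim], b[dim]):
            return dim, a[dim], b[dim]
    return None


# Hypothetical shapes laid out as (batch, heads, query_len, key_len).
# Only 792 vs. 397 in the last dimension is taken from the traceback.
attn_weights_shape = (1, 28, 792, 792)
causal_mask_shape = (1, 1, 792, 397)

mismatch = first_broadcast_mismatch(attn_weights_shape, causal_mask_shape)
if mismatch is not None:
    dim, size_a, size_b = mismatch
    print(
        f"The size of tensor a ({size_a}) must match the size of "
        f"tensor b ({size_b}) at non-singleton dimension {dim}"
    )
```

Under these assumed shapes the check flags dimension 3, the key/value length axis: the attention scores cover 792 positions while the mask covers only 397. The singleton head dimension of the mask broadcasts without complaint, which is why only the last axis is reported.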

Pull Request

No response

gsfsdv added the enhancement (New feature or request) label on Jan 9, 2025
github-actions bot added the pending (This problem is yet to be addressed) label on Jan 9, 2025
hiyouga removed the enhancement (New feature or request) label on Jan 9, 2025