Evaluation fails after full fine-tuning of qwen2vl #6582

gsfsdv · 2025-01-09T12:59:32Z

Description

Hello, after full fine-tuning of qwen2vl, I ran evaluation with llamafactory-cli train examples/extras/nlg_eval/qwen.yaml, using the following config:

model_name_or_path: /mnt/dolphinfs/checkpoint-846

### method
do_predict: true
stage: sft
finetuning_type: full

### dataset
eval_dataset: text2textsftEval
dataset_dir: /mnt/dolphinfs/results
template: qwen2_vl
cutoff_len: 8192
max_samples: 2
max_new_tokens: 8192
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/qwen/
overwrite_output_dir: true

### eval
per_device_eval_batch_size: 1
predict_with_generate: true
ddp_timeout: 180000000

But it fails with the following error:
File "/home/hadoop-aipnlp/.local/lib/python3.9/site-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 582, in forward
attn_weights = attn_weights + causal_mask
RuntimeError: The size of tensor a (792) must match the size of tensor b (397) at non-singleton dimension 3
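For readers unfamiliar with this message, here is a minimal sketch of the broadcasting check behind it, written in plain Python so it runs without PyTorch. It is not the transformers code. The helper `broadcast_shape` and all shape values other than 792 and 397 (batch size 1, 28 heads, query length 792) are assumptions chosen for illustration. It only shows that an addition fails when the last dimensions differ and neither is 1, which is what happens when the attention weights cover 792 key positions while the causal mask covers 397.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of two tensor shapes, or raise a
    RuntimeError worded like the one PyTorch emits on a mismatch."""
    ndim = max(len(a), len(b))
    # Left-pad the shorter shape with 1s so both have the same rank.
    a = (1,) * (ndim - len(a)) + tuple(a)
    b = (1,) * (ndim - len(b)) + tuple(b)
    out = [0] * ndim
    # Compare dimensions from the trailing one backwards.
    for dim in range(ndim - 1, -1, -1):
        x, y = a[dim], b[dim]
        if x == y or y == 1:
            out[dim] = x
        elif x == 1:
            out[dim] = y
        else:
            raise RuntimeError(
                f"The size of tensor a ({x}) must match the size of "
                f"tensor b ({y}) at non-singleton dimension {dim}"
            )
    return tuple(out)


# Hypothetical shapes laid out as (batch, heads, query_len, key_len).
# Only the 792 and 397 in the last dimension come from the traceback.
attn_weights_shape = (1, 28, 792, 792)
causal_mask_shape = (1, 1, 792, 397)

try:
    broadcast_shape(attn_weights_shape, causal_mask_shape)
except RuntimeError as err:
    print(err)
```

If the mask's last dimension were also 792, the same call would succeed and return the shape of the attention weights.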

Pull Request

No response

gsfsdv added the enhancement New feature or request label Jan 9, 2025

github-actions bot added the pending This problem is yet to be addressed label Jan 9, 2025

hiyouga removed the enhancement New feature or request label Jan 9, 2025
