
Garbled output at inference after full-parameter DPO training with the vision tower frozen: how should I run inference on the DPO-trained model? #5965

Open
Sisi0518 opened this issue Nov 8, 2024 · 0 comments

Sisi0518 commented Nov 8, 2024

Hi author, after DPO training with the vision tower frozen and all remaining parameters trained, I load the model weights directly:
from transformers import Qwen2VLForConditionalGeneration

model_org = Qwen2VLForConditionalGeneration.from_pretrained(
    model_path,
    torch_dtype="auto",
    attn_implementation="flash_attention_2",
    device_map="auto",
)

Predicting with this model directly (roughly as in the sketch below) produces garbled output.
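For reference, the "direct prediction" above looks roughly like this (a minimal sketch; the image, prompt, and generation settings are placeholders, my real inputs come from my evaluation data):

from PIL import Image
from transformers import AutoProcessor

# Processor is loaded from the same checkpoint directory as the model
processor = AutoProcessor.from_pretrained(model_path)

image = Image.open("example.jpg")  # placeholder image
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},  # placeholder prompt
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], padding=True, return_tensors="pt").to(model_org.device)

output_ids = model_org.generate(**inputs, max_new_tokens=256)
generated = output_ids[:, inputs.input_ids.shape[1]:]
print(processor.batch_decode(generated, skip_special_tokens=True)[0])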
My DPO training arguments are:
--deepspeed examples/deepspeed/ds_z0_config.json
--stage dpo
--do_train
--model_name_or_path 1030_mapgpt_mesh996_patchexpand_refer_norefer_aug_lr2e-5/checkpoint-14000
--dataset mapgpt_refer_dpo
--output_dir workdir/1107_simpo
--learning_rate 1e-5
--template qwen2_vl
--finetuning_type full
--freeze_vision_tower true
--pref_beta 0.1
--pref_loss simpo
--overwrite_cache
--overwrite_output_dir
--warmup_steps 100
--weight_decay 0.1
--preprocessing_num_workers 32
--per_device_train_batch_size 1
--gradient_accumulation_steps 4
--ddp_timeout 900000000
--lr_scheduler_type cosine
--logging_steps 1
--cutoff_len 14000
--save_steps 1000
--save_total_limit 100
--plot_loss
--num_train_epochs 10
--bf16

How should I load the DPO-trained weights for inference?

@github-actions github-actions bot added the pending This problem is yet to be addressed label Nov 8, 2024