Hi author, I ran DPO training with the vision tower frozen and all remaining parameters trained in full. Afterwards, I load the model weights directly:
```python
from transformers import Qwen2VLForConditionalGeneration

# model_path points at the DPO output checkpoint directory
model_org = Qwen2VLForConditionalGeneration.from_pretrained(
    model_path,
    torch_dtype="auto",
    attn_implementation="flash_attention_2",
    device_map="auto",
)
```
But running inference on this model directly produces garbled output.
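For reference, here is a minimal sketch of how I run the generation that comes out garbled (assuming the standard Qwen2-VL `AutoProcessor` API; the image file and prompt text are placeholders):

```python
from PIL import Image
from transformers import AutoProcessor

# Processor/tokenizer loaded from the same checkpoint directory as the weights
processor = AutoProcessor.from_pretrained(model_path)

# Placeholder conversation: "example.jpg" and the prompt are illustrative only
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image."},
    ],
}]
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

inputs = processor(
    text=[prompt],
    images=[Image.open("example.jpg")],
    return_tensors="pt",
).to(model_org.device)

output_ids = model_org.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens
print(processor.batch_decode(
    output_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
)[0])
```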
My DPO training arguments are:
```
--deepspeed examples/deepspeed/ds_z0_config.json
--stage dpo
--do_train
--model_name_or_path 1030_mapgpt_mesh996_patchexpand_refer_norefer_aug_lr2e-5/checkpoint-14000
--dataset mapgpt_refer_dpo
--output_dir workdir/1107_simpo
--learning_rate 1e-5
--template qwen2_vl
--finetuning_type full
--freeze_vision_tower true
--pref_beta 0.1
--pref_loss simpo
--overwrite_cache
--overwrite_output_dir
--warmup_steps 100
--weight_decay 0.1
--preprocessing_num_workers 32
--per_device_train_batch_size 1
--gradient_accumulation_steps 4
--ddp_timeout 900000000
--lr_scheduler_type cosine
--logging_steps 1
--cutoff_len 14000
--save_steps 1000
--save_total_limit 100
--plot_loss
--num_train_epochs 10
--bf16
```
How should I load the DPO-trained weights so that the model generates correctly?
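For what it's worth, this is the loading variant I would have expected to work, a sketch assuming the problem is a tokenizer/processor mismatch rather than the weights themselves; the checkpoint step number is hypothetical, and the paths come from the arguments above:

```python
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

# Weights from the DPO run (hypothetical step number; save_steps above is 1000)
dpo_ckpt = "workdir/1107_simpo/checkpoint-1000"
model = Qwen2VLForConditionalGeneration.from_pretrained(
    dpo_ckpt,
    torch_dtype="auto",
    attn_implementation="flash_attention_2",
    device_map="auto",
)

# Processor/tokenizer taken from the pre-DPO model, in case the DPO
# checkpoint directory is missing (or saved mismatched) processor files
processor = AutoProcessor.from_pretrained(
    "1030_mapgpt_mesh996_patchexpand_refer_norefer_aug_lr2e-5/checkpoint-14000"
)
```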