qizhust/cmcl_vqa

This repository is a PyTorch implementation that supports the METER, ViLT, and ALBEF backbones.

Data preparation and pretrained models

Follow the data preparation instructions of METER and ViLT to prepare the datasets, and download the pretrained checkpoints released for the corresponding backbones.

Finetune on VQA data

Train on a single node with multiple GPUs. Set --nproc_per_node to the number of GPUs to use on the node; the example below uses 1.

python -m torch.distributed.launch --nproc_per_node=1 main.py \
    --vision_encoder clip16 --kwargs \
    num_workers=4 \
    data_root=path/to/datasets/in/arrows/ \
    per_gpu_batchsize=8 \
    exp_name=finetune_vqa \
    resume_from=result/official_released/meter_clip16_288_roberta_pretrain.ckpt \
    load_pretrained_meter=1 \
    draw_false_image_vqa=1 \
    model_type=cmcl \
    cl_type=nce_graph \
    nce_lambda=0.5 \
    test_only=0 \
    dataset_name=vqacp \
    max_epochs=10
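
Since --nproc_per_node controls how many processes torch.distributed.launch starts on the node, the launch prefix can be parameterized by GPU count. The following is a minimal sketch, not part of the repository: the function name launch_prefix and its argument are hypothetical, and the function only prints the prefix so that the --vision_encoder and --kwargs arguments from the command above can be appended unchanged.

```shell
#!/usr/bin/env bash
# Sketch only: launch_prefix is a hypothetical helper, not defined by the repo.
# It prints the single-node launch prefix for a chosen number of GPUs.
launch_prefix() {
    local num_gpus="${1:-1}"   # defaults to 1, as in the README example
    echo "python -m torch.distributed.launch --nproc_per_node=${num_gpus} main.py"
}

# Example: prefix for a node with 4 GPUs (4 is an assumed value).
launch_prefix 4
```

Printing the prefix rather than running it keeps the sketch free of side effects; the remaining arguments stay exactly as listed in the command above.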

Train on multiple nodes with multiple GPUs. The command below is shown for the machine with rank 0.

python main_dist.py --dist-url tcp://$server_ip:$port_id --world-size $num_machines --rank 0 --multiprocessing-distributed \
    --vision_encoder clip16 --kwargs \
    num_workers=4 \
    data_root=path/to/datasets/in/arrows/ \
    per_gpu_batchsize=8 \
    exp_name=finetune_vqa \
    resume_from=result/official_released/meter_clip16_288_roberta_pretrain.ckpt \
    load_pretrained_meter=1 \
    draw_false_image_vqa=1 \
    model_type=cmcl \
    cl_type=nce_graph \
    nce_lambda=0.5 \
    test_only=0 \
    dataset_name=vqacp \
    max_epochs=10
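
The multi-node command above hard-codes --rank 0, so each machine needs its own variant. The sketch below prints the launch prefix for one machine. It assumes that --rank is the index of the machine (0 to num_machines - 1), which matches the usual convention for scripts that take --world-size as a machine count together with --multiprocessing-distributed; check main_dist.py to confirm. The function name dist_prefix and the sample address and port are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: dist_prefix is a hypothetical helper, not defined by the repo.
# Assumption: --rank is the machine index; verify against main_dist.py.
dist_prefix() {
    local server_ip="$1" port_id="$2" num_machines="$3" node_rank="$4"
    echo "python main_dist.py --dist-url tcp://${server_ip}:${port_id} --world-size ${num_machines} --rank ${node_rank} --multiprocessing-distributed"
}

# Example: second machine (rank 1) of a two-machine job.
# 10.0.0.1 and 23456 are placeholder values for $server_ip and $port_id.
dist_prefix 10.0.0.1 23456 2 1
```

The --vision_encoder and --kwargs arguments from the command above are appended unchanged on every machine; only the rank differs under this assumption.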
