MD5 checksum of the test-set result file
Link: https://pan.baidu.com/s/1uFOFjj5QdCvaCw-8MYTFhA  Extraction code: jpmn (shared via Baidu Netdisk)
```
(base) ➜ visualbert git:(kk) ✗ md5sum logs/vqa_finetune_0831/result.json
22e0a4ce40f0ec92d8fa730b697101cd logs/vqa_finetune_0831/result.json
```
Download the Submitted file from the competition website and check its MD5 in the same way:
```
(base) ➜ visualbert git:(kk) ✗ wget https://evalai.s3.amazonaws.com/media/submission_files/submission_156749/2ae1c2c6-7f72-4dd6-994a-b0adf5867983.json
--2021-09-13 16:01:56--  https://evalai.s3.amazonaws.com/media/submission_files/submission_156749/2ae1c2c6-7f72-4dd6-994a-b0adf5867983.json
Resolving evalai.s3.amazonaws.com... 52.217.14.60
Connecting to evalai.s3.amazonaws.com|52.217.14.60|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 20517922 (20M) [application/json]
Saving to: ‘2ae1c2c6-7f72-4dd6-994a-b0adf5867983.json’
2ae1c2c6-7f72-4dd6-994a-b0adf5867983.json 100%[======================================================================================================>] 19.57M 1.70MB/s in 17s
2021-09-13 16:02:30 (1.15 MB/s) - ‘2ae1c2c6-7f72-4dd6-994a-b0adf5867983.json’ saved [20517922/20517922]
(base) ➜ visualbert git:(kk) ✗ md5sum 2ae1c2c6-7f72-4dd6-994a-b0adf5867983.json
22e0a4ce40f0ec92d8fa730b697101cd 2ae1c2c6-7f72-4dd6-994a-b0adf5867983.json
```
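The digests match, so the submitted file is byte-identical to the local result. The same check can be scripted instead of run by hand; below is a minimal Python sketch, assuming the local path and the EvalAI download URL from the transcript above, that computes both MD5 digests and asserts they agree:

```python
import hashlib
import urllib.request

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Paths/URL taken from the transcript above; adjust to your own run.
local_result = "logs/vqa_finetune_0831/result.json"
submitted_url = (
    "https://evalai.s3.amazonaws.com/media/submission_files/"
    "submission_156749/2ae1c2c6-7f72-4dd6-994a-b0adf5867983.json"
)

# Download the submitted copy to a temporary file and compare digests.
downloaded_copy, _ = urllib.request.urlretrieve(submitted_url)
local_md5 = md5_of_file(local_result)
remote_md5 = md5_of_file(downloaded_copy)
print(local_md5, remote_md5)
assert local_md5 == remote_md5, "submitted file differs from the local result"
```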
This repository contains code for the following two papers:
- VisualBERT: A Simple and Performant Baseline for Vision and Language (arXiv), with a short version titled What Does BERT with Vision Look At? published at ACL 2020.
  Under the folder `visualbert` is the code for the original VisualBERT, where we pre-train a Transformer for vision-and-language (V&L) tasks on image-caption data.
- Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions, published at NAACL 2021.
  Under the folder `unsupervised_visualbert` is the code for Unsupervised VisualBERT, where we pre-train a V&L Transformer without aligned image-caption pairs. Instead, we pre-train using only unaligned images and text, and achieve performance competitive with many models supervised with aligned data.
The VisualBERT model has also been integrated into several libraries, such as Hugging Face Transformers (many thanks to Gunjan Chhablani, who made it work) and Facebook MMF.
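For reference, loading the ported model through Hugging Face Transformers looks roughly like the sketch below. The checkpoint name `uclanlp/visualbert-vqa-coco-pre` and the 2048-dimensional region features are assumptions based on the released VQA checkpoints; in practice the region features would come from an object detector (e.g. Faster R-CNN), not random tensors:

```python
import torch
from transformers import BertTokenizer, VisualBertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = VisualBertModel.from_pretrained("uclanlp/visualbert-vqa-coco-pre")

# Text side: a single question, tokenized as usual.
inputs = tokenizer("What is the man eating?", return_tensors="pt")

# Visual side: VisualBERT expects pre-extracted region features.
# Random tensors are used here only to show the expected shapes.
visual_embeds = torch.randn(1, 36, 2048)  # (batch, num_regions, feature_dim)
visual_token_type_ids = torch.ones(visual_embeds.shape[:-1], dtype=torch.long)
visual_attention_mask = torch.ones(visual_embeds.shape[:-1], dtype=torch.float)

outputs = model(
    **inputs,
    visual_embeds=visual_embeds,
    visual_token_type_ids=visual_token_type_ids,
    visual_attention_mask=visual_attention_mask,
)
print(outputs.last_hidden_state.shape)  # (1, text_len + 36, hidden_size)
```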
Thanks~