
[Feature Request] Evaluation tools of the Few-shot VQA/Caption #5

Li-Qingyun opened this issue Mar 6, 2024 · 6 comments

Li-Qingyun commented Mar 6, 2024

Hi, I'm interested in your great work.

The ./scripts/v1_5/eval/eval_all.sh script is not available at the moment. Could you release the evaluation tools, especially those for few-shot VQA/Caption?

It would also be great if the mmc4 pretrained weights could be made available.

The dataset_mixture for new_vflan_sharegpt4v_sft is also not available.

Thank you very much!

Li-Qingyun (Author) commented:

Thanks for the authors' support.
I found that ./scripts/v1_5/eval/eval_all.sh is now available.

The evaluation tools for few-shot VQA/Caption are also essential for researchers following this work. Looking forward to the release of this part.

Thank you very much!

Lyken17 (Collaborator) commented Mar 7, 2024

Hi Qingyun,

Which evaluation scripts are you looking for, for VQA and captioning? The current eval_all.sh should cover all the metrics in the paper.

Li-Qingyun (Author) commented Mar 7, 2024

> Hi Qingyun,
>
> Which evaluation scripts are you looking for, for VQA and captioning? The current eval_all.sh should cover all the metrics in the paper.

@Lyken17

Thanks for your reply! I'm looking for the few-shot OKVQA/TextVQA/CocoCaption/FlickrCaption evaluations used in the ablation study of Tables 1/3. 🙏🙏
Best regards.

Li-Qingyun (Author) commented:

@Lyken17 I'm writing to request the evaluation tools for few-shot VQA/Caption (specifically, the 4-shot OKVQA/TextVQA/CocoCaption/FlickrCaption evaluations in the ablation study of VILA Tables 1/3).

The experimental results in the paper show that, when used for pre-training LLaVA-like MLLMs, interleaved image-text data (MMC4) achieves better few-shot VQA/Caption results than image-text pair data (COYO/LAION, ...). I tried to evaluate the few-shot VQA scores of the open-source VILA-7B weights, but I did not reach the same conclusion.

OKVQA:   0-shot 61.05 | 1-shot 56.93 | 2-shot 56.84 | 4-shot 56.47
TextVQA: 0-shot 62.64 | 1-shot 60.73 | 2-shot 60.45 | 4-shot 60.88
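
For reference, my setup follows the usual Flamingo-style in-context format, i.e. prepending n (image, question, answer) exemplars before the query. The sketch below is only illustrative: the placeholder token, answer template, and exemplar sampling are my own assumptions and may differ from the protocol used in the paper.

```python
import random

def build_few_shot_vqa_prompt(query_image, query_question, support_set, n_shot=4, seed=0):
    """Build a Flamingo-style few-shot VQA prompt.

    support_set: list of dicts with keys "image", "question", "answer".
    "<image>" is assumed to be the per-image placeholder the model expects;
    the actual placeholder and template in VILA may differ.
    """
    rng = random.Random(seed)
    shots = rng.sample(support_set, n_shot) if n_shot > 0 else []

    images, segments = [], []
    for shot in shots:
        images.append(shot["image"])
        segments.append(
            f"<image>\nQuestion: {shot['question']}\nShort answer: {shot['answer']}"
        )

    # The query example comes last, with the answer left blank for generation.
    images.append(query_image)
    segments.append(f"<image>\nQuestion: {query_question}\nShort answer:")

    return "\n\n".join(segments), images
```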

I realize that my implementation may not be valid for measuring the few-shot performance, so I hope you will consider releasing the evaluation tool, since you seem to be the main contributor to this open-source repository.
It would be a great help to my research, and I would be very grateful.

Best regards,
Qingyun.

Details of my implementation have been sent to your email.

Li-Qingyun changed the title from "[Feature Request] ./scripts/v1_5/eval/eval_all.sh is not availiable" to "[Feature Request] Evaluation tools of the Few-shot VQA/Caption" on Mar 16, 2024
Lyken17 (Collaborator) commented Mar 21, 2024

cc @kentang-mit and @Seerkfang, who are more familiar with the evaluation scripts.

Li-Qingyun (Author) commented:

> cc @kentang-mit and @Seerkfang, who are more familiar with the evaluation scripts.

@Lyken17 Okay, thanks for your reply!

Dear @kentang-mit and @Seerkfang:

Could you please share the few-shot evaluation scripts?

It would be a great help to my research, and I would be very grateful.

In the few-shot VQA/Caption results of the VILA paper, the improvement from interleaved image-text pre-training, in contrast to the decline seen with image-text pair pre-training, is an essential reason for VILA to add stage 2. Stage 2 seems to give the SFT model better few-shot learning ability, which can also serve as a rebuttal to the point raised in #12.

gheinrich pushed a commit to gheinrich/VILA that referenced this issue Dec 16, 2024