Reproduce SEED LLaMA evaluation #33

hyomin14 · 2024-03-25T06:28:43Z

Thanks for your great work.

I have a question about the SEED-LLaMA evaluation settings.
I tried to reproduce the VQA accuracy of the instruction-tuned SEED-LLaMA 8B on the VQAv2 dataset, but I cannot reproduce the result reported in the paper (66.2).

I ran the evaluation on 8x A100 80GB GPUs with a batch size of 1.
This is the generation config I used:

generation_config = {
    'temperature': 1.0,
    'num_beams': 1,
    'max_new_tokens': 64,
    'top_p': 0.5,
    'do_sample': True
}
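Since `do_sample` is `True`, the answers above are sampled rather than deterministic. As a comparison point, here is a minimal sketch of a greedy variant of the same config. This is only my assumption about what might be closer to the paper's setting, not something the paper or the repo confirms, and `to_greedy` is a hypothetical helper I made up for illustration.

```python
# Sampling settings from the run described above.
sampling_config = {
    'temperature': 1.0,
    'num_beams': 1,
    'max_new_tokens': 64,
    'top_p': 0.5,
    'do_sample': True,
}


def to_greedy(config):
    """Return a copy of config set up for greedy decoding (hypothetical helper)."""
    greedy = dict(config)
    greedy['do_sample'] = False
    # temperature and top_p only matter when sampling, so drop them here.
    greedy.pop('temperature', None)
    greedy.pop('top_p', None)
    return greedy


# Assumption: greedy decoding with a single beam, same token budget.
greedy_config = to_greedy(sampling_config)
print(greedy_config)
```

The original dictionary is left unchanged, so both variants can be run and compared on the same split.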

And this is the result reported by the official evaluation server:
"test-dev": {"yes/no": 38.59, "number": 23.68, "other": 39.1, "overall": 37.14}

I would be grateful if you could share your evaluation settings or any advice.
