[Result] Update Evaluation Results (#60)
* update MME, SEEDBench
* update results
* update LLaVABench
* fix
* update AI2D accuracy
* update LLaVABench
* update README
* update teaser link
1 parent e992046, commit 493a7e8
Showing 12 changed files with 363 additions and 244 deletions.
@@ -0,0 +1,39 @@
# AI2D Evaluation Results

> During evaluation, we use `GPT-3.5-Turbo-0613` as the choice extractor for all VLMs if the choice cannot be extracted via heuristic matching. **Zero-shot** inference is adopted.
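
The two-stage extraction described in the note above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual implementation: the function names `match_choice` and `extract_choice`, the two matching rules, and the `llm_extractor` callable (a stand-in for a call to a model such as `GPT-3.5-Turbo-0613`) are all assumptions made for this example.

```python
import re
from typing import Callable, Dict, Optional

# Hypothetical type for the fallback: takes the raw response and the options,
# returns an option letter or None. In practice this would wrap an LLM call.
Extractor = Callable[[str, Dict[str, str]], Optional[str]]


def match_choice(response: str, options: Dict[str, str]) -> Optional[str]:
    """Heuristic matching: return an option letter only when it is unambiguous."""
    # Rule 1 (assumed): exactly one standalone option letter, e.g. "B" or "(B)".
    # Note: a sentence starting with the article "A" would also match this rule.
    letters = {tok for tok in re.findall(r"\b[A-Z]\b", response) if tok in options}
    if len(letters) == 1:
        return letters.pop()
    # Rule 2 (assumed): exactly one option's text is quoted in the response.
    lowered = response.lower()
    hits = [key for key, text in options.items() if text.lower() in lowered]
    if len(hits) == 1:
        return hits[0]
    return None


def extract_choice(
    response: str,
    options: Dict[str, str],
    llm_extractor: Optional[Extractor] = None,
) -> Optional[str]:
    """Try heuristic matching first; defer to the LLM extractor only on failure."""
    choice = match_choice(response, options)
    if choice is None and llm_extractor is not None:
        choice = llm_extractor(response, options)
    return choice
```

Calling the heuristic first keeps the LLM out of the loop for responses that already name a single option, so the fallback only handles free-form or ambiguous answers.
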
## AI2D Accuracy

| Model                       |   overall |
|:----------------------------|----------:|
| Monkey-Chat                 |      72.6 |
| GPT-4v (detail: low)        |      71.3 |
| Qwen-VL-Chat                |      68.5 |
| Monkey                      |      67.6 |
| GeminiProVision             |      66.7 |
| QwenVLPlus                  |      63.7 |
| Qwen-VL                     |      63.4 |
| LLaVA-InternLM2-20B (QLoRA) |      61.4 |
| CogVLM-17B-Chat             |      60.3 |
| ShareGPT4V-13B              |      59.3 |
| TransCore-M                 |      59.2 |
| LLaVA-v1.5-13B (QLoRA)      |      59   |
| LLaVA-v1.5-13B              |      57.9 |
| ShareGPT4V-7B               |      56.7 |
| InternLM-XComposer-VL       |      56.1 |
| LLaVA-InternLM-7B (QLoRA)   |      56   |
| LLaVA-v1.5-7B (QLoRA)       |      55.2 |
| mPLUG-Owl2                  |      55.2 |
| SharedCaptioner             |      55.1 |
| IDEFICS-80B-Instruct        |      54.4 |
| LLaVA-v1.5-7B               |      54.1 |
| PandaGPT-13B                |      49.2 |
| LLaVA-v1-7B                 |      47.8 |
| IDEFICS-9B-Instruct         |      42.7 |
| InstructBLIP-7B             |      40.2 |
| VisualGLM                   |      40.2 |
| InstructBLIP-13B            |      38.6 |
| MiniGPT-4-v1-13B            |      33.4 |
| OpenFlamingo v2             |      30.7 |
| MiniGPT-4-v2                |      29.4 |
| MiniGPT-4-v1-7B             |      28.7 |