Commit

add upd
AtsuMiyai committed May 29, 2024
1 parent 909edd6 commit 453e793
Showing 37 changed files with 172 additions and 172 deletions.
26 changes: 13 additions & 13 deletions README.md
Original file line number Diff line number Diff line change
@@ -190,19 +190,19 @@ We also provide the raw data exported from Weights & Biases for the detailed res
 - MMMU (mmmu)
   - MMMU Validation (mmmu_val)
   - MMMU Test (mmmu_test)
-- MMUPDBench (mmupdbench)
-  - MMUPDBench Base (mmupdbench_base)
-    - MMAADBench Base (mmaadbench_base)
-    - MMIASDBench Base (mmiasdbench_base)
-    - MMIVQDBench Base (mmivqdbench_base)
-  - MMUPDBench Option (mmupdbench_option)
-    - MMAADBench Option (mmaadbench_option)
-    - MMIASDBench Option (mmiasdbench_option)
-    - MMIVQDBench Option (mmivqdbench_option)
-  - MMUPDBench Instruction (mmupdbench_instruction)
-    - MMAADBench Instruction (mmaadbench_instruction)
-    - MMIASDBench Instruction (mmiasdbench_instruction)
-    - MMIVQDBench Instruction (mmivqdbench_instruction)
+- MMUPD (mmupd)
+  - MMUPD Base (mmupd_base)
+    - MMAAD Base (mmaad_base)
+    - MMIASD Base (mmiasd_base)
+    - MMIVQD Base (mmivqd_base)
+  - MMUPD Option (mmupd_option)
+    - MMAAD Option (mmaad_option)
+    - MMIASD Option (mmiasd_option)
+    - MMIVQD Option (mmivqd_option)
+  - MMUPD Instruction (mmupd_instruction)
+    - MMAAD Instruction (mmaad_instruction)
+    - MMIASD Instruction (mmiasd_instruction)
+    - MMIVQD Instruction (mmivqd_instruction)
 - MMVet (mmvet)
 - Multi-DocVQA (multidocvqa)
   - Multi-DocVQA Validation (multidocvqa_val)
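Review note: the README change drops the `bench` suffix from every MM-UPD task identifier. For scripts that still pass the old names, a minimal sketch of a migration helper might look like the following. `migrate_task_name` is a hypothetical name introduced here for illustration; it is not part of lmms-eval, and it assumes the only change is the removal of `bench`.

```python
import re

# Old identifiers look like "mmupdbench_base"; the renamed ones drop "bench",
# giving "mmupd_base". The optional suffix is one of the three settings.
_OLD_NAME = re.compile(
    r"^(mmupd|mmaad|mmiasd|mmivqd)bench(_(?:base|option|instruction))?$"
)


def migrate_task_name(name: str) -> str:
    """Return the renamed task identifier, or the input unchanged
    if it is not one of the old MM-UPD names (hypothetical helper)."""
    match = _OLD_NAME.match(name)
    if match is None:
        return name
    family = match.group(1)
    suffix = match.group(2) or ""
    return family + suffix


if __name__ == "__main__":
    for old in ["mmupdbench", "mmaadbench_option", "mmvet"]:
        print(old, "->", migrate_task_name(old))
```

Names that are already in the new form, or that belong to other benchmarks, pass through untouched, so the helper can be applied to a whole task list.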
@@ -4,10 +4,10 @@ model_specific_prompt_kwargs:
   default:
     pre_prompt: ""
     post_prompt: "\n"
-doc_to_visual: !function utils.mmupdbench_doc_to_visual
-doc_to_text: !function utils.mmupdbench_doc_to_text
+doc_to_visual: !function utils.mmupd_doc_to_visual
+doc_to_text: !function utils.mmupd_doc_to_text
 doc_to_target: "answer"
-process_results: !function utils.mmupdbench_process_results
+process_results: !function utils.mmupd_process_results
 model_specific_generation_kwargs:
   llava:
     image_aspect_ratio: original
@@ -4,10 +4,10 @@ model_specific_prompt_kwargs:
   default:
     pre_prompt: ""
     post_prompt: "\nIf all the options are incorrect, answer \"F. None of the above\"."
-doc_to_visual: !function utils.mmupdbench_doc_to_visual
-doc_to_text: !function utils.mmupdbench_doc_to_text
+doc_to_visual: !function utils.mmupd_doc_to_visual
+doc_to_text: !function utils.mmupd_doc_to_text
 doc_to_target: "answer"
-process_results: !function utils.mmupdbench_process_results
+process_results: !function utils.mmupd_process_results
 model_specific_generation_kwargs:
   llava:
     image_aspect_ratio: original
@@ -4,10 +4,10 @@ model_specific_prompt_kwargs:
   default:
     pre_prompt: ""
     post_prompt: "\nAnswer with the option's letter from the given choices directly."
-doc_to_visual: !function utils.mmupdbench_doc_to_visual
-doc_to_text: !function utils.mmupdbench_doc_to_text
+doc_to_visual: !function utils.mmupd_doc_to_visual
+doc_to_text: !function utils.mmupd_doc_to_text
 doc_to_target: "answer"
-process_results: !function utils.mmupdbench_process_results
+process_results: !function utils.mmupd_process_results
 model_specific_generation_kwargs:
   llava:
     image_aspect_ratio: original
@@ -4,10 +4,10 @@ model_specific_prompt_kwargs:
   default:
     pre_prompt: ""
     post_prompt: "\n"
-doc_to_visual: !function utils.mmupdbench_doc_to_visual
-doc_to_text: !function utils.mmupdbench_doc_to_text
+doc_to_visual: !function utils.mmupd_doc_to_visual
+doc_to_text: !function utils.mmupd_doc_to_text
 doc_to_target: "answer"
-process_results: !function utils.mmupdbench_process_results
+process_results: !function utils.mmupd_process_results
 model_specific_generation_kwargs:
   llava:
     image_aspect_ratio: original
@@ -4,10 +4,10 @@ model_specific_prompt_kwargs:
   default:
     pre_prompt: ""
     post_prompt: "\nIf all the options are incorrect, answer \"F. None of the above\"."
-doc_to_visual: !function utils.mmupdbench_doc_to_visual
-doc_to_text: !function utils.mmupdbench_doc_to_text
+doc_to_visual: !function utils.mmupd_doc_to_visual
+doc_to_text: !function utils.mmupd_doc_to_text
 doc_to_target: "answer"
-process_results: !function utils.mmupdbench_process_results
+process_results: !function utils.mmupd_process_results
 model_specific_generation_kwargs:
   llava:
     image_aspect_ratio: original
@@ -4,10 +4,10 @@ model_specific_prompt_kwargs:
   default:
     pre_prompt: ""
     post_prompt: "\nAnswer with the option's letter from the given choices directly."
-doc_to_visual: !function utils.mmupdbench_doc_to_visual
-doc_to_text: !function utils.mmupdbench_doc_to_text
+doc_to_visual: !function utils.mmupd_doc_to_visual
+doc_to_text: !function utils.mmupd_doc_to_text
 doc_to_target: "answer"
-process_results: !function utils.mmupdbench_process_results
+process_results: !function utils.mmupd_process_results
 model_specific_generation_kwargs:
   llava:
     image_aspect_ratio: original
@@ -4,10 +4,10 @@ model_specific_prompt_kwargs:
   default:
     pre_prompt: ""
     post_prompt: "\n"
-doc_to_visual: !function utils.mmupdbench_doc_to_visual
-doc_to_text: !function utils.mmupdbench_doc_to_text
+doc_to_visual: !function utils.mmupd_doc_to_visual
+doc_to_text: !function utils.mmupd_doc_to_text
 doc_to_target: "answer"
-process_results: !function utils.mmupdbench_process_results
+process_results: !function utils.mmupd_process_results
 model_specific_generation_kwargs:
   llava:
     image_aspect_ratio: original
@@ -4,10 +4,10 @@ model_specific_prompt_kwargs:
   default:
     pre_prompt: ""
     post_prompt: "\nIf the given image is irrelevant to the question, answer \"F. The image and question are irrelevant.\"."
-doc_to_visual: !function utils.mmupdbench_doc_to_visual
-doc_to_text: !function utils.mmupdbench_doc_to_text
+doc_to_visual: !function utils.mmupd_doc_to_visual
+doc_to_text: !function utils.mmupd_doc_to_text
 doc_to_target: "answer"
-process_results: !function utils.mmupdbench_process_results
+process_results: !function utils.mmupd_process_results
 model_specific_generation_kwargs:
   llava:
     image_aspect_ratio: original
@@ -4,10 +4,10 @@ model_specific_prompt_kwargs:
   default:
     pre_prompt: ""
     post_prompt: "\nAnswer with the option's letter from the given choices directly."
-doc_to_visual: !function utils.mmupdbench_doc_to_visual
-doc_to_text: !function utils.mmupdbench_doc_to_text
+doc_to_visual: !function utils.mmupd_doc_to_visual
+doc_to_text: !function utils.mmupd_doc_to_text
 doc_to_target: "answer"
-process_results: !function utils.mmupdbench_process_results
+process_results: !function utils.mmupd_process_results
 model_specific_generation_kwargs:
   llava:
     image_aspect_ratio: original
7 changes: 7 additions & 0 deletions lmms_eval/tasks/mmupd/mmaad_base.yaml
@@ -0,0 +1,7 @@
+task: "mmaad_base"
+test_split: test
+include: _default_template_mmaad_base_yaml
+metric_list:
+  - metric: gpt_eval_score
+    aggregation: !function utils.mmaad_base
+    higher_is_better: true
7 changes: 7 additions & 0 deletions lmms_eval/tasks/mmupd/mmaad_instruction.yaml
@@ -0,0 +1,7 @@
+task: "mmaad_instruction"
+test_split: test
+include: _default_template_mmaad_instruction_yaml
+metric_list:
+  - metric: gpt_eval_score
+    aggregation: !function utils.mmaad_instruction
+    higher_is_better: true
7 changes: 7 additions & 0 deletions lmms_eval/tasks/mmupd/mmaad_option.yaml
@@ -0,0 +1,7 @@
+task: "mmaad_option"
+test_split: test
+include: _default_template_mmaad_option_yaml
+metric_list:
+  - metric: gpt_eval_score
+    aggregation: !function utils.mmaad_option
+    higher_is_better: true
7 changes: 7 additions & 0 deletions lmms_eval/tasks/mmupd/mmiasd_base.yaml
@@ -0,0 +1,7 @@
+task: "mmiasd_base"
+test_split: test
+include: _default_template_mmiasd_base_yaml
+metric_list:
+  - metric: gpt_eval_score
+    aggregation: !function utils.mmiasd_base
+    higher_is_better: true
7 changes: 7 additions & 0 deletions lmms_eval/tasks/mmupd/mmiasd_instruction.yaml
@@ -0,0 +1,7 @@
+task: "mmiasd_instruction"
+test_split: test
+include: _default_template_mmiasd_instruction_yaml
+metric_list:
+  - metric: gpt_eval_score
+    aggregation: !function utils.mmiasd_instruction
+    higher_is_better: true
7 changes: 7 additions & 0 deletions lmms_eval/tasks/mmupd/mmiasd_option.yaml
@@ -0,0 +1,7 @@
+task: "mmiasd_option"
+test_split: test
+include: _default_template_mmiasd_option_yaml
+metric_list:
+  - metric: gpt_eval_score
+    aggregation: !function utils.mmiasd_option
+    higher_is_better: true
7 changes: 7 additions & 0 deletions lmms_eval/tasks/mmupd/mmivqd_base.yaml
@@ -0,0 +1,7 @@
+task: "mmivqd_base"
+test_split: test
+include: _default_template_mmivqd_base_yaml
+metric_list:
+  - metric: gpt_eval_score
+    aggregation: !function utils.mmivqd_base
+    higher_is_better: true
7 changes: 7 additions & 0 deletions lmms_eval/tasks/mmupd/mmivqd_instruction.yaml
@@ -0,0 +1,7 @@
+task: "mmivqd_instruction"
+test_split: test
+include: _default_template_mmivqd_instruction_yaml
+metric_list:
+  - metric: gpt_eval_score
+    aggregation: !function utils.mmivqd_instruction
+    higher_is_better: true
7 changes: 7 additions & 0 deletions lmms_eval/tasks/mmupd/mmivqd_option.yaml
@@ -0,0 +1,7 @@
+task: "mmivqd_option"
+test_split: test
+include: _default_template_mmivqd_option_yaml
+metric_list:
+  - metric: gpt_eval_score
+    aggregation: !function utils.mmivqd_option
+    higher_is_better: true
15 changes: 15 additions & 0 deletions lmms_eval/tasks/mmupd/mmupd.yaml
@@ -0,0 +1,15 @@
+group: mmupd
+task:
+  - mmaad_base
+  - mmaad_option
+  - mmaad_instruction
+  - mmiasd_base
+  - mmiasd_option
+  - mmiasd_instruction
+  - mmivqd_base
+  - mmivqd_option
+  - mmivqd_instruction
+metadata:
+  version: 0.0
+  sys_prompt: ""
+  gpt_eval_model_name: "gpt-3.5-turbo-0613"
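Review note: the group files in this commit follow a regular pattern, with three ability families (`mmaad`, `mmiasd`, `mmivqd`) crossed with three settings (`base`, `option`, `instruction`). A minimal sketch that rebuilds the expected group membership, useful for checking the YAML files against each other, is shown below. `build_groups` and the two constant lists are hypothetical names for illustration only; the sketch does not read the YAML files or use any lmms-eval API.

```python
from itertools import product

# Ability families and settings as they appear in the task names of this commit.
ABILITIES = ["mmaad", "mmiasd", "mmivqd"]
SETTINGS = ["base", "option", "instruction"]


def build_groups() -> dict[str, list[str]]:
    """Return the expected group -> task list mapping (hypothetical helper)."""
    # The top-level group holds every ability/setting combination.
    groups = {"mmupd": [f"{a}_{s}" for a, s in product(ABILITIES, SETTINGS)]}
    # Each per-setting group holds that setting for all three abilities.
    for setting in SETTINGS:
        groups[f"mmupd_{setting}"] = [f"{a}_{setting}" for a in ABILITIES]
    return groups


if __name__ == "__main__":
    for group, tasks in build_groups().items():
        print(group, tasks)
```

Comparing this mapping with the `task:` lists in `mmupd.yaml`, `mmupd_base.yaml`, `mmupd_option.yaml`, and `mmupd_instruction.yaml` is a quick way to catch a name that was missed in the rename.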
@@ -1,8 +1,8 @@
-group: mmupdbench_base
+group: mmupd_base
 task:
-  - mmaadbench_base
-  - mmiasdbench_base
-  - mmivqdbench_base
+  - mmaad_base
+  - mmiasd_base
+  - mmivqd_base
 metadata:
   version: 0.0
   sys_prompt: ""
@@ -91,7 +91,7 @@ def load_tsv(f):
     return handlers[suffix](f)


-class MMUPDBench_Evaluator:
+class MMUPD_Evaluator:
     def __init__(self, sys_prompt="There are several options:", API_KEY="", API_URL="", model_version="gpt-3.5-turbo-0613"):
         self.sys_prompt = sys_prompt
         self.model_version = model_version
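Review note: renaming `MMUPDBench_Evaluator` to `MMUPD_Evaluator` breaks any external code that imports the old class name. If backward compatibility matters, one possible approach is a thin deprecated subclass, sketched below. This is an assumption about what callers might need, not something the commit does. The `MMUPD_Evaluator` stub reproduces only the two assignments visible in the hunk; the rest of the real class is not shown in the diff and is omitted here.

```python
import warnings


class MMUPD_Evaluator:
    """Stub with only the constructor lines visible in the diff."""

    def __init__(self, sys_prompt="There are several options:", API_KEY="",
                 API_URL="", model_version="gpt-3.5-turbo-0613"):
        self.sys_prompt = sys_prompt
        self.model_version = model_version
        # The remaining attributes and methods are elided in the diff.


class MMUPDBench_Evaluator(MMUPD_Evaluator):
    """Hypothetical compatibility shim for the pre-rename class name."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "MMUPDBench_Evaluator was renamed to MMUPD_Evaluator",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)
```

Old call sites keep working and emit a `DeprecationWarning`, so the shim can be removed once downstream code has moved to the new name.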
9 changes: 9 additions & 0 deletions lmms_eval/tasks/mmupd/mmupd_instruction.yaml
@@ -0,0 +1,9 @@
+group: mmupd_instruction
+task:
+  - mmaad_instruction
+  - mmiasd_instruction
+  - mmivqd_instruction
+metadata:
+  version: 0.0
+  sys_prompt: ""
+  gpt_eval_model_name: "gpt-3.5-turbo-0613"
9 changes: 9 additions & 0 deletions lmms_eval/tasks/mmupd/mmupd_option.yaml
@@ -0,0 +1,9 @@
+group: mmupd_option
+task:
+  - mmaad_option
+  - mmiasd_option
+  - mmivqd_option
+metadata:
+  version: 0.0
+  sys_prompt: ""
+  gpt_eval_model_name: "gpt-3.5-turbo-0613"