Skip to content

Conversation

@Kyunnilee
Copy link

Before you open a pull-request, please check if a similar issue already exists or has been closed before.

When you open a pull-request, please be sure to include the following

  • A descriptive title: [xxx] XXXX
  • A detailed description

If you meet the lint warnings, you can use following scripts to reformat code.

pip install pre-commit
pre-commit install
pre-commit run --all-files

Thank you for your contributions!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This extra line should be removed

Comment on lines +32 to +43
def hb_doc_to_visual(doc):
"""Convert document to visual input."""
num_image = int(os.environ.get("NUM_IMAGE", "1"))

if num_image == 1:
# print("one image!")
return [doc["image"].convert("RGB")]
elif num_image == 2:
# print("two images!")
return [doc["image"].convert("RGB"), doc["image"].convert("RGB")]
else:
raise ValueError(f"num_image must be 1 or 2, got {num_image}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I ask is the replicated images necessary for evaluating hallusionbench?

Copy link
Collaborator

@kcz358 kcz358 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thank you for your contribution! I've seen that you are added some image first options in simple models such as llava ov and vllm. Though the changes are acceptable, you are more recommended to create a doc_to_messages and use the chat model for auto formatting.

Another question is that I have seen you are replicate images for many benchmarks. May I ask what are these used for?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants