
fix: initialize default metadata with all required fields #6583

Merged: 1 commit into main from fix/default-eval-metadata, Feb 3, 2025

Conversation

@xingyaoww (Collaborator) commented on Feb 2, 2025:

This PR fixes an issue where the evaluation script would error out if metadata.json doesn't exist. Now it will initialize a default EvalMetadata with all required fields when the file is missing.

Changes:

  • Added a fallback to create default metadata when metadata.json doesn't exist
  • Initializes EvalMetadata with all required fields:
    • agent_class: "dummy_agent" (placeholder)
    • llm_config: LLMConfig with "dummy_model"
    • max_iterations: 1
    • eval_output_dir: input file directory
    • start_time: current time
    • git_commit: current commit hash
    • dataset: from args.dataset
  • Maintains type safety by ensuring metadata is always an EvalMetadata instance (see the sketch below)
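
For reference, the fallback roughly amounts to the sketch below. The import paths, the helper name `load_or_default_metadata`, and the exact `EvalMetadata`/`LLMConfig` signatures are assumptions for illustration, not the PR's literal diff:

```python
import json
import os
import subprocess
import time

# Import paths are assumptions; adjust to the actual repo layout.
from evaluation.utils.shared import EvalMetadata
from openhands.core.config import LLMConfig


def load_or_default_metadata(input_file: str, dataset: str) -> EvalMetadata:
    """Load metadata.json next to the input file, or build a default one."""
    metadata_path = os.path.join(os.path.dirname(input_file), 'metadata.json')
    if os.path.exists(metadata_path):
        with open(metadata_path) as f:
            return EvalMetadata(**json.load(f))
    # Fallback: metadata.json is missing, so fill every required field with a
    # placeholder so downstream code still receives a typed instance.
    return EvalMetadata(
        agent_class='dummy_agent',  # placeholder; unused for patch evaluation
        llm_config=LLMConfig(model='dummy_model'),
        max_iterations=1,
        eval_output_dir=os.path.dirname(input_file),
        start_time=time.strftime('%Y-%m-%d %H:%M:%S'),
        git_commit=subprocess.check_output(['git', 'rev-parse', 'HEAD'])
        .decode()
        .strip(),
        dataset=dataset,
    )
```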

To run this PR locally, use the following command:

```bash
docker run -it --rm \
    -p 3000:3000 \
    -v /var/run/docker.sock:/var/run/docker.sock \
    --add-host host.docker.internal:host-gateway \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:543d0ba-nikolaik \
    --name openhands-app-543d0ba \
    docker.all-hands.dev/all-hands-ai/openhands:543d0ba
```

@xingyaoww marked this pull request as ready for review on February 2, 2025 05:02
@xingyaoww requested review from csmith49 and neubig on February 2, 2025 05:02
@csmith49 (Collaborator) left a comment:


LGTM.

Is it worth moving the defaults to the fallbacks in EvalMetadata so they're shared across all benchmarks? Looks pretty specific to SWE-bench.

@xingyaoww (Collaborator, Author) replied:

> Is it worth moving the defaults to the fallbacks in EvalMetadata so they're shared across all benchmarks?

I think most other benchmarks won't have an eval_infer.py, which is pretty special/specific to SWE-Bench (e.g., it only does patch evaluation, not inference), so I think keeping this as a special case for SWE-Bench is probably OK?
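
For concreteness, the reviewer's suggestion would amount to declaring the fallbacks once on the model rather than at each call site. A minimal sketch, assuming EvalMetadata is a pydantic model with the fields listed in the PR description (the real class may differ):

```python
# Sketch only: defaults declared on the model itself, so every benchmark
# shares them instead of each script building its own placeholder instance.
from pydantic import BaseModel


class EvalMetadata(BaseModel):
    agent_class: str = 'dummy_agent'
    max_iterations: int = 1
    dataset: str | None = None
    # ...the remaining fields would carry similar fallback defaults
```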

@xingyaoww merged commit 90bbd4e into main on Feb 3, 2025
14 checks passed
@xingyaoww deleted the fix/default-eval-metadata branch on February 3, 2025 18:52
zchn pushed a commit to zchn/OpenHands that referenced this pull request Feb 4, 2025
adityasoni9998 pushed a commit to adityasoni9998/OpenHands that referenced this pull request Feb 7, 2025