Skip to content

Remove double setting of timeout default in swebench eval_infer.py#417

Open
simonrosenberg wants to merge 1 commit intomainfrom
fix/remove-double-default-timeout
Open

Remove double setting of timeout default in swebench eval_infer.py#417
simonrosenberg wants to merge 1 commit intomainfrom
fix/remove-double-default-timeout

Conversation

@simonrosenberg
Copy link
Collaborator

Summary

This PR removes the double setting of the timeout default value in benchmarks/swebench/eval_infer.py.

Similar to PR #399, this fixes a case where a default value was being set twice:

  1. In the add_argument() call with default=EVAL_DEFAULTS["timeout"]
  2. In parser.set_defaults(**EVAL_DEFAULTS)

Changes

benchmarks/swebench/eval_infer.py

  • Removed default=EVAL_DEFAULTS["timeout"] from the --timeout argument definition
  • Updated the help text to remove the redundant default value display
  • Updated the comment to reflect that timeout is now included in the defaults applied by set_defaults()

Consistency with Other Benchmarks

This change aligns with the pattern used by other benchmarks in the repository where defaults are only set via parser.set_defaults():

  • swebenchmultimodal/eval_infer.py - uses only parser.set_defaults(**EVAL_DEFAULTS)
  • gaia/run_infer.py - uses only parser.set_defaults(**INFER_DEFAULTS)
  • commit0/run_infer.py - uses only parser.set_defaults(**INFER_DEFAULTS)
  • swtbench/run_infer.py - uses only parser.set_defaults(**INFER_DEFAULTS)

Testing

All 35 existing tests pass, confirming the refactoring doesn't break any functionality.

@simonrosenberg can click here to continue refining the PR

The timeout argument had its default set twice:
1. In the add_argument() call with default=EVAL_DEFAULTS['timeout']
2. In parser.set_defaults(**EVAL_DEFAULTS)

This removes the redundant default from add_argument() since
set_defaults() already applies all EVAL_DEFAULTS including timeout.

Co-authored-by: openhands <openhands@all-hands.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants