Make agent delegation configurable in benchmarks

The benchmarks currently don't support sub-agent delegation.

https://github.com/OpenHands/benchmarks/blob/680ce0f564174aecd74394ef083bd421e6dbe5e1/benchmarks/swtbench/run_infer.py#L260-L265

We'd like to support this and try, e.g., SWE-Bench with agent delegation enabled.

	# TODO: we can enable condenser and security analyzer later
	# and have them configurable via EvalMetadata
	# condenser=get_default_condenser(
	#     llm=self.metadata.llm.model_copy(update={"service_id": "condenser"})
	# ),
	# security_analyzer=LLMSecurityAnalyzer(),
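
One possible shape for this, as a rough sketch only: an opt-in flag on the evaluation metadata that decides whether a delegation tool is included when the agent is built. Everything here is hypothetical and not taken from the benchmarks repo or the SDK: `EvalMetadataSketch` stands in for the real `EvalMetadata`, and `enable_delegation`, `select_tool_names`, and the tool names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class EvalMetadataSketch:
    """Hypothetical stand-in for the benchmarks' EvalMetadata."""

    model: str
    # Hypothetical flag; off by default so existing runs keep their behavior.
    enable_delegation: bool = False


def select_tool_names(metadata, base_tools=("terminal", "file_editor")):
    """Return the tool names the agent would be built with.

    The tool names are placeholders, not the SDK's real tool identifiers.
    """
    tools = list(base_tools)
    if metadata.enable_delegation:
        # Only add the (hypothetical) delegation tool when explicitly requested.
        tools.append("delegate")
    return tools


baseline = select_tool_names(EvalMetadataSketch(model="example-model"))
delegating = select_tool_names(
    EvalMetadataSketch(model="example-model", enable_delegation=True)
)
```

Keeping the flag on the metadata object would let a single CLI option or config field switch delegation on per run, and the setting would be recorded alongside the other evaluation parameters.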

Make agent delegation be able to be specified in benchmarks #411

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Make agent delegation be able to be specified in benchmarks #411

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions