Releases · logic-star-ai/swt-bench

06 Mar 08:43

1.2.0

5d6e00e

Release 1.2.0 - Reproductions script mode

Latest

This release adds a "reproduction script" mode for SWT-Bench. In this mode (used by, e.g., AEGIS), the test does not need to fit into the repository's unit test framework; it can be a standalone script. We compute the coverage delta as usual and count a non-zero exit code of the script as failing and a zero exit code as passing. In this setting, the test cannot adversely affect other test cases in the framework.
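The exit-code rule described above can be illustrated with a minimal sketch. This is not the SWT-Bench harness code: the function name `run_reproduction_script`, the `timeout` parameter, and the choice to treat a timed-out script as failing are assumptions made for illustration, and the coverage-delta computation is left out entirely.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_reproduction_script(script_path: str, timeout: float = 60.0) -> str:
    """Run a standalone script and map its exit code to PASS or FAIL.

    Hypothetical helper, not part of the SWT-Bench API.
    """
    try:
        result = subprocess.run(
            [sys.executable, script_path],
            capture_output=True,  # keep the script's output out of our own
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # Assumption: a script that does not terminate counts as failing.
        return "FAIL"
    # Zero exit code counts as passing, any non-zero exit code as failing.
    return "PASS" if result.returncode == 0 else "FAIL"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # A script whose assertion fails raises an uncaught exception,
        # so the interpreter exits with a non-zero code.
        failing = Path(tmp) / "reproduce_fail.py"
        failing.write_text("assert 1 + 1 == 3, 'bug reproduced'\n")

        passing = Path(tmp) / "reproduce_pass.py"
        passing.write_text("assert 1 + 1 == 2\n")

        print(run_reproduction_script(str(failing)))
        print(run_reproduction_script(str(passing)))
```

Under this rule, a script that fails an assertion on the buggy code and exits cleanly once the fix is applied needs no test-framework integration.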

What's Changed

Add reproduction script mode by @nielstron in #20

Full Changelog: 1.1.0...1.2.0

Contributors

nielstron


04 Mar 06:21

nielstron

1.1.0

5c85ecf

Release 1.1.0 - SWT-Bench Verified

This release ports several further patches that have been reported as useful in SWE-Bench and adds support for SWT-Bench Verified, which was obtained with the same quality criteria as SWT-Bench Lite.

We released SWT-Bench Verified and published the three best-performing methods of SWT-Bench Lite as baselines on our website.

What's Changed

Fix sklearn constants by @nielstron in #11
Reproduce docker image fixes (pinning versions) from SWE-Bench by @nielstron in #12
Add leaderboard website and submission instructions by @nielstron in #14
Add SWT-Verified by @nielstron in #18

Full Changelog: 1.0.1...1.1.0

Contributors

nielstron


16 Nov 10:46

nielstron

1.0.1

f42d9fe

Release 1.0.1 - Patch instances

What's Changed

Fix building django images by @zyone1991 in #6
Run install and test on several Python versions by @nielstron in #7

New Contributors

@zyone1991 made their first contribution in #6

Full Changelog: 1.0.0...1.0.1

Contributors

nielstron and zyone1991


01 Nov 16:18

nielstron

1.0.0

dce4aee

Release 1.0.0 - Initial Release

This version is the original code for the paper "SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents", published at NeurIPS.

Full Changelog: https://github.com/logic-star-ai/swt-bench/commits/1.0.0

