
v0.10.0: AssistantBench! 🎉

@github-actions github-actions released this 23 Oct 14:50
· 38 commits to main since this release

New features

  • New BrowserGym benchmark AssistantBench, packaged as browsergym-assistantbench. Thanks @oriyor! #186
    import gymnasium as gym
    import browsergym.assistantbench  # importing registers the AssistantBench tasks with gymnasium

    env = gym.make("browsergym/assistantbench.validation.12")
    env = gym.make("browsergym/assistantbench.test.42")
  • Default train/test splits for all benchmarks
    miniwob = DEFAULT_BENCHMARKS["miniwob"]  # 125 tasks x 5 seeds
    miniwob_train = miniwob.subset_from_split("train")  # 62 tasks x 5 seeds
    miniwob_test = miniwob.subset_from_split("test")  # 63 tasks x 5 seeds
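
The AssistantBench task IDs shown above appear to follow the pattern `browsergym/assistantbench.<split>.<index>`. As a minimal sketch under that assumption, the helpers below build and parse such IDs using only the standard library. The names `make_task_id` and `parse_task_id` are hypothetical and not part of BrowserGym, and only the two splits that appear in these notes (`validation`, `test`) are assumed. The valid index range per split is not stated here, so it is not checked.

```python
import re

# Pattern inferred from the two example IDs in these release notes (an assumption,
# not an official specification of the task ID format).
_TASK_ID_RE = re.compile(r"^browsergym/assistantbench\.(validation|test)\.(\d+)$")

# Only the splits that appear in the release notes are assumed to exist.
_KNOWN_SPLITS = ("validation", "test")


def make_task_id(split: str, index: int) -> str:
    """Build a task ID string such as 'browsergym/assistantbench.validation.12'."""
    if split not in _KNOWN_SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    if index < 0:
        raise ValueError(f"index must be non-negative, got {index}")
    return f"browsergym/assistantbench.{split}.{index}"


def parse_task_id(task_id: str) -> tuple[str, int]:
    """Split a task ID into its (split, index) parts, or raise ValueError."""
    match = _TASK_ID_RE.match(task_id)
    if match is None:
        raise ValueError(f"not an AssistantBench task ID: {task_id!r}")
    return match.group(1), int(match.group(2))
```

A string returned by `make_task_id` could then be passed to `gym.make(...)` once `browsergym.assistantbench` has been imported, as in the snippet above.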

Breaking changes

  • Various updates and refactors to the new Benchmark class #197 #198 #199

Fixes

  • Improved experiment logging #182

Full Changelog: v0.9.0...v0.10.0