v0.10.0: AssistantBench! 🎉
New features
- New BrowserGym benchmark AssistantBench, packaged as browsergym-assistantbench. Thanks @oriyor! #186

      import gymnasium as gym
      import browsergym.assistantbench  # importing registers the AssistantBench environments

      env = gym.make("browsergym/assistantbench.validation.12")  # a validation task
      env = gym.make("browsergym/assistantbench.test.42")  # a test task
- Default train/test splits for all benchmarks

      miniwob = DEFAULT_BENCHMARKS["miniwob"]  # 125 tasks x 5 seeds
      miniwob_train = miniwob.subset_from_split("train")  # 62 tasks x 5 seeds
      miniwob_test = miniwob.subset_from_split("test")  # 63 tasks x 5 seeds
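
The 62/63 task counts are what a roughly even split of 125 tasks gives. As a rough illustration only, a reproducible split of that shape could be sketched as below. This is not BrowserGym's actual split logic: `split_task_names`, its parameters, and the placeholder task names are all hypothetical.

```python
import random


def split_task_names(task_names, train_fraction=0.5, seed=0):
    """Hypothetical helper: split task names into reproducible train/test lists.

    Not BrowserGym's implementation; a sketch of one common approach.
    """
    names = sorted(task_names)      # fixed starting order, independent of input order
    rng = random.Random(seed)       # local RNG so the global random state is untouched
    rng.shuffle(names)
    n_train = int(len(names) * train_fraction)  # rounds down: 125 * 0.5 -> 62
    return sorted(names[:n_train]), sorted(names[n_train:])


# Placeholder task names (assumption), standing in for the 125 MiniWoB tasks.
task_names = [f"miniwob.task-{i:03d}" for i in range(125)]
train, test = split_task_names(task_names)
print(len(train), len(test))  # 62 63
```

Splitting by task name rather than by (task, seed) pair keeps every seed of a given task on the same side of the split, so no task appears in both train and test.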
Breaking Changes
Fixes
- Improved experiment logging #182
Full Changelog: v0.9.0...v0.10.0