v0.9.0: Benchmarks! 🎉
New features
- Benchmarks with default config (tasks x seeds) and metadata #173 #191
    from browsergym.experiments import BENCHMARKS, Benchmark
    # Note: HighLevelActionSetArgs and EnvArgs must also be imported;
    # their import lines are not shown in these notes.

    # make a custom benchmark
    benchmark = Benchmark(
        name="miniwob_click_test",
        high_level_action_set_args=HighLevelActionSetArgs(
            subsets=["bid"],
            multiaction=False,
            strict=False,
            retry_with_force=False,
            demo_mode="off",
        ),
        env_args_list=[
            EnvArgs(
                task_name="miniwob.click-test",
                task_seed=42,
                max_steps=5,
            )
        ],
    )

    # use a pre-existing benchmark
    miniwob = BENCHMARKS["miniwob_all"]()

    # use only a task subset
    miniwob_original = miniwob.subset_from_glob(
        column="miniwob_category", glob="original"
    )
- New playwright key modifier "ControlOrMeta" #187
- Global demo_mode flag #177
    import browsergym.core.action

    browsergym.core.action.set_global_demo_mode(True)  # boolean
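For the "ControlOrMeta" modifier listed above, the sketch below illustrates what the name means in Playwright: it stands for Meta on macOS and Control on Windows and Linux. The helper `resolve_control_or_meta` is hypothetical and written only for illustration; it is not part of BrowserGym or Playwright, which resolve the modifier internally.

```python
import sys


def resolve_control_or_meta(key_comb: str, platform: str = sys.platform) -> str:
    """Replace "ControlOrMeta" in a "+"-separated key combination
    with the concrete modifier for the given platform.

    Hypothetical helper, for illustration only.
    """
    # macOS reports sys.platform == "darwin"; everything else maps to Control.
    concrete = "Meta" if platform == "darwin" else "Control"
    parts = key_comb.split("+")
    return "+".join(concrete if part == "ControlOrMeta" else part for part in parts)


# The same portable shortcut maps to a different concrete modifier per platform.
for platform in ("linux", "darwin"):
    print(platform, resolve_control_or_meta("ControlOrMeta+a", platform=platform))
```

Writing shortcuts with "ControlOrMeta" keeps a single key combination portable across operating systems, with no need to branch on the platform in agent code.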
Fixes
- Multi-tab actions fix #188
Full Changelog: v0.8.1...v0.9.0