Release v0.9.0: Benchmarks! 🎉 · ServiceNow/BrowserGym

New features

Benchmarks with default config (tasks x seeds) and metadata #173 #191

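from browsergym.experiments import BENCHMARKS, Benchmark
# assumed import paths for the two helper classes used below; adjust if your version differs
from browsergym.experiments.benchmark import HighLevelActionSetArgs
from browsergym.experiments.loop import EnvArgs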

# make a custom benchmark
benchmark = Benchmark(
  name="miniwob_click_test",
  high_level_action_set_args=HighLevelActionSetArgs(
    subsets=["bid"],
    multiaction=False,
    strict=False,
    retry_with_force=False,
    demo_mode="off",
  ),
  env_args_list=[
    EnvArgs(
      task_name="miniwob.click-test",
      task_seed=42,
      max_steps=5,
    )
  ],
)

# use a pre-existing benchmark
miniwob = BENCHMARKS["miniwob_all"]()

# use only a task subset
miniwob_original = miniwob.subset_from_glob(
  column="miniwob_category", glob="original"
)
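
To illustrate what "tasks x seeds" and a glob-based subset mean, here is a minimal standalone sketch. It does not use or reproduce BrowserGym's implementation: `TaskRun`, `TASK_METADATA`, `make_runs`, and this local `subset_from_glob` are hypothetical stand-ins, and the task names and category values are made up for the example.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from itertools import product


@dataclass(frozen=True)
class TaskRun:
    """Hypothetical stand-in for one benchmark entry (not a BrowserGym class)."""
    task_name: str
    task_seed: int


# Hypothetical metadata table: one row per task, with a category column.
TASK_METADATA = {
    "miniwob.click-test": {"miniwob_category": "original"},
    "miniwob.click-button": {"miniwob_category": "original"},
    "miniwob.example-extra": {"miniwob_category": "additional"},
}


def make_runs(task_names, seeds):
    """Expand tasks x seeds into a flat list of runs."""
    return [TaskRun(name, seed) for name, seed in product(task_names, seeds)]


def subset_from_glob(runs, column, glob):
    """Keep runs whose task metadata value in `column` matches the glob."""
    return [
        run for run in runs
        if fnmatch(TASK_METADATA[run.task_name][column], glob)
    ]


all_runs = make_runs(TASK_METADATA, seeds=[0, 1, 2])
original_runs = subset_from_glob(all_runs, "miniwob_category", "original")
print(len(all_runs), len(original_runs))  # 9 6
```

Because the filter is a glob rather than an exact match, a pattern such as `"orig*"` would select the same rows in this sketch.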

New playwright key modifier "ControlOrMeta" #187
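
Playwright documents "ControlOrMeta" as resolving to Control on Windows and Linux and to Meta on macOS. The sketch below only illustrates that mapping in plain Python; `resolve_control_or_meta` is a hypothetical helper, not part of Playwright or BrowserGym.

```python
import sys


def resolve_control_or_meta(shortcut: str, platform: str = sys.platform) -> str:
    """Replace the "ControlOrMeta" modifier with the platform-specific key."""
    # macOS reports sys.platform == "darwin"; every other platform gets Control.
    modifier = "Meta" if platform == "darwin" else "Control"
    return "+".join(
        modifier if part == "ControlOrMeta" else part
        for part in shortcut.split("+")
    )


print(resolve_control_or_meta("ControlOrMeta+a", platform="darwin"))  # Meta+a
print(resolve_control_or_meta("ControlOrMeta+a", platform="linux"))   # Control+a
```

In practice the modifier is passed straight through in a key combination such as "ControlOrMeta+a", so the same action works across operating systems.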

Global demo_mode flag #177

import browsergym.core.action

browsergym.core.action.set_global_demo_mode(True)  # boolean

Fixes

Multi-tab actions fix #188

Full Changelog: v0.8.1...v0.9.0
