Scope
- Shared infrastructure for adversarial interpretability across competitive games (chess, diplomacy, etc.).
- Common evals, visualization, data reports, model wrappers, and commands to run training jobs.
Non-goals
- Forcing similar model architectures or training code across all experiments.
- Over-abstracting before concrete use cases exist.
- Standardised analysis for all experiments; we just want some consistency in the final presentation (plots/tables).
Directory layout
- docs/
- environments/
  - chess_probe/
- libs/
  - evals/
  - visualization/
- configs/
  - examples/
- scripts/
Shared libraries
- evals/: engine-eval delta, Elo, deception metrics (precision/recall), cost tracking.
- visualization/: plotting helpers and experiment dashboards.
- probes/: soft-token and residual injection modules with small, clear APIs.
- engines/: thin wrappers for Stockfish/Lc0 or other evaluators.
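To give a concrete feel for the engines/–evals/ split, a thin Stockfish wrapper might look like the sketch below, with an engine-eval delta then being just the change in `eval_cp` across the model's move. The class and method names are illustrative assumptions, not the actual gamescope API; the sketch assumes python-chess and a local Stockfish binary.

```python
# Illustrative sketch only: class/method names are hypothetical, not the real
# libs/engines API. Assumes python-chess and a Stockfish binary on PATH.
import chess
import chess.engine


class StockfishWrapper:
    """Minimal UCI wrapper exposing a single centipawn evaluation call."""

    def __init__(self, binary_path: str = "stockfish", depth: int = 12):
        self._engine = chess.engine.SimpleEngine.popen_uci(binary_path)
        self._limit = chess.engine.Limit(depth=depth)

    def eval_cp(self, board: chess.Board) -> int:
        # Score from White's point of view; mate scores clamped to +/-10000 cp.
        info = self._engine.analyse(board, self._limit)
        return info["score"].white().score(mate_score=10_000)

    def close(self) -> None:
        self._engine.quit()
```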
Runners
- TRL PPO (single-turn for chess_probe) with probe-only optimization (sketched below).
- Verifier- or agent-tooling adapters for multi-step environments.
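Probe-only optimization boils down to freezing the base model and handing the optimizer only the probe parameters. A minimal PyTorch sketch follows; the soft-token shape and the `probe` variable are stand-ins for the modules under probes/, not their actual implementation.

```python
# Sketch of probe-only optimization: freeze the base model, optimize only the probe.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B-Base")
for param in model.parameters():
    param.requires_grad_(False)  # base weights stay fixed

# Eight learnable soft tokens to be injected into the prompt embeddings (hypothetical shape).
probe = torch.nn.Parameter(torch.zeros(8, model.config.hidden_size))
optimizer = torch.optim.AdamW([probe], lr=1e-4)

# Inside the TRL PPO loop, only `probe` receives gradient updates; the policy's
# trainable state is just the injected soft tokens.
```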
Results and experiment tracking
- Location: write all outputs under `results/<env>/<experiment_name>/<YYYYMMDD_HHMMSS>-<run_id>/`.
  - Example: `results/chess_probe/probe_ablation/20250115_142530-a1b2c3/`
- Contents inside a run directory:
  - `config.yaml` (or `.toml`): exact configuration used for the run (copied from `--config` or auto-dumped resolved config)
  - `metadata.json`: immutable run metadata
    - git commit, branch, dirty flag; user, host; Python/CUDA versions; random seeds
    - full invocation (command, args, `PYTHONPATH`), environment name, library versions (optionally `pip freeze`)
  - `logs/`: captured stdout/stderr/wandb
  - `plots/`: generated figures for quick inspection
  - `artifacts/`: model/probe checkpoints and large outputs (consider symlinks or pointer files if we need to store stuff elsewhere)
  - `samples/`: qualitative samples (games, traces, prompts/responses)
  - `metrics/`: summary metrics from the experiment
- Script conventions (strongly recommended):
  - `--config path/to/config.yaml` and `--experiment-name <slug>`
  - `--output-dir results/` (default) so scripts create the full run path automatically
  - `--notes "short freeform note"`, saved in `metadata.json`
  - On startup: create the run directory, copy the config, write `metadata.json` (a minimal sketch follows at the end of this section)
  - During training/eval: append metrics to `metrics.jsonl`, write plots and artifacts under the run directory
- Remote trackers: optionally mirror metrics to W&B or MLflow, but the filesystem record above is the source of truth for reproducibility.
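A minimal sketch of the startup and logging conventions above, assuming argparse-style flags and a hardcoded environment name; the real helpers live in `gamescope.libs.run_utils`, so treat this as a reference for the layout rather than the actual implementation.

```python
# Illustrative startup sequence for a script following the conventions above.
# Flag names match the conventions; everything else is a simplifying assumption.
import argparse
import json
import shutil
import subprocess
import sys
import uuid
from datetime import datetime
from pathlib import Path

parser = argparse.ArgumentParser()
parser.add_argument("--config", type=Path, required=True)
parser.add_argument("--experiment-name", required=True)
parser.add_argument("--output-dir", type=Path, default=Path("results"))
parser.add_argument("--notes", default="")
args = parser.parse_args()

# results/<env>/<experiment_name>/<YYYYMMDD_HHMMSS>-<run_id>/
env_name = "chess_probe"  # normally derived from the script or config
stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
run_dir = args.output_dir / env_name / args.experiment_name / f"{stamp}-{uuid.uuid4().hex[:6]}"
for sub in ("logs", "plots", "artifacts", "samples", "metrics"):
    (run_dir / sub).mkdir(parents=True, exist_ok=True)

# Copy the exact config and record immutable run metadata.
shutil.copy(args.config, run_dir / "config.yaml")
git_commit = subprocess.run(["git", "rev-parse", "HEAD"], capture_output=True, text=True).stdout.strip()
(run_dir / "metadata.json").write_text(
    json.dumps({"command": sys.argv, "notes": args.notes, "git_commit": git_commit}, indent=2)
)

# During training/eval: append one JSON object per logged step.
with (run_dir / "metrics.jsonl").open("a") as f:
    f.write(json.dumps({"step": 0, "loss": 0.0}) + "\n")
```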
Index and discovery
- An append-only index is maintained at `results/index/runs_index.jsonl` for fast discovery.
- New runs are auto-indexed:
  - On entry via `gamescope.libs.run_utils.capture_metadata()` or the `run_context(...)` context manager (writes a `start` event)
  - On exit via `gamescope.libs.run_utils.mark_status()` (writes an `end` event with exit reason)
- Artifact usage can be logged to surface interesting runs:
  - Call `gamescope.libs.run_utils.mark_artifact_used(path_to_artifact, reason="...")`
  - This writes `<run_dir>/artifacts/USED_BY.jsonl` and an `artifact_used` event in the index
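A usage sketch of these helpers; whether `run_context(...)` takes an experiment name and yields the run directory is not specified above, so those details are assumptions.

```python
# Usage sketch only: run_context's arguments and return value are assumptions;
# see gamescope.libs.run_utils for the real API.
from pathlib import Path

from gamescope.libs import run_utils

# Writes a `start` event to results/index/runs_index.jsonl on entry and an
# `end` event (with exit reason) on exit.
with run_utils.run_context("probe_ablation") as run_dir:
    checkpoint = Path(run_dir) / "artifacts" / "probe.pt"  # hypothetical artifact
    ...  # training / eval code writes outputs under run_dir

# Later, when that checkpoint is reused, log the usage so find_run.py can
# surface this run; this appends to <run_dir>/artifacts/USED_BY.jsonl and
# writes an `artifact_used` event to the index.
run_utils.mark_artifact_used(checkpoint, reason="init for follow-up sweep")
```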
CLI helpers
- List runs (non-junk by default, grouped by script, newest first; includes duration and usage counts):
  `uv run python scripts/find_run.py --results-root results`
- Backfill the index for existing runs:
  `uv run python scripts/reindex_runs.py --results-root results`
- Run any experiment from a YAML file; a fresh run directory is created and the full config is recorded:
  `uv run python scripts/config_runner.py --config configs/examples/my_eval.yaml`

YAML shape:

```yaml
command: environments/chess_probe/scripts/eval_qwen_bc.py
args:
  model_name_or_path: Qwen/Qwen3-8B-Base
  num_eval_data: 200
  results_dir: results/chess_probe
  save_jsonl: true
```

The runner injects `run_dir` for downstream scripts (available as `--run_dir` if supported, otherwise in the environment as `RUN_DIR`).
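On the consuming side, a downstream script might pick up the injected run directory as in the sketch below; preferring the `--run_dir` flag and falling back to the `RUN_DIR` environment variable follows the description above, and everything else is illustrative.

```python
# Sketch of how an eval/training script could consume the injected run_dir.
import argparse
import os
from pathlib import Path

parser = argparse.ArgumentParser()
parser.add_argument("--run_dir", type=Path, default=None)
args, _ = parser.parse_known_args()

# Prefer the explicit flag; otherwise fall back to the RUN_DIR env var set by the runner.
run_dir = args.run_dir or Path(os.environ["RUN_DIR"])
run_dir.mkdir(parents=True, exist_ok=True)
(run_dir / "metrics.jsonl").touch()  # outputs go under the injected run directory
```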
Add a new environment
- Create environments/<env_name>/ with a README.md describing assumptions and dependencies.
- Reuse libs/ components where possible and keep environment-specific logic out of libs/; if you need evals that would be relevant to multiple experiments, add them to libs/ instead.
- Provide example configs under configs/examples/ to run your experiments.
- Add/modify scripts under scripts/ to run your experiment and collect results.
Licensing
- Preserve third-party licenses and headers. See THIRD_PARTY_NOTICES.md.
Setup
- Install uv (Linux/macOS):
  `curl -LsSf https://astral.sh/uv/install.sh | sh`
- Run `uv sync`. This creates a local virtual environment (e.g., `.venv/`) and installs the base project dependencies.
- Run scripts using the synced environment:
  `uv run python scripts/your_script.py --help`
- If you define optional extras for your environment, include them at run time:
  `uv run --with '.[your_extra]' python scripts/your_script.py ...`
Notes
- `uv sync` is only needed after changing dependencies or on first setup. For ephemeral runs without a full sync, you may also use `uv run`, which will resolve and execute in a temporary environment.