Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

init commit avalon #60

Merged
merged 18 commits into from
Nov 7, 2023
Merged

init commit avalon #60

merged 18 commits into from
Nov 7, 2023

Conversation

HenryCai11
Copy link
Contributor

@HenryCai11 HenryCai11 commented Oct 22, 2023

AvalonBench

Quick Start

Start the task server and the assigner

Start the game (3 is the number of workers)

python -m src.start_task -a --start avalon-dev-single 3

Start the assigner

python -m src.assigner --config ./configs/assignments/test_avalon.yaml

Customize configurations and data

  1. You can modify the file configs/tasks/avalon.yaml to configure the agent list. A config file looks like this:
default:
  module: "src.server.tasks.avalon.AvalonBench"
  parameters:
    num_players: 5
    discussion: False

avalon-dev-naive:
  parameters:
    name: "AvalonBench-dev-naive"
    data_file: "data/avalon/dev.json"
    agent_list: ["naive", "naive", "naive", "naive", "naive"]

avalon-dev-single:
  parameters:
    name: "AvalonBench-dev-single"
    data_file: "data/avalon/dev.json"
    agent_list: ["llm", "naive", "naive", "naive", "naive"]

where naive stands for the naive bots. Agents will play the roles with the same index in the data file (see following).

Note: There should only be one "llm" in the `agent_list`
  1. You can also add data in data/avalon/dev.json (Note: Currently we only support the 5-player game setting, which includes 1 Merlin, 2 Servants, 1 Minion and 1 Assassin). A data item looks like this:
 {
     "num_players": 5,
     "quest_leader": 0,
     "role_names": ["Assassin", "Servant", "Servant", "Merlin", "Minion"]
 }

where quest_leader is the id of the initial quest leader in this game. You can change the game setup by altering quest_leader with number from 0 to 4, and by permuting role_names.

Naive experiment

You can also start a naive experiment using:

python -m src.start_task -a --start avalon-dev-naive 3

where all the agents are naive bots. For details of the naive strategies, please refer to the paper.

Prompts

All the prompts are maintained in src/server/tasks/avalon/prompt.py. You can find the respective prompts in src/server/tasks/avalon/agents/llm_with_discussion.py and src/server/tasks/avalon/wrapper.py.

Results

Results of single-setting games

{
    "total": 20,
    "validation": {
        "running": 0.0,
        "completed": 0.95,
        "agent context limit": 0.0,
        "agent validation failed": 0.05,
        "agent invalid action": 0.0,
        "task limit reached": 0.0,
        "unknown": 0.0,
        "task error": 0.0,
        "average_history_length": 11.0,
        "max_history_length": 14,
        "min_history_length": 2
    },
    "custom": {
        "Win rate of Player 0": 0.15,
        "Avg deduction acc of Player 0": 0.5399999999999998,
        "Valid number of games": 19
    }
}

@Longin-Yu Longin-Yu merged commit adc728e into THUDM:main Nov 7, 2023
@Xiao9905 Xiao9905 mentioned this pull request Nov 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants