[NAACL 2025] Multi-Agent Simulator Drives Language Models for Legal Intensive Interaction [Paper]

ShengbinYue*, Ting Huang*, Zheng Jia*, Siyuan Wang, Shujun Liu, Yun Song, Xuanjing Huang, Zhongyu Wei

The Multi-agent Legal Simulation Driver (MASER), a legal-specific simulator that serves as data-generation engine, empowering arbitrary LLMs with intensive interaction capabilities.

In this repository, we will release:

Multi-agent Legal Simulation Driver (MASER)
The constructed SynthLaw-4.5k Dataset.
Training scripts.
Multi-Stage Interactive Legal Evaluation (MILE) Benchmark.

MASER

Environment Setup

To set up your environment, run the following command:

pip install -r requirements.txt

Profile data Constrction

The agent profiles setup with Big-5 Personality Traits and Real Legal Source. You can follow this step to obtain the data by prompt the GPT-4o.

You can get the final processed data here

Run MASER

Navigate to the source directory:

cd ./src

Before running the script, open scripts/run.sh and enter your API keys for the required services. For instance:

For OpenAI Models (e.g., GPT-4): OPENAI_API_KEY="", OPENAI_API_BASE=""

Execute the script with:

bash scripts/run.sh

You can find the dialog history documents of LLMs featured at Dialog_History.

Training

Using MASER, we construct a high-quality synthetic legal scene dataset, SynthLaw-4.5k.

[
  {
    "id": 1,
    "system": "你是一位专业且经验丰富的律师...",
    "input": "请根据上述与用户的对话历史，参照给定的起...",
    "instruction": "请根据上述与用户的对话历史...",
    "history": [
      ["您好，我想写一份起诉状", "好的，没问题！我这边需要先问您一些问题，了解一下相关情况。"],
      ["当然可以，我已经准备好了。请问您想先了解哪个方面？", "我们先从您的基本信息开始吧。请问您的姓名、性别、出生日期、民族和地址是什么？"],
      ......
    ]
  }
]

To fine-tuning LLM on SynthLaw-4.5k, you can refer to LLaMA Factory. First, download LLaMA Factory and follow its instructions to install the required dependencies. Note that the training data should be processed in the Supervised Fine-Tuning Dataset format specified by the project.

cd train
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"

Take the Qwen2.5-7B-Instruct as example, use the following 2 commands to run LoRA fine-tuning and merging, respectively.

llamafactory-cli train ../qwen_lora_sft.yaml
llamafactory-cli export ../merge_qwen_lora.yaml

MILE Benchmark

Multi-Stage Interactive Legal Evaluation (MILE) introduces an approach for assessing the model’s ability to complete designated legal tasks in a dynamic environment. Leveraging powerful LLM to simulate the non-legal characters (i.e., Client), MILE thoroughly evaluates the performance of LLMs driven lawyer within this dynamic legal interaction environment. MILE is divided into two phases: interaction evaluation and goal evaluation.

Please see evaluation for details about the evaluation datasets and evaluation scripts.

Citation

We encourage the use of our code and data in your research and kindly request citation of our paper as follows:

@article{yue2025multi,
  title={Multi-Agent Simulator Drives Language Models for Legal Intensive Interaction},
  author={Yue, Shengbin and Huang, Ting and Jia, Zheng and Wang, Siyuan and Liu, Shujun and Song, Yun and Huang, Xuanjing and Wei, Zhongyu},
  journal={arXiv preprint arXiv:2502.06882},
  year={2025}
}

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
assets		assets
src		src
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

[NAACL 2025] Multi-Agent Simulator Drives Language Models for Legal Intensive Interaction [Paper]

Content

MASER

Environment Setup

Profile data Constrction

Run MASER

Training

MILE Benchmark

Citation

About

Releases

Packages

Contributors 2

Languages

FudanDISC/MASER

Folders and files

Latest commit

History

Repository files navigation

[NAACL 2025] Multi-Agent Simulator Drives Language Models for Legal Intensive Interaction [Paper]

Content

MASER

Environment Setup

Profile data Constrction

Run MASER

Training

MILE Benchmark

Citation

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages