Improving Attributed Text Generation of Large Language Models via Preference Learning

Overview

Automatic Preference Optimization(APO) framework includes an automatic preference construction procedure and a fine-grained preference optimization procedure, which is used to handle attributed text generation tasks.

Dataset

The dataset is available here.

Environment

conda create -n attri_trl python=3.10
conda activate attri_trl
pip install -r requirements.txt

Step1: Post-training

Using data from post_training.

The training scripts is provided in post_training/scripts.

finetune.sh is used to train base model.

finetune_ablation.sh is used to perform continue post-training after the post-training precedure having finished (ablation study).

Step2: Perference Optimization

Using data from preference_optimization.jsonl

The training scripts is provided in po/scripts.

Step3: Evaluation

We use the ALCE official evaluation scripts alongwith custom prompts. The evaluation scripts can be found in evaluation_alce.

Citation

@inproceedings{DBLP:conf/acl/LiSHLH0Z24,
  author       = {Dongfang Li and
                  Zetian Sun and
                  Baotian Hu and
                  Zhenyu Liu and
                  Xinshuo Hu and
                  Xuebo Liu and
                  Min Zhang},
  editor       = {Lun{-}Wei Ku and
                  Andre Martins and
                  Vivek Srikumar},
  title        = {Improving Attributed Text Generation of Large Language Models via
                  Preference Learning},
  booktitle    = {Findings of the Association for Computational Linguistics, {ACL} 2024,
                  Bangkok, Thailand and virtual meeting, August 11-16, 2024},
  pages        = {5079--5101},
  publisher    = {Association for Computational Linguistics},
  year         = {2024},
  url          = {https://aclanthology.org/2024.findings-acl.301},
  timestamp    = {Tue, 27 Aug 2024 17:38:11 +0200},
  biburl       = {https://dblp.org/rec/conf/acl/LiSHLH0Z24.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Improving Attributed Text Generation of Large Language Models via Preference Learning

Overview

Dataset

Environment

Step1: Post-training

Step2: Perference Optimization

Step3: Evaluation

Citation

About

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
evaluation_alce		evaluation_alce
po		po
post_training		post_training
readme.md		readme.md
requirements.txt		requirements.txt

HITsz-TMG/ATG-PO

Folders and files

Latest commit

History

Repository files navigation

Improving Attributed Text Generation of Large Language Models via Preference Learning

Overview

Dataset

Environment

Step1: Post-training

Step2: Perference Optimization

Step3: Evaluation

Citation

About

Resources

Stars

Watchers

Forks

Languages