From 25efd829620689b6418bcf1cd7809cca2372b8c9 Mon Sep 17 00:00:00 2001
From: Susant
Date: Mon, 3 Nov 2025 12:28:16 +0530
Subject: [PATCH 1/2] docs: add KTO (2402.01306) to Paper Index + link ref to KTOTrainer

---
 docs/source/paper_index.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/docs/source/paper_index.md b/docs/source/paper_index.md
index 8e140630f62..2083a7d737a 100644
--- a/docs/source/paper_index.md
+++ b/docs/source/paper_index.md
@@ -457,6 +457,19 @@ training_args = DPOConfig(
 
 These parameters only appear in the [published version](https://aclanthology.org/2025.tacl-1.22.pdf)
 
+## Kahneman–Tversky Optimization
+
+Papers relating to the [`KTOTrainer`]
+
+### KTO: Model Alignment as Prospect Theoretic Optimization
+
+**📜 Paper**: https://huggingface.co/papers/2402.01306
+
+KTO derives an alignment objective from prospect theory and learns directly from **binary** human feedback (liked/disliked), matching or surpassing DPO-style methods while handling imbalanced/noisy signals well.
+**Used in TRL via:** [`KTOTrainer`]
+
+
+
 ## Supervised Fine-Tuning
 
 Papers relating to the [`SFTTrainer`]

From 7e787bd5579455154da5b94c2ec8cd67ba4e533c Mon Sep 17 00:00:00 2001
From: Quentin Gallouédec
Date: Sat, 22 Nov 2025 20:45:12 +0000
Subject: [PATCH 2/2] code example

---
 docs/source/paper_index.md | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/docs/source/paper_index.md b/docs/source/paper_index.md
index 2083a7d737a..7246cde568d 100644
--- a/docs/source/paper_index.md
+++ b/docs/source/paper_index.md
@@ -465,10 +465,27 @@ Papers relating to the [`KTOTrainer`]
 
 **📜 Paper**: https://huggingface.co/papers/2402.01306
 
-KTO derives an alignment objective from prospect theory and learns directly from **binary** human feedback (liked/disliked), matching or surpassing DPO-style methods while handling imbalanced/noisy signals well.
-**Used in TRL via:** [`KTOTrainer`]
+KTO derives an alignment objective from prospect theory and learns directly from **binary** human feedback (liked/disliked), matching or surpassing DPO-style methods while handling imbalanced/noisy signals well.
+
+To reproduce the paper's setting, you can use the default configuration of [`KTOTrainer`]:
+
+```python
+from trl import KTOConfig, KTOTrainer
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_id = "Qwen/Qwen2.5-0.5B-Instruct"  # example checkpoint; any causal LM works
+model = AutoModelForCausalLM.from_pretrained(model_id)
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+trainer = KTOTrainer(
+    model=model,
+    processing_class=tokenizer,
+    args=KTOConfig(),
+    train_dataset=...,  # an unpaired prompt/completion/label dataset
+)
+trainer.train()
+```
 
 
 
 ## Supervised Fine-Tuning
 
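
A note on the `train_dataset=...` placeholder in the patch above: [`KTOTrainer`] consumes *unpaired* feedback, so each training row is a single prompt/completion pair with a binary label rather than a chosen/rejected pair. A minimal sketch of that format (the rows below are illustrative, not from the paper or the patch):

```python
from datasets import Dataset

# Unpaired binary feedback: one completion per row, with a label marking it
# as desirable (True) or undesirable (False).
train_dataset = Dataset.from_list(
    [
        {"prompt": "What color is the sky?", "completion": "It is blue.", "label": True},
        {"prompt": "What color is the sky?", "completion": "It is green.", "label": False},
    ]
)
```

Because liked and disliked completions never need to be matched per prompt, this is what lets KTO absorb the imbalanced or noisy feedback the description mentions.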