🗂 Update paper_index section #3937

behroozazarkhalili · 2025-08-21T21:37:06Z

Summary

This PR updates the paper index documentation to include additional research papers and their corresponding implementation configurations within the TRL package.

Changes

- Added new paper entries with proper references and configuration examples
- Included implementation details and parameter settings for reproducibility
- Enhanced documentation to help users understand and reproduce training configurations

This update expands the paper index to provide better coverage of research implementations available in TRL, making it easier for users to find and use relevant configurations for their research and experiments.

qgallouedec · 2025-08-22T02:37:45Z

Can you also mention that we don't support dynamic sampling?


behroozazarkhalili · 2025-08-22T13:21:40Z

> Can you also mention that we don't support dynamic sampling

Done. I also added `mask_truncated_completions=True` to the configuration.
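For readers following the thread, the DAPO settings discussed here could be sketched roughly as below. This is an illustration, not the text merged into `docs/source/paper_index.md`. The keyword names `epsilon`, `epsilon_high`, and `mask_truncated_completions` are assumed to match `trl.GRPOConfig` as of this PR, and the two clipping values are assumed from the DAPO paper's Clip-Higher setup. Only `mask_truncated_completions=True` is confirmed by the comment above.

```python
# Hedged sketch: keyword names are assumed to match trl.GRPOConfig.
dapo_kwargs = {
    # Clip-Higher: decoupled lower/upper clipping ranges (assumed values).
    "epsilon": 0.2,
    "epsilon_high": 0.28,
    # Overlong filtering: exclude truncated completions from the loss,
    # as added in this PR.
    "mask_truncated_completions": True,
}
# Dynamic sampling is intentionally absent: TRL does not support it.

# With trl installed, the dict would be passed along these lines:
#   from trl import GRPOConfig
#   training_args = GRPOConfig(**dapo_kwargs)
print(dapo_kwargs)
```

Keeping the settings in a plain dict lets the sketch run without `trl` installed. The commented lines show where the real config object would be built.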


qgallouedec

it's looking gooooood

HuggingFaceDocBuilderDev · 2025-08-22T19:07:12Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Co-authored-by: behroozazarkhalili <ermiaazarkhalili>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

behroozazarkhalili force-pushed the update-paper-index branch from 455f543 to b9859c6 on August 21, 2025 23:24

Update paper_index section with DAPO entry

7c4665a

- Added DAPO (An Open-Source LLM Reinforcement Learning System at Scale) section
- Includes proper paper reference and implementation details
- Added training configuration parameters from DAPO paper section 4.1

behroozazarkhalili force-pushed the update-paper-index branch from b9859c6 to 7c4665a on August 22, 2025 13:19

behroozazarkhalili and others added 3 commits August 22, 2025 06:36

Add Dr. GRPO section to paper index

98efe1a

- Added Dr. GRPO configuration example with training parameters
- Includes paper reference and implementation details from training section
- Added parameters: loss_type, batch_size, num_generations, prompt/completion lengths, beta
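The parameters listed in this commit message might map onto a configuration like the one below. This is a sketch under stated assumptions: the keyword names are assumed to match `trl.GRPOConfig` at the time of this PR, and every numeric value is an illustrative placeholder rather than the value merged into the docs. Only the parameter categories (loss type, batch size, number of generations, prompt/completion lengths, beta) come from the commit message.

```python
# Hedged sketch: names assumed to match trl.GRPOConfig; numeric values are
# illustrative placeholders, NOT the values from the merged documentation.
dr_grpo_kwargs = {
    "loss_type": "dr_grpo",             # Dr. GRPO loss normalization
    "per_device_train_batch_size": 8,   # placeholder batch size
    "num_generations": 8,               # completions sampled per prompt (placeholder)
    "max_prompt_length": 1024,          # placeholder prompt length
    "max_completion_length": 1024,      # placeholder completion length
    "beta": 0.0,                        # KL coefficient; 0.0 disables the KL term
}

# Sanity check for this sketch (single device assumed): the batch should
# split evenly into groups of `num_generations` completions.
batch = dr_grpo_kwargs["per_device_train_batch_size"]
group = dr_grpo_kwargs["num_generations"]
assert batch % group == 0, "batch size should be a multiple of num_generations"

# With trl installed:
#   from trl import GRPOConfig
#   training_args = GRPOConfig(**dr_grpo_kwargs)
print(dr_grpo_kwargs)
```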

reorder

7a11d81

style

1f99446

qgallouedec reviewed Aug 22, 2025

docs/source/paper_index.md (resolved)

qgallouedec and others added 3 commits August 22, 2025 17:39

style

12aca2a

Merge branch 'main' of https://github.com/huggingface/trl into update-paper-index

b2d31f5

Add Soft Overlong Punishment configuration example to DAPO section

22ab9d1
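For context, the Soft Overlong Punishment from the DAPO paper is a length-based reward term: zero up to a soft threshold, a linearly growing penalty inside a buffer below the maximum length, and -1 beyond the maximum. The function below is a minimal standalone sketch of that piecewise rule, not TRL's implementation and not the example added in this commit. The function name and argument names are hypothetical, and the lengths used in the usage lines are assumed from the paper's setup.

```python
def soft_overlong_punishment(completion_len: int, max_len: int, cache_len: int) -> float:
    """Length penalty in [-1, 0] following the DAPO piecewise definition.

    Hypothetical helper for illustration; names are not taken from TRL.
    """
    if not 0 < cache_len <= max_len:
        raise ValueError("cache_len must be in (0, max_len]")
    soft_start = max_len - cache_len  # lengths up to here are not penalized
    if completion_len <= soft_start:
        return 0.0
    if completion_len <= max_len:
        # Linear ramp from 0 (at soft_start) down to -1 (at max_len).
        return (soft_start - completion_len) / cache_len
    return -1.0


# Usage with assumed lengths: 20480 max tokens, 4096-token punishment buffer.
print(soft_overlong_punishment(16384, 20480, 4096))
print(soft_overlong_punishment(18432, 20480, 4096))
print(soft_overlong_punishment(30000, 20480, 4096))
```

In training, a term like this would be added to the task reward, so overly long completions are penalized gradually rather than only at the truncation point.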

qgallouedec reviewed Aug 22, 2025

docs/source/paper_index.md (outdated, resolved)

qgallouedec reviewed Aug 22, 2025

docs/source/paper_index.md (outdated, resolved)

behroozazarkhalili and others added 3 commits August 22, 2025 11:51

Add DPO (Direct Preference Optimization) section to paper index

bc337d0

Update docs/source/paper_index.md

0a5583a

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

Update docs/source/paper_index.md

d9b40e8

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

qgallouedec approved these changes Aug 22, 2025

qgallouedec changed the title ~~Update paper_index section~~ 🗂 Update paper_index section Aug 22, 2025

qgallouedec merged commit 181a841 into huggingface:main Aug 22, 2025
1 check passed

qgallouedec mentioned this pull request Oct 30, 2025

Complete paper index #4407

Open

54 tasks
