Actions: kashif/trl

Tests

139 workflow runs

🚧 Add Optional ZeRO-3 Weight Gathering for GRPO in Sequence Generatio…
Tests #154: Commit af4ad47 pushed by kashif
February 5, 2025 09:12 33m 49s main
💔 Decouple loss computing and generation in GRPO (#2762)
Tests #153: Commit 1f344c9 pushed by kashif
February 4, 2025 14:26 29m 18s main
🥞 Fix KTO gradient accumulation loss scaling (#2648)
Tests #152: Commit 6f99f42 pushed by kashif
January 24, 2025 16:55 27m 42s main
💾 Reduce memory peak in GRPO by adding max_prompt_length and loop u…
Tests #151: Commit b6a084c pushed by kashif
January 21, 2025 15:38 25m 35s main
🧰 Tool fine-tuning support DPO (#2479)
Tests #150: Commit d9f0568 pushed by kashif
January 21, 2025 08:25 23m 55s main
[RLOO] fix token_level_kl (#2575)
Tests #149: Commit 1b1140a pushed by kashif
January 17, 2025 15:24 23m 37s main
✨ Refine model card method docstring (#2566)
Tests #148: Commit 57d9a97 pushed by kashif
January 16, 2025 14:01 25m 39s main
[RLOO] Reinforce++ (#2552)
Tests #147: Commit edabe0a pushed by kashif
January 9, 2025 11:37 23m 3s main
💔 Fix dataset type unpair conversion docs (#2550)
Tests #146: Commit abfffc5 pushed by kashif
January 8, 2025 19:26 23m 28s main
January 1, 2025 10:48 23m 53s
🗂️ Reorganize documentation (#2483)
Tests #144: Commit 9908dda pushed by kashif
December 19, 2024 09:06 22m 21s main
🏞️ Proper dataset for documentation images (#2499)
Tests #143: Commit 5e204e1 pushed by kashif
December 18, 2024 15:04 22m 54s main
📥 Fix missing BitsAndBytesConfig import in doc (#2478)
Tests #142: Commit 117c6d4 pushed by kashif
December 15, 2024 16:11 23m 43s main
☄️ Add support for Comet experiment management SDK integration (#2462)
Tests #141: Commit 6d4ed07 pushed by kashif
December 15, 2024 09:34 22m 49s main
Update modeling_base.py (#2419)
Tests #140: Commit 148b592 pushed by kashif
December 12, 2024 10:21 22m 9s main
🗝️ Update type hints (#2399)
Tests #139: Commit c10cc89 pushed by kashif
November 27, 2024 10:08 21m 13s main
🐢 Fix slow tests (#2397)
Tests #138: Commit 9368dcc pushed by kashif
November 26, 2024 16:45 20m 57s main
🧳 Move zen generation script and fix tests (#2393)
Tests #137: Commit 43df3a4 pushed by kashif
November 26, 2024 13:13 22m 6s main
Update log method to include start_time parameter (#2381)
Tests #136: Commit 672c965 pushed by kashif
November 22, 2024 09:17 25m 10s main
Fix dev install (#2369)
Tests #135: Commit 066fc37 pushed by kashif
November 19, 2024 17:37 6h 0m 25s main
📉 Add PEFT support for PPOTrainer (#2344)
Tests #134: Commit 1293f37 pushed by kashif
November 18, 2024 11:05 25m 39s main
🔮 Inference mode in GeometricMixtureWrapper.forward (#2345)
Tests #133: Commit 21d5baf pushed by kashif
November 18, 2024 09:21 22m 25s main
⚖️ Add use_soft_judge option to WinRateCallback (#2347)
Tests #132: Commit b8c9d9c pushed by kashif
November 18, 2024 08:41 23m 21s main
DPO trainer supports num_logits_to_keep to save memory (#2129)
Tests #131: Commit 0238d96 pushed by kashif
November 11, 2024 12:09 22m 45s main
🧮 Fix the computation of KL divergence loss (#2277)
Tests #130: Commit ea7a1be pushed by kashif
October 26, 2024 19:02 19m 53s main