Commit 43b6541
Support completion bootstrap for VLM in GRPO/RLOO (#4452)
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>1 parent 642b721 commit 43b6541
File tree
4 files changed
+12
-0
lines changed- trl
- experimental
- gfpo
- grpo_with_replay_buffer
- trainer
4 files changed
+12
-0
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
206 | 206 | | |
207 | 207 | | |
208 | 208 | | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
209 | 212 | | |
210 | 213 | | |
211 | 214 | | |
| |||
Lines changed: 3 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
210 | 210 | | |
211 | 211 | | |
212 | 212 | | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
213 | 216 | | |
214 | 217 | | |
215 | 218 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1575 | 1575 | | |
1576 | 1576 | | |
1577 | 1577 | | |
| 1578 | + | |
| 1579 | + | |
| 1580 | + | |
1578 | 1581 | | |
1579 | 1582 | | |
1580 | 1583 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1342 | 1342 | | |
1343 | 1343 | | |
1344 | 1344 | | |
| 1345 | + | |
| 1346 | + | |
| 1347 | + | |
1345 | 1348 | | |
1346 | 1349 | | |
1347 | 1350 | | |
| |||
0 commit comments