Issues: huggingface/trl

Issues list

Training Agents with GRPO 🏋 GRPO Related to GRPO
#2723 opened Jan 31, 2025 by August-murr
OOM for 7B model on A100 80Gb 🐛 bug Something isn't working
#2719 opened Jan 31, 2025 by JohnConnor123
5 tasks done
GRPO for RL on agent trajectories 🏋 GRPO Related to GRPO 🏋 Reward Related to Reward modelling
#2715 opened Jan 31, 2025 by korbinian-hoermann
GRPO with tool calling 🏋 GRPO Related to GRPO 🏋 Reward Related to Reward modelling
#2712 opened Jan 31, 2025 by accupham
3 tasks
LoRA 'trainable params: 0' 🐛 bug Something isn't working ⚡ PEFT Related to PEFT
#2711 opened Jan 31, 2025 by shannonruxin
Examples in training VDPO on llava1.6 🏋 DPO Related to DPO ✨ enhancement New feature or request
#2710 opened Jan 31, 2025 by lucasjinreal
GRPO memory bottleneck from num_generations in compute_loss 🐛 bug Something isn't working 🏋 GRPO Related to GRPO ⚡ PEFT Related to PEFT
#2709 opened Jan 31, 2025 by willccbb
PPOTrainer + LoRA and Continued Training ⏳ needs more info Additional information or clarification is required to proceed ⚡ PEFT Related to PEFT 🏋 PPO Related to PPO
#2707 opened Jan 30, 2025 by kooryan
Multi-GPU sampling for vLLM in GRPO Trainer ✨ enhancement New feature or request 🏋 GRPO Related to GRPO
#2706 opened Jan 30, 2025 by nch0w
GRPO: Why does loss start at 0 for first K steps and then increase over time? 🏋 GRPO Related to GRPO ❓ question Seeking clarification or more information
#2703 opened Jan 30, 2025 by arnavgarg1
5 tasks done
Exposing GenerationConfig in the GRPO Trainer ✨ enhancement New feature or request 🏋 GRPO Related to GRPO
#2702 opened Jan 30, 2025 by Superskyyy
Allow pretokenized dataset in GRPO Trainer ✨ enhancement New feature or request 🏋 GRPO Related to GRPO
#2701 opened Jan 30, 2025 by Superskyyy
GRPO VLLM does not work with Lora 🏋 GRPO Related to GRPO ⚡ PEFT Related to PEFT
#2698 opened Jan 30, 2025 by gagan3012
5 tasks done
I cannot launch PPOTrainning script with accelerate launch ⚡accelerate Related to accelerate ⚡ PEFT Related to PEFT 🏋 PPO Related to PPO
#2696 opened Jan 30, 2025 by daehuikim
5 tasks done
OOM 8xH100 using latest GRPO code with vLLM 🐛 bug Something isn't working 🚀 deepspeed Related to deepspeed 🏋 GRPO Related to GRPO
#2688 opened Jan 30, 2025 by abacaj
5 tasks done
empty Cache after logps_per_token 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#2686 opened Jan 29, 2025 by shirinyamani
rewards_funcs set to eval mode 🏋 GRPO Related to GRPO ❓ question Seeking clarification or more information 🏋 Reward Related to Reward modelling
#2685 opened Jan 29, 2025 by shirinyamani
Support iterative GRPO ✨ enhancement New feature or request 🏋 GRPO Related to GRPO ⚡ PEFT Related to PEFT
#2684 opened Jan 29, 2025 by howardzhou
About the Implementation of GRPO 🏋 GRPO Related to GRPO ❓ question Seeking clarification or more information
#2681 opened Jan 29, 2025 by macheng6
Ability to provide a static completion for GRPO ✨ enhancement New feature or request 🏋 GRPO Related to GRPO
#2680 opened Jan 29, 2025 by Palmik
TypeError: type list doesn't define __round__ method - why I am getting this error 🐛 bug Something isn't working ⏳ needs more info Additional information or clarification is required to proceed 🏋 Reward Related to Reward modelling
#2674 opened Jan 28, 2025 by Tarak200
"None of the inputs have requires_grad=True" with online DPO and GRPO 🐛 bug Something isn't working 🏋 GRPO Related to GRPO 🏋 Online DPO Related to Online DPO
#2671 opened Jan 28, 2025 by benjamin-marie
5 tasks done