Issues: huggingface/trl
[Tracking issue] Integrate native liger-kernel losses
#2495, Open, opened Dec 17, 2024 by qgallouedec

[Tracking issue] Wrong loss scaling when accumulating gradient
#2617, Open, opened Jan 23, 2025 by qgallouedec

OOM for 7B model on A100 80Gb (🐛 bug)
#2719, opened Jan 31, 2025 by JohnConnor123

AttributeError: 'AutoModelForCausalLMWithValueHead' object has no attribute 'base_model_prefix' (🐛 bug, ⚡ PEFT, 🏋 PPO)
#2718, opened Jan 31, 2025 by Tarak200

GRPO for RL on agent trajectories (🏋 GRPO, 🏋 Reward)
#2715, opened Jan 31, 2025 by korbinian-hoermann

Isn't the reward *minimized* when len(completion)==20 if this is the reward function? (🏋 Reward)
#2714, opened Jan 31, 2025 by cfpark00

GRPO with tool calling (🏋 GRPO, 🏋 Reward)
#2712, opened Jan 31, 2025 by accupham

LoRA 'trainable params: 0' (🐛 bug, ⚡ PEFT)
#2711, opened Jan 31, 2025 by shannonruxin

Examples in training VDPO on llava1.6 (🏋 DPO, ✨ enhancement)
#2710, opened Jan 31, 2025 by lucasjinreal

PPOTrainer + LoRA and Continued Training (⏳ needs more info, ⚡ PEFT, 🏋 PPO)
#2707, opened Jan 30, 2025 by kooryan

Multi-GPU sampling for vLLM in GRPO Trainer (✨ enhancement, 🏋 GRPO)
#2706, opened Jan 30, 2025 by nch0w

GRPO: Why does loss start at 0 for first K steps and then increase over time? (🏋 GRPO, ❓ question)
#2703, opened Jan 30, 2025 by arnavgarg1

Exposing GenerationConfig in the GRPO Trainer (✨ enhancement, 🏋 GRPO)
#2702, opened Jan 30, 2025 by Superskyyy

Allow pretokenized dataset in GRPO Trainer (✨ enhancement, 🏋 GRPO)
#2701, opened Jan 30, 2025 by Superskyyy

GRPO VLLM does not work with Lora (🏋 GRPO, ⚡ PEFT)
#2698, opened Jan 30, 2025 by gagan3012

I cannot launch PPOTrainning script with accelerate launch (⚡ accelerate, ⚡ PEFT, 🏋 PPO)
#2696, opened Jan 30, 2025 by daehuikim

OOM 8xH100 using latest GRPO code with vLLM (🐛 bug, 🚀 deepspeed, 🏋 GRPO)
#2688, opened Jan 30, 2025 by abacaj

empty Cache after logps_per_token (🐛 bug, 🏋 GRPO)
#2686, opened Jan 29, 2025 by shirinyamani

rewards_funcs set to eval mode (🏋 GRPO, ❓ question, 🏋 Reward)
#2685, opened Jan 29, 2025 by shirinyamani

Support iterative GRPO (✨ enhancement, 🏋 GRPO, ⚡ PEFT)
#2684, opened Jan 29, 2025 by howardzhou

About the Implementation of GRPO (🏋 GRPO, ❓ question)
#2681, opened Jan 29, 2025 by macheng6

Ability to provide a static completion for GRPO (✨ enhancement, 🏋 GRPO)
#2680, opened Jan 29, 2025 by Palmik

logging issue: generation_config in rloo_trainer.py's generate_completions() is not reflective of actual model generations (🐛 bug, 🏋 RLOO)
#2678, opened Jan 28, 2025 by swkarlekar

TypeError: type list doesn't define __round__ method - why I am getting this error (🐛 bug, ⏳ needs more info, 🏋 Reward)
#2674, opened Jan 28, 2025 by Tarak200

"None of the inputs have requires_grad=True" with online DPO and GRPO
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
🏋 Online DPO
Related to Online DPO
#2671
opened Jan 28, 2025 by
benjamin-marie
5 tasks done