docs: Unify model examples to use trl-lib namespace #4431
Resolves #4385
- Replace edbeeching/gpt-neo-125M-imdb with trl-lib/Qwen2-0.5B-XPO in peft_integration.md
- Replace kashif/stack-llama-2 with trl-lib/Qwen2-0.5B-XPO in use_model.md (3 occurrences)
- All personal developer namespace models now use common trl-lib namespace
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
|
this one is missing: and all of these: trl/examples/scripts/evals/judge_tldr.py Lines 34 to 50 in 6f906d5
but they'll require training a model and pushing it to the trl-lib org
Address reviewer feedback by replacing trl-lib/Qwen2-0.5B-XPO with the official Qwen/Qwen2.5-0.5B model in all use_model.md examples.
Changes:
- Replace model references in 3 locations to use Qwen organization model
- More consistent with rest of TRL documentation
- Less misleading than custom trl-lib namespace model
Update all model references in use_model.md to use Qwen/Qwen3-0.6B as specifically requested by qgallouedec.
Changes:
- Replace Qwen/Qwen2.5-0.5B with Qwen/Qwen3-0.6B in all 3 locations
- Simpler model reference consistent with reviewer's suggestion
|
@qgallouedec I've addressed the comments: ✅ Completed:
⏳ Pending (blocked):
Should we:
|
Summary
Unifies model namespace usage in documentation examples to use the common trl-lib namespace as requested in issue #4385.
Resolves #4385
Changes
Files Modified:
- docs/source/peft_integration.md - 1 model reference updated
- docs/source/use_model.md - 3 model references updated
Replacements:
- edbeeching/gpt-neo-125M-imdb → trl-lib/Qwen2-0.5B-XPO
- kashif/stack-llama-2 → trl-lib/Qwen2-0.5B-XPO
All personal developer namespace models in documentation examples now use the unified trl-lib namespace. Official organization models (meta-llama, microsoft, google, etc.) and research project references (cleanrl) are intentionally preserved as they serve specific purposes.
Verification
- The replacement model (trl-lib/Qwen2-0.5B-XPO) is already widely used in TRL documentation
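For illustration only, the replacements listed under Changes could be applied to a doc file's text with a small helper like the one below. This is a minimal sketch, not the tooling used in this PR: the names REPLACEMENTS and unify_model_ids are hypothetical, and the mapping reflects the original PR description rather than the later commits that switched use_model.md to Qwen/Qwen3-0.6B.

```python
import re

# Mapping copied from the "Replacements" list in the PR description.
# Adjust the targets if a different model is agreed on during review.
REPLACEMENTS = {
    "edbeeching/gpt-neo-125M-imdb": "trl-lib/Qwen2-0.5B-XPO",
    "kashif/stack-llama-2": "trl-lib/Qwen2-0.5B-XPO",
}


def unify_model_ids(text, replacements=REPLACEMENTS):
    """Return (rewritten_text, number_of_substitutions)."""
    if not replacements:
        return text, 0
    # Escape each model id so "/" and "." are matched literally.
    pattern = re.compile("|".join(re.escape(old) for old in replacements))
    return pattern.subn(lambda m: replacements[m.group(0)], text)


# Hypothetical doc snippet, used only to exercise the helper.
sample = 'model = AutoModelForCausalLM.from_pretrained("kashif/stack-llama-2")'
new_text, count = unify_model_ids(sample)
print(new_text)
print(count)
```

The helper works on strings rather than files so that the same mapping can be checked against any doc page before anything is written to disk.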