Apply suggestions from code review

hmellor · web-flow · commit 56bebc561b41 · 2025-09-26T20:24:07.000+01:00
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
diff --git a/docs/training/trl.md b/docs/training/trl.md
@@ -21,10 +21,9 @@ To enable vLLM in TRL, set the `use_vllm` flag in the trainer configuration to `
 
 Some trainers also support **vLLM sleep mode**, which offloads parameters and caches to CPU RAM during training, helping reduce GPU memory usage. Learn more in the [memory optimization docs](https://huggingface.co/docs/trl/main/en/reducing_memory_usage#vllm-sleep-mode).
 
-
-
 !!! info
     For more information on the `use_vllm` flag you can provide to the configs of these online methods, see:
+
     - [`trl.GRPOConfig.use_vllm`](https://huggingface.co/docs/trl/main/en/grpo_trainer#trl.GRPOConfig.use_vllm)
     - [`trl.OnlineDPOConfig.use_vllm`](https://huggingface.co/docs/trl/main/en/online_dpo_trainer#trl.OnlineDPOConfig.use_vllm)
     - [`trl.RLOOConfig.use_vllm`](https://huggingface.co/docs/trl/main/en/rloo_trainer#trl.RLOOConfig.use_vllm)