Rewrite `peft_integration.md`

This section of the documentation is largely outdated and relies only on PPO.

Ideally, we should have clear documentation that shows how to use PEFT with at least SFT, DPO, and GRPO via the `peft_config` argument. We could also add subsections on QLoRA and prompt tuning.
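
As a starting point for the SFT part, here is a rough sketch of the kind of snippet the new page could contain. It assumes a recent TRL release where `SFTTrainer` accepts a model ID string, and the helper name `build_sft_trainer`, the LoRA hyperparameters, and the model and dataset IDs in the usage comment are illustrative assumptions, not recommendations.

```python
# Illustrative LoRA hyperparameters (assumed values, not tuned recommendations).
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": "all-linear",
    "task_type": "CAUSAL_LM",
}


def build_sft_trainer(model_id, dataset_id, output_dir):
    """Hypothetical helper: build an SFTTrainer that trains LoRA adapters only."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from datasets import load_dataset
    from peft import LoraConfig
    from trl import SFTConfig, SFTTrainer

    peft_config = LoraConfig(**LORA_KWARGS)
    train_dataset = load_dataset(dataset_id, split="train")
    return SFTTrainer(
        model=model_id,
        args=SFTConfig(output_dir=output_dir),
        train_dataset=train_dataset,
        peft_config=peft_config,  # the trainer wraps the base model with the adapter
    )


# Example usage (model and dataset IDs are assumptions; downloads both):
# trainer = build_sft_trainer("Qwen/Qwen2.5-0.5B", "trl-lib/Capybara", "Qwen2.5-0.5B-SFT-LoRA")
# trainer.train()
```

The DPO and GRPO sections could probably follow the same pattern, since `DPOTrainer` and `GRPOTrainer` also take a `peft_config` argument; only the config class, dataset format, and (for GRPO) reward functions would differ.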