This GitHub repository contains several examples of fine-tuning open-source large language models. It demonstrates how to fine-tune and quantize these models using parameter-efficient fine-tuning (PEFT) techniques such as LoRA and QLoRA.
Reference: Hugging Face Transformers quantization documentation (offloading between CPU and GPU), https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu
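
To give a rough sense of why LoRA and QLoRA are attractive, the sketch below does the back-of-the-envelope arithmetic in plain Python. It is not taken from the repository's examples: the function names, the 4096 x 4096 layer size, the rank of 8, and the 7-billion-parameter model size are all illustrative assumptions. It assumes the standard LoRA formulation, in which a frozen weight matrix of shape `d_out x d_in` is adapted by two trainable low-rank factors, and it estimates weight storage only (ignoring quantization constants, activations, and optimizer state).

```python
def full_params(d_out: int, d_in: int) -> int:
    """Number of weights in a dense d_out x d_in layer (all trained in full fine-tuning)."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable weights in the two LoRA factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def weight_memory_gb(n_params: int, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (10^9 bytes); ignores quantization overhead."""
    return n_params * bits_per_weight / 8 / 1e9


# Hypothetical layer shape and LoRA rank, chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 8
full = full_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)
ratio = lora / full
print(f"Full fine-tuning: {full:,} trainable weights in this layer")
print(f"LoRA (r={rank}): {lora:,} trainable weights ({ratio:.2%} of full)")

# Hypothetical 7B-parameter model: 16-bit weights versus 4-bit quantized weights.
n_params = 7_000_000_000
print(f"16-bit weights: ~{weight_memory_gb(n_params, 16):.1f} GB")
print(f"4-bit weights:  ~{weight_memory_gb(n_params, 4):.1f} GB")
```

In practice the real savings depend on which layers are adapted, the chosen rank, and the quantization scheme, so these numbers should be read as order-of-magnitude estimates only.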