Minimal Implementation for Llama2 Inference and LoRA Fine-Tuning

This repository provides a simple, minimal implementation of inference and Low-Rank Adaptation (LoRA) fine-tuning for Llama2-7B models (requires roughly 40 GB of GPU memory). It is designed with minimal dependencies (only torch and sentencepiece) for a straightforward setup.
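The core idea of LoRA is to freeze the pretrained weights and train only a small low-rank update added to selected linear layers. As a rough sketch of the technique (a hypothetical, simplified layer, not the repository's actual implementation, which lives under the llama folder):

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update (illustrative sketch)."""

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        # Pretrained weight: kept frozen during fine-tuning.
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        # Low-rank factors A (r x in) and B (out x r): the only trainable parameters.
        # B starts at zero so the update contributes nothing before training.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = nn.functional.linear(x, self.weight)
        update = nn.functional.linear(nn.functional.linear(x, self.lora_A), self.lora_B)
        return base + self.scaling * update
```

With rank r much smaller than the layer width, the trainable parameter count drops from out_features x in_features to r x (in_features + out_features), which is what makes fine-tuning a 7B model tractable on a single GPU.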

Download Model and Tokenizer

Install Required Dependencies

pip install torch sentencepiece

Run Inference

python inference.py --tokenizer_path /path_to/tokenizer.model --model_path /path_to/consolidated.00.pth

Run LoRA Fine-tuning

We use the Alpaca dataset with only 200 samples for quick experimentation. The LoRA implementation is under the llama folder.

python finetune.py --tokenizer_path /path_to/tokenizer.model --model_path /path_to/consolidated.00.pth --data_path alpaca_data_200_samples.json
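For reference, the public Alpaca dataset stores each example as a JSON object with instruction, input, and output fields; the bundled alpaca_data_200_samples.json is assumed to follow the same schema (the record below is hypothetical):

```python
import json

# Hypothetical record illustrating the Alpaca instruction-tuning schema.
sample = {
    "instruction": "Give three tips for staying healthy.",
    "input": "",  # optional context; empty for instruction-only examples
    "output": "1. Eat a balanced diet. 2. Exercise regularly. 3. Sleep well.",
}

# The fine-tuning script would load a list of such records from --data_path.
records = json.loads(json.dumps([sample]))
assert set(records[0]) == {"instruction", "input", "output"}
```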

Reference
