Releases: VectorInstitute/vectorlm
VectorLM 0.1.2
This release implements Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning (PEFT) method, for FSDP-sharded models and adds utilities for reporting the training throughput of the fine-tuning pipeline.
- LoRA fine-tuning of LLMs such as Mixtral 8x7B on 4x A100 80GB through FSDP model parallelism: Torch FSDP shards the model weights across GPUs because the model does not fit on a single GPU, while LoRA PEFT greatly reduces the GPU memory required for storing optimizer states, since only the small adapter matrices are trained. To enable FSDP-LoRA, uncomment the LoRA-PEFT section of config.yaml. Refer to the Memory & Compute documentation for more details.
- Benchmarking tools and reference throughput table: To help researchers estimate the resources required for their experiments, we provide reference training token throughput for LLM fine-tuning on the Vector Vaughan cluster for a number of models, covering both PEFT LoRA and full-rank fine-tuning. We also provide benchmarking tools for testing throughput on other models and environments.
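To make the optimizer-state saving concrete, here is a rough back-of-the-envelope sketch for a single linear layer. It assumes an Adam-style optimizer that keeps two fp32 values per trainable parameter. The layer size (4096 x 4096) and LoRA rank (8) are illustrative values, not VectorLM defaults, and the function names are hypothetical rather than part of VectorLM.

```python
def full_rank_trainable_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight is updated."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for LoRA adapters A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def adam_state_bytes(num_trainable: int, bytes_per_value: int = 4) -> int:
    """Assumed optimizer footprint: two moment estimates per trainable parameter."""
    return 2 * num_trainable * bytes_per_value


if __name__ == "__main__":
    d_in, d_out, rank = 4096, 4096, 8  # illustrative sizes only
    full = full_rank_trainable_params(d_in, d_out)
    lora = lora_trainable_params(d_in, d_out, rank)
    print(f"full-rank: {full} params, {adam_state_bytes(full)} optimizer bytes")
    print(f"LoRA:      {lora} params, {adam_state_bytes(lora)} optimizer bytes")
    print(f"reduction: {full // lora}x")
```

For these illustrative numbers the adapter has 256x fewer trainable parameters than the full-rank layer, and the optimizer state shrinks by the same factor. Actual savings depend on the model, the chosen rank, and which layers receive adapters.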
VectorLM v0.1.1
This release fixes a bug in state checkpointing and adds a few features.
- Previously, a model being trained with Hybrid FSDP could not be checkpointed. Checkpointing now uses torch's distributed checkpointing submodule.
- We have enabled forward prefetching of weights in FSDP by default to maximize the overlap between communication and computation.
- We have also added an option for low CPU memory usage (under Memory & Compute) while loading large models. With this option, the model weights are loaded into CPU memory only once, on the main rank, and are then scattered to the other ranks.
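The effect of the low CPU memory option can be estimated with simple arithmetic. The sketch below assumes that, without the option, every rank on a node loads its own full copy of the weights into host memory, and it ignores transient buffers. The helper name and the example numbers (a 7-billion-parameter model in 2-byte precision, 4 ranks per node) are hypothetical and only for illustration.

```python
def peak_cpu_bytes(
    num_params: int,
    bytes_per_param: int,
    ranks_per_node: int,
    load_on_main_rank_only: bool,
) -> int:
    """Approximate peak host memory used for model weights on the main rank's node."""
    full_copy = num_params * bytes_per_param
    # One copy if only the main rank loads; otherwise one copy per local rank.
    copies = 1 if load_on_main_rank_only else ranks_per_node
    return full_copy * copies


if __name__ == "__main__":
    num_params = 7_000_000_000  # hypothetical 7B-parameter model
    naive = peak_cpu_bytes(num_params, 2, 4, load_on_main_rank_only=False)
    low_mem = peak_cpu_bytes(num_params, 2, 4, load_on_main_rank_only=True)
    print(f"every rank loads:     {naive / 1e9:.0f} GB")
    print(f"main rank loads once: {low_mem / 1e9:.0f} GB")
```

Under these assumptions the host memory needed for weights drops from one copy per local rank to a single copy, which matters most when several GPUs share one node's CPU memory.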
VectorLM v0.1.0
This release officially puts out our lightweight package designed to train medium-sized LLMs in communication-bottlenecked environments!
- Support for dense fine-tuning of LLMs in single/multi-node settings.
- Added out-of-the-box support for several optimization techniques.
- Added example scripts for LLaMa model fine-tuning.
Going forward, we will also add support for distributed fine-tuning of models using PEFT methods!