Release v0.3.1 (Llama 3.2 Vision patch) · pytorch/torchtune

Overview

We've added full support for Llama 3.2 following its announcement, including full and LoRA fine-tuning on the Llama3.2-1B and Llama3.2-3B base and instruct text models and on the Llama3.2-11B-Vision base and instruct multimodal models. This means we now support the full end-to-end development of VLMs: fine-tuning, inference, and eval! We've also included a lot more goodies in a few short weeks:

Llama 3.2 1B/3B/11B Vision configs for full/LoRA fine-tuning
Updated recipes to support VLMs
Multimodal eval via EleutherAI
Support for torch.compile for VLMs
Revamped generation utilities for multimodal support + batched inference for text only
New knowledge distillation recipe with configs for Llama3.2 and Qwen2
Llama 3.1 405B QLoRA fine-tuning on 8xA100s
MPS support (beta) - you can now use torchtune on Mac!

New Features

Models

QLoRA with Llama 3.1 405B (#1232)
Llama 3.2 (#1679, #1688, #1661)

Multimodal

Update recipes for multimodal support (#1548, #1628)
Multimodal eval via EleutherAI (#1669, #1660)
Multimodal compile support (#1670)
Exportable multimodal models (#1541)

Generation

Revamped generate recipe with multimodal support (#1559, #1563, #1674, #1686)
Batched inference for text-only models (#1424, #1449, #1603, #1622)

Knowledge Distillation

Add single device KD recipe and configs for Llama 3.2, Qwen2 (#1539, #1690)
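
For readers new to knowledge distillation, the idea behind a recipe like this can be sketched in plain Python: the student is trained on a blend of the usual cross-entropy loss and a divergence between the teacher's and the student's token distributions. This is a generic illustration, not torchtune's implementation. The function names, the `kd_ratio` weighting, and the choice of forward KL are assumptions made for this sketch.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward_kl(teacher_logits, student_logits):
    """KL(teacher || student) for a single token position.

    Hypothetical helper for illustration only.
    """
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(
        pi * (math.log(pi) - math.log(qi))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def distillation_loss(ce_loss, kd_loss, kd_ratio=0.5):
    """Blend the cross-entropy loss with the distillation loss.

    The weighting scheme is an assumption: kd_ratio=0 is plain
    fine-tuning, kd_ratio=1 trains on the teacher signal only.
    """
    return (1 - kd_ratio) * ce_loss + kd_ratio * kd_loss


# Example: a student that disagrees with the teacher is penalized.
kd = forward_kl([0.0, 0.0, 5.0], [5.0, 0.0, 0.0])
loss = distillation_loss(ce_loss=2.0, kd_loss=kd, kd_ratio=0.5)
```

When the student's logits match the teacher's, the KL term is zero and only the cross-entropy term contributes.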

Memory and Performance

Compile FFT FSDP (#1573)
Apply rope on k earlier for efficiency (#1558)
Streaming offloading in (q)lora single device (#1443)

Quantization

Update quantization to use tensor subclasses (#1403)
Add int4 weight-only QAT flow targeting tinygemm kernel (#1570)

RLHF

Adding generic preference dataset builder (#1623)

Miscellaneous

Add drop_last to dataloader (#1654)
Add low_cpu_ram config to qlora (#1580)
MPS support (#1706)
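
As a rough illustration of what a `drop_last` option means for a dataloader, here is a minimal pure-Python sketch. It assumes the option follows the semantics of the PyTorch `DataLoader` flag of the same name, where a trailing batch smaller than the batch size is discarded. The `batches` helper is hypothetical and is not part of torchtune.

```python
def batches(items, batch_size, drop_last=False):
    """Yield lists of up to batch_size items.

    With drop_last=True, a final incomplete batch is discarded so that
    every yielded batch has exactly batch_size items.
    """
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Emit the leftover partial batch only when drop_last is off.
    if batch and not drop_last:
        yield batch


keep = list(batches(range(10), 4))                  # last batch has 2 items
drop = list(batches(range(10), 4, drop_last=True))  # partial batch removed
```

Dropping the last batch keeps batch shapes uniform across steps, at the cost of skipping a few samples per epoch.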

Documentation

nits in memory optimizations doc (#1585)
Tokenizer and prompt template docs (#1567)
Latexifying IPOLoss docs (#1589)
modules doc updates (#1588)
More doc nits (#1611)
update docs (#1602)
Update llama3 chat tutorial (#1608)
Instruct and chat datasets docs (#1571)
Preference dataset docs (#1636)
Messages and message transforms docs (#1574)
Readme Updates (#1664)
Model transform docs (#1665)
Multimodal dataset builder + docs (#1667)
Datasets overview docs (#1668)
Update README.md (#1676)
Readme updates for Llama 3.2 (#1680)
Add 3.2 models to README (#1683)
Knowledge distillation tutorial (#1698)
Text completion dataset docs (#1696)

Quality-of-Life Improvements

Set possible resolutions to debug, not info (#1560)
Remove TiedEmbeddingTransformerDecoder from Qwen (#1547)
Make Gemma use regular TransformerDecoder (#1553)
llama 3_1 instantiate pos embedding only once (#1554)
Run unit tests against PyTorch nightlies as part of our nightly CI (#1569)
Support load_dataset kwargs in other dataset builders (#1584)
add fused = true to adam, except pagedAdam (#1575)
Move RLHF out of modules (#1591)
Make logger only log on rank0 for Phi3 loading errors (#1599)
Move rlhf tests out of modules (#1592)
Update PR template (#1614)
Update get_unmasked_sequence_lengths example 4 release (#1613)
remove ipo loss + small fixes (#1615)
Fix dora configs (#1618)
Remove unused var in generate (#1612)
remove deprecated message (#1619)
Fix qwen2 config (#1620)
Proper names for dataset types (#1625)
Make q optional in sample (#1637)
Rename JSONToMessages to OpenAIToMessages (#1643)
update gemma to ignore gguf (#1655)
Add Pillow >= 9.4 requirement (#1671)
guard import (#1684)
add upgrade to pip command (#1687)
Do not run CI on forked repos (#1681)

Bug Fixes

Fix flex attention test (#1568)
Add eom_id to Llama3 Tokenizer (#1586)
Only merge model weights in LoRA recipe when save_adapter_weights_only=False (#1476)
Hotfix eval recipe (#1594)
Fix typo in PPO recipe (#1607)
Fix lora_dpo_distributed recipe (#1609)
Fixes for MM Masking and Collation (#1601)
delete duplicate LoRA dropout fields in DPO configs (#1583)
Fix tune download command in PPO config (#1593)
Fix tune run not identifying custom components (#1617)
Fix compile error in get_causal_mask_from_padding_mask (#1627)
Fix eval recipe bug for group tasks (#1642)
Fix basic tokenizer no special tokens (#1640)
add BlockMask to batch_to_device (#1651)
Fix PACK_TYPE import in collate (#1659)
Fix llava_instruct_dataset (#1658)
convert rgba to rgb (#1678)

New Contributors (auto-generated by GitHub)

@dvorjackz made their first contribution (#1558)

Full Changelog: v0.3.0...v0.3.1
