v0.3.1 (Llama 3.2 Vision patch)

Released by @RdoubleA on 02 Oct 21:26 · 118 commits to main since this release

Overview

We've added full support for Llama 3.2 following its announcement. This includes full and LoRA fine-tuning on the Llama3.2-1B and Llama3.2-3B base and instruct text models, and on the Llama3.2-11B-Vision base and instruct models. This means we now support the full end-to-end development of VLMs: fine-tuning, inference, and eval! We've also included a lot more goodies in a few short weeks:

  • Llama 3.2 1B/3B/11B Vision configs for full/LoRA fine-tuning
  • Updated recipes to support VLMs
  • Multimodal eval via EleutherAI
  • Support for torch.compile for VLMs
  • Revamped generation utilities for multimodal support + batched inference for text only
  • New knowledge distillation recipe with configs for Llama3.2 and Qwen2
  • Llama 3.1 405B QLoRA fine-tuning on 8xA100s
  • MPS support (beta) - you can now use torchtune on Mac!

New Features

Models

Multimodal

  • Update recipes for multimodal support (#1548, #1628)
  • Multimodal eval via EleutherAI (#1669, #1660)
  • Multimodal compile support (#1670)
  • Exportable multimodal models (#1541)

Generation

Knowledge Distillation

  • Add single device KD recipe and configs for Llama 3.2, Qwen2 (#1539, #1690)
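
For readers new to knowledge distillation, here is a minimal, dependency-free sketch of the kind of objective a KD recipe optimizes: a forward KL term between teacher and student token distributions, blended with the usual cross-entropy loss. The function names and the `kd_ratio` weighting are assumptions for illustration; the actual recipe's loss and config fields may differ.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def forward_kl(student_logits, teacher_logits):
    """KL(teacher || student) for a single token position."""
    s_logp = log_softmax(student_logits)
    t_logp = log_softmax(teacher_logits)
    return sum(math.exp(t) * (t - s) for t, s in zip(t_logp, s_logp))


def distillation_loss(ce_loss, kl_loss, kd_ratio=0.5):
    """Blend cross-entropy and KD terms (kd_ratio is a hypothetical knob)."""
    return (1 - kd_ratio) * ce_loss + kd_ratio * kl_loss
```

When the student matches the teacher exactly, the KL term is zero and only the cross-entropy term contributes.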

Memory and Performance

  • Compile FFT FSDP (#1573)
  • Apply rope on k earlier for efficiency (#1558)
  • Streaming offloading in (q)lora single device (#1443)

Quantization

  • Update quantization to use tensor subclasses (#1403)
  • Add int4 weight-only QAT flow targeting tinygemm kernel (#1570)
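
As background on what quantization-aware training simulates, the sketch below shows "fake quantization": weights are rounded to a 4-bit grid and immediately dequantized, so training sees the rounding error. This is a simplified symmetric, per-group scheme in plain Python chosen for brevity; it is an assumption for illustration and not the scheme used by the flow above or by the tinygemm kernel.

```python
def fake_quantize_int4(weights, group_size=4):
    """Quantize to the int4 range [-8, 7] per group, then dequantize."""
    out = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # One scale per group; fall back to 1.0 for an all-zero group.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        for w in group:
            q = max(-8, min(7, round(w / scale)))
            out.append(q * scale)
    return out
```

Smaller groups track the weight distribution more closely at the cost of storing more scales.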

RLHF

  • Adding generic preference dataset builder (#1623)

Miscellaneous

  • Add drop_last to dataloader (#1654)
  • Add low_cpu_ram config to qlora (#1580)
  • MPS support (#1706)
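
To illustrate what a `drop_last` option on a dataloader does, here is a small plain-Python sketch of the batching semantics (the same behavior as the `drop_last` argument of PyTorch's `DataLoader`). The `batches` helper is hypothetical and exists only for this example.

```python
def batches(samples, batch_size, drop_last=False):
    """Yield consecutive batches; optionally skip a short final batch."""
    for start in range(0, len(samples), batch_size):
        batch = samples[start:start + batch_size]
        if drop_last and len(batch) < batch_size:
            return  # discard the incomplete trailing batch
        yield batch
```

Dropping the last batch keeps every step the same shape, which can help with compiled models and consistent gradient scaling.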

Documentation

  • nits in memory optimizations doc (#1585)
  • Tokenizer and prompt template docs (#1567)
  • Latexifying IPOLoss docs (#1589)
  • modules doc updates (#1588)
  • More doc nits (#1611)
  • update docs (#1602)
  • Update llama3 chat tutorial (#1608)
  • Instruct and chat datasets docs (#1571)
  • Preference dataset docs (#1636)
  • Messages and message transforms docs (#1574)
  • Readme Updates (#1664)
  • Model transform docs (#1665)
  • Multimodal dataset builder + docs (#1667)
  • Datasets overview docs (#1668)
  • Update README.md (#1676)
  • Readme updates for Llama 3.2 (#1680)
  • Add 3.2 models to README (#1683)
  • Knowledge distillation tutorial (#1698)
  • Text completion dataset docs (#1696)

Quality-of-Life Improvements

  • Set possible resolutions to debug, not info (#1560)
  • Remove TiedEmbeddingTransformerDecoder from Qwen (#1547)
  • Make Gemma use regular TransformerDecoder (#1553)
  • llama 3_1 instantiate pos embedding only once (#1554)
  • Run unit tests against PyTorch nightlies as part of our nightly CI (#1569)
  • Support load_dataset kwargs in other dataset builders (#1584)
  • add fused = true to adam, except pagedAdam (#1575)
  • Move RLHF out of modules (#1591)
  • Make logger only log on rank0 for Phi3 loading errors (#1599)
  • Move rlhf tests out of modules (#1592)
  • Update PR template (#1614)
  • Update get_unmasked_sequence_lengths example for release (#1613)
  • remove ipo loss + small fixes (#1615)
  • Fix dora configs (#1618)
  • Remove unused var in generate (#1612)
  • remove deprecated message (#1619)
  • Fix qwen2 config (#1620)
  • Proper names for dataset types (#1625)
  • Make q optional in sample (#1637)
  • Rename JSONToMessages to OpenAIToMessages (#1643)
  • update gemma to ignore gguf (#1655)
  • Add Pillow >= 9.4 requirement (#1671)
  • guard import (#1684)
  • add upgrade to pip command (#1687)
  • Do not run CI on forked repos (#1681)

Bug Fixes

  • Fix flex attention test (#1568)
  • Add eom_id to Llama3 Tokenizer (#1586)
  • Only merge model weights in LoRA recipe when save_adapter_weights_only=False (#1476)
  • Hotfix eval recipe (#1594)
  • Fix typo in PPO recipe (#1607)
  • Fix lora_dpo_distributed recipe (#1609)
  • Fixes for MM Masking and Collation (#1601)
  • delete duplicate LoRA dropout fields in DPO configs (#1583)
  • Fix tune download command in PPO config (#1593)
  • Fix tune run not identifying custom components (#1617)
  • Fix compile error in get_causal_mask_from_padding_mask (#1627)
  • Fix eval recipe bug for group tasks (#1642)
  • Fix basic tokenizer no special tokens (#1640)
  • add BlockMask to batch_to_device (#1651)
  • Fix PACK_TYPE import in collate (#1659)
  • Fix llava_instruct_dataset (#1658)
  • convert rgba to rgb (#1678)
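
The `save_adapter_weights_only` fix above concerns when LoRA weights get folded back into the base model. As a rough sketch of the idea (not the recipe's actual checkpointing code), merging computes W + (alpha / rank) * B @ A, and that step is skipped when only adapter weights are saved. The helper names and checkpoint layout below are hypothetical.

```python
def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * B @ A using nested-list matrices.

    weight: out_dim x in_dim, lora_b: out_dim x rank, lora_a: rank x in_dim.
    """
    scale = alpha / rank
    merged = [row[:] for row in weight]  # copy so the input is untouched
    for i in range(len(weight)):
        for j in range(len(weight[0])):
            delta = sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
            merged[i][j] += scale * delta
    return merged


def build_checkpoint(weight, lora_a, lora_b, alpha, rank,
                     save_adapter_weights_only):
    """Always save the adapter; merge into the base weights only if asked."""
    ckpt = {"adapter": {"lora_a": lora_a, "lora_b": lora_b}}
    if not save_adapter_weights_only:
        ckpt["model"] = merge_lora(weight, lora_a, lora_b, alpha, rank)
    return ckpt
```

Skipping the merge avoids unnecessary work when the adapter is all that will be written to disk.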

Full Changelog: v0.3.0...v0.3.1