
Releases: mosaicml/llm-foundry

v0.14.5

18 Nov 17:15
  • Move transform_model_pre_registration in hf_checkpointer (#1664)

Full Changelog: v0.14.4...v0.14.5

v0.14.4

07 Nov 20:42
  • Add max shard size to transformers save_pretrained by @b-chu in #1648

Full Changelog: v0.14.3...v0.14.4

v0.14.3

05 Nov 15:41

Full Changelog: v0.14.2...v0.14.3

v0.14.2

04 Nov 02:14

Bug Fixes

Move loss generating token counting to the dataloader (#1632)

Fixes a throughput regression due to #1610, which was released in v0.14.0

What's Changed

  • Move loss generating token counting to the dataloader by @dakinggg in #1632

Full Changelog: v0.14.1...v0.14.2

v0.14.1

01 Nov 23:55

New Features

Use log_model for registering models (#1544)

Instead of calling the MLflow register API directly, we now use the intended log_model API, which both logs the model to the MLflow run artifacts and registers it to Unity Catalog.

Full Changelog: v0.14.0...v0.14.1

v0.14.0

28 Oct 22:41

New Features

Load Checkpoint Callback (#1570)

We added support for Composer's LoadCheckpoint callback, which loads a checkpoint at a specified event. This enables use cases like loading base model weights with PEFT.

callbacks:
    load_checkpoint:
        load_path: /path/to/your/weights

Breaking Changes

Accumulate over Tokens in a Batch for Training Loss (#1618, #1610, #1595)

We added a new flag, accumulate_train_batch_on_tokens, which specifies whether training loss is accumulated over the number of tokens in a batch rather than the number of samples. It defaults to true, which will slightly change loss curves for models trained with padding. The old behavior can be recovered by explicitly setting the flag to false.
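
As a hedged sketch (the key name comes from the linked PRs; its placement as a top-level entry in the train YAML is assumed), recovering the old sample-based accumulation would look like:

# Assumed top-level entry in the train config; reverts to the pre-v0.14.0
# behavior of accumulating training loss over samples instead of tokens.
accumulate_train_batch_on_tokens: false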

Default Run Name (#1611)

If no run name is provided, we now default to Composer's randomly generated run names. (Previously, we defaulted to "llm" as the run name.)
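
If you want a stable, explicit name rather than the generated default, you can still set one in the train config (top-level placement assumed here):

# Assumed top-level entry; overrides the randomly generated default run name.
run_name: my-training-run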

Full Changelog: v0.13.0...v0.14.0

v0.13.1

18 Oct 16:50

🚀 LLM Foundry v0.13.1

What's Changed

  • Add configurability to HF checkpointer timeout by @dakinggg in #1599

Full Changelog: v0.13.0...v0.13.1

v0.13.0

15 Oct 06:23

🚀 LLM Foundry v0.13.0

🛠️ Bug Fixes & Cleanup

PyTorch 2.4 Checkpointing (#1569, #1581, #1583)

Resolved issues related to checkpointing for Curriculum Learning (CL) callbacks.

🔧 Dependency Updates

  • Bumped tiktoken from 0.4.0 to 0.8.0 (#1572)
  • Updated onnxruntime from 1.19.0 to 1.19.2 (#1590)

Full Changelog: v0.12.0...v0.13.0

v0.12.0

26 Sep 03:52

🚀 LLM Foundry v0.12.0

New Features

PyTorch 2.4 (#1505)

This release updates LLM Foundry to the PyTorch 2.4 release, bringing support for its new features and optimizations.

Extensibility improvements (#1450, #1449, #1468, #1467, #1478, #1493, #1495, #1511, #1512, #1527)

Numerous improvements to the extensibility of the modeling and data loading code, enabling easier reuse for subclassing and extending. Please see the linked PRs for more details on each change.

Improved error messages (#1457, #1459, #1519, #1518, #1522, #1534, #1548, #1551)

Various error messages were improved, making user errors clearer to debug.

Sliding window in torch attention (#1455)

We've added support for sliding window attention to the reference attention implementation, allowing easier testing and comparison against more optimized attention variants.
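
As a rough config sketch (the sliding_window_size key is assumed from MPT's attn_config conventions; see #1455 for the exact name and defaults), enabling a sliding window with the torch implementation might look like:

model:
    name: mpt_causal_lm
    attn_config:
      attn_impl: torch             # reference attention implementation
      sliding_window_size: 1024    # assumed key; window width in tokens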

Bug fixes

Extra BOS token for llama 3.1 with completion data (#1476)

A bug caused an extra BOS token to be added between the prompt and response during finetuning. This has been fixed so that the user-supplied prompt and response are concatenated without any extra tokens between them.

Full Changelog: v0.11.0...v0.12.0

v0.11.0

13 Aug 17:16

🚀 LLM Foundry v0.11.0

New Features

LLM Foundry CLI Commands (#1337, #1345, #1348, #1354)

We've added CLI commands for our commonly used scripts.

For example, instead of calling composer llm-foundry/scripts/train.py parameters.yaml, you can now do composer -c llm-foundry train parameters.yaml.

Docker Images Contain All Optional Dependencies (#1431)

LLM Foundry Docker images now have all optional dependencies.

Support for Llama3 Rope Scaling (#1391)

To use it, you can add the following to your parameters:

model:
    name: mpt_causal_lm
    attn_config:
      rope: true
      ...
      rope_impl: hf
      rope_theta: 500000
      rope_hf_config:
        type: llama3
        ...

Tokenizer Registry (#1386)

We now have a tokenizer registry so you can easily add custom tokenizers.

LoadPlanner and SavePlanner Registries (#1358)

We now have LoadPlanner and SavePlanner registries so you can easily add custom checkpoint loading and saving logic.

Faster Auto-packing (#1435)

The auto packing startup is now much faster. To use auto packing with finetuning datasets, you can add packing_ratio: auto to your config like so:

  train_loader:
    name: finetuning
    dataset:
      ...
      packing_ratio: auto
