Extendability refactors #1290
Conversation
A few comments.
Looks good pending the GPU test fix:
```
tests/models/layers/test_dmoe.py:135: in test_dmoe
    expert_parallel_group = device_mesh['expert_parallel'].get_group(0)
/usr/lib/python3/dist-packages/composer/trainer/_patch_pytorch.py:1041: in device_mesh__getitem__
    submesh = _mesh_resources.create_child_mesh(self, mesh_dim_names)
E   NameError: name '_mesh_resources' is not defined
```
@milocress the GPU test is unrelated. It will be fixed by the next composer release (which is why that test isn't marked as required yet).
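For context on the failing line: the test slices a 2-D PyTorch `DeviceMesh` by dimension name and asks the resulting 1-D submesh for its process group, which routes through the `DeviceMesh.__getitem__` that composer's `_patch_pytorch.py` monkeypatches. A minimal sketch of that pattern, assuming a 4-GPU 2x2 mesh (the shape and the `weight_parallel` name are illustrative assumptions; `test_dmoe.py` builds its own mesh):

```python
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh

# Minimal sketch, assuming torch.distributed is already initialized
# (e.g. under torchrun with world_size == 4). The 2x2 shape and the
# 'weight_parallel' dim name are illustrative assumptions; only
# 'expert_parallel' appears in the failing test.
device_mesh = init_device_mesh(
    "cuda",
    (2, 2),
    mesh_dim_names=("weight_parallel", "expert_parallel"),
)

# Slicing the mesh by dim name goes through DeviceMesh.__getitem__,
# which composer's patch intercepts; the submesh's get_group() then
# returns this rank's expert-parallel ProcessGroup.
expert_parallel_group = device_mesh["expert_parallel"].get_group()
print(dist.get_rank(group=expert_parallel_group))
```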
This PR includes a few changes for increased extendability of the code (a sketch of the resulting override pattern follows the list):
- `slice_attention_mask`
- to `MPTBlock`
- `configuration_mpt.py`
- just for HF checkpointing (`MPTModel`)
- `TrainConfig`
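To illustrate the kind of extendability these refactors aim at, here is a minimal, self-contained sketch of the hook pattern implied by the fragments above: attention-mask slicing lives in a standalone `slice_attention_mask` function and is exposed as an overridable method on the block, so a subclass can change it without reimplementing `forward`. All class names, signatures, and the toy "attention" below are illustrative assumptions, not llm-foundry's actual API.

```python
import torch
import torch.nn as nn

def slice_attention_mask(attention_mask: torch.Tensor, seq_len: int) -> torch.Tensor:
    # Hypothetical default: keep only the last seq_len positions of the mask.
    return attention_mask[..., -seq_len:]

class Block(nn.Module):
    # Toy stand-in for an MPTBlock-style transformer block; the hook is
    # a method so subclasses can change slicing without touching forward().
    def slice_attention_mask(self, attention_mask: torch.Tensor, seq_len: int) -> torch.Tensor:
        return slice_attention_mask(attention_mask, seq_len)

    def forward(self, x: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        mask = self.slice_attention_mask(attention_mask, x.shape[1])
        # Stand-in "attention": zero out masked positions.
        return x * mask.unsqueeze(-1).to(x.dtype)

class MyBlock(Block):
    # An extension only needs to override the hook, not the forward pass.
    def slice_attention_mask(self, attention_mask: torch.Tensor, seq_len: int) -> torch.Tensor:
        return attention_mask[..., :seq_len]  # e.g., take the prefix instead

x = torch.randn(2, 4, 8)
mask = torch.ones(2, 6, dtype=torch.bool)
print(Block()(x, mask).shape, MyBlock()(x, mask).shape)
```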
Loss before and after: