
fix fsdp_auto_wrap_policy #2167

Merged (3 commits) on Oct 22, 2024

Conversation

@eljandoubi (Contributor) commented Oct 19, 2024

Fixes the issue where fsdp_auto_wrap_policy fails when both the FSDP_TRANSFORMER_CLS_TO_WRAP environment variable and the model's _no_split_modules attribute are None.
@BenjaminBossan @sayakpaul

Fixes #2166
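For context, the resolution order this fix targets can be sketched in plain Python. This is a hypothetical helper, not PEFT's actual code (the real logic lives in peft.utils.other.fsdp_auto_wrap_policy); it only illustrates the lookup order and the graceful fallback:

```python
import os

def get_transformer_cls_names(model, env_var="FSDP_TRANSFORMER_CLS_TO_WRAP"):
    """Hypothetical helper mirroring the resolution order described above:
    prefer the environment variable, then the model's _no_split_modules,
    then fall back to an empty list instead of raising (the failure this
    PR fixes)."""
    names = os.environ.get(env_var)
    if names is not None:
        return [name.strip() for name in names.split(",")]
    no_split = getattr(model, "_no_split_modules", None)
    if no_split is not None:
        return list(no_split)
    # Before the fix, reaching this point raised an exception even though
    # a default policy would still be usable.
    return []
```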

@BenjaminBossan (Member)

Thanks for working on this fix. Could you give an example of a model that would currently fail but work with this fix? Ideally, we can build a unit test based on this.

@eljandoubi (Contributor, Author)

  1. Donut
  2. Pix2Struct

@BenjaminBossan (Member)

I could not replicate for Donut:

    >>> from peft.utils.other import fsdp_auto_wrap_policy
    >>> model = DonutSwinPreTrainedModel.from_pretrained('naver-clova-ix/donut-base')
    >>> model._no_split_modules
    ['DonutSwinStage']
    >>> fsdp_auto_wrap_policy(model)  # works

For Pix2Struct, I do get:

    >>> model = Pix2StructForConditionalGeneration.from_pretrained("google/pix2struct-base")
    >>> model._no_split_modules
    None
    >>> fsdp_auto_wrap_policy(model)
    Exception: Could not find the transformer layer class to wrap in the model.

but of course I don't need to use PEFT's fsdp_auto_wrap_policy for FSDP training, it's just there to help users. Do you have a use case where you can't switch to another auto wrap policy?
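As a side note on what "another auto wrap policy" means here: FSDP accepts any callable with the signature (module, recurse, nonwrapped_numel) -> bool. Below is a minimal pure-Python sketch of such a policy (no torch import; the real helper is torch.distributed.fsdp.wrap.transformer_auto_wrap_policy, and the layer class name used here is only an example):

```python
import functools

def transformer_wrap_policy(module, recurse, nonwrapped_numel, transformer_cls_names=()):
    # FSDP queries the policy in two modes: with recurse=True to ask
    # whether to descend into a module's children, and with recurse=False
    # to ask whether to wrap this module itself.
    if recurse:
        return True
    return type(module).__name__ in transformer_cls_names

# Bind the layer classes up front, since FSDP expects a callable taking
# only (module, recurse, nonwrapped_numel).
policy = functools.partial(
    transformer_wrap_policy, transformer_cls_names=("Pix2StructVisionLayer",)
)
```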

@eljandoubi (Contributor, Author)

In the Transformers Trainer class, fsdp_auto_wrap_policy is called in _fsdp_qlora_plugin_updates, which is applied automatically when training a PeftModel in FSDP mode.

@BenjaminBossan (Member)

Thanks a lot for the pointer, it makes sense that Trainer should still work in that case.

Let's add a small test to ensure that the function does not fail in such cases. We already have a test class here:

    class TestFSDPWrap:

We can just add the new test in there. As to the model, I think it's sufficient to just create a custom model and check that calling fsdp_auto_wrap_policy does not raise an error. So something like:

    def test_fsdp_auto_wrap_policy_does_not_raise_on_custom_model(self):
        # See #2167
        # Avoid raising on custom models since Trainer uses fsdp_auto_wrap_policy automatically for PEFT + FSDP
        class MyModule(nn.Module):
            def __init__(self):
                super().__init__()
                self.lin = nn.Linear(2, 3)

        fsdp_auto_wrap_policy(MyModule())  # does not raise

@eljandoubi (Contributor, Author)

You already have a toy model called SimpleModel. I am using it for testing.

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan (Member) left a review


Thanks for investigating the issue and providing a fix. LGTM.

> You already have a toy model called SimpleModel. I am using it for testing.

Good catch.

@BenjaminBossan BenjaminBossan merged commit 7717550 into huggingface:main Oct 22, 2024
14 checks passed