Support for freezing pretrained vision model layers with regex #3981
Conversation
This reverts commit 5df7362.
Unit Test Results: 6 files ±0, 6 suites ±0, 18m 36s ⏱️ +4m 15s. Results for commit 4e62b92. ± Comparison against base commit 4b07ce4. This pull request skips 2 tests.
♻️ This comment has been updated with latest results.
ludwig/cli.py
Outdated
```python
from ludwig.utils import pretrained_summary

pretrained_summary.cli_summarize_pretrained(sys.argv[2:])
```
It might be good to add some example runs and outputs to the docs.
@ethanreidel To second @skanjila -- would it be possible to show an example of running this command, how it differs from the existing one, and an example of its output? Thank you very much.
Certainly.
Qq: when you say you'd like an example, do you mean an example in the Ludwig docs, or how would you prefer it?
@ethanreidel One option is to create an example in the `examples/` top-level directory in Ludwig.
ludwig/utils/pretrained_summary.py
Outdated
```python
from ludwig.contrib import add_contrib_callback_args
from ludwig.globals import LUDWIG_VERSION
from ludwig.utils.print_utils import print_ludwig
```
Wait, are we really supporting all of these models? I thought we were just going to go out the door with a couple of models to start.
For this specific feature (simple regex freezing), as long as you have access to the string representation of the layers and the actual model architecture, you can freeze any layers you'd like. It wasn't any extra work to support all torchvision models beyond adding them to this list. I don't like the look of this long model array, though.
@ethanreidel While it looks like torchvision will be supported, what about text/LLMs? (This is related to my previous comment in the Trainers section.) Thanks!
> For this specific feature (simple regex freezing), as long as you have access to the string representation of layers + actual model architecture, you can freeze any layers that you'd like. It wasn't any extra work adding support for all torchvision models besides adding to this list. I however don't like the look of this long model array though
@ethanreidel Sorry, could you please point me to this "long model array"? Which line in your code has it? Thanks!
For your first question, Alex: as long as access to the model layers and their `requires_grad` parameters is available, this feature should in theory work on LLMs/text. I'm not too familiar with LLM architecture and will have to do some quick checks, but I'm 99% sure it is an easy addition. Second question: in a previous commit, I had a pretty hacky solution where users had another command-line option (under `pretrained_summary`) that would list all available model names. Those names were stored in a Python list, which had a few issues, namely having to expand it regularly and many lines of unnecessary code. Saad made a good point and said to remove it entirely (it was not needed), so it's no longer there.
Just checked, and sure enough you can apply the same regex freezing technique to an LLM.
> Just checked and sure enough you can apply the same regex freezing technique to an LLM
@ethanreidel That's awesome. Maybe we can then use one of my earlier comments and only add this parameter to the `ECDTrainerConfig` and `FineTuneTrainerConfig` for now.
As part of the examples you have, it would be good to create 2 example Python files:
- To show how to use it with a computer vision model
- To show how to use it with an LLM base model
What do you think?
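To make the idea concrete, here is a minimal stand-alone sketch of the regex-freezing technique under discussion. This is not the PR's actual implementation: the `_Param` class is a stand-in for `torch.nn.Parameter` so the sketch runs without torch, and the layer names are hypothetical ones resembling a torchvision ResNet.

```python
import re


class _Param:
    """Stand-in for torch.nn.Parameter (assumption: the real code uses torch)."""

    def __init__(self):
        self.requires_grad = True


def freeze_layers_regex(pattern_str, named_parameters):
    """Freeze every parameter whose name matches the regex; return the frozen names."""
    pattern = re.compile(pattern_str)
    frozen = set()
    for name, p in named_parameters:
        if pattern.search(name):
            p.requires_grad = False
            frozen.add(name)
    return frozen


# Hypothetical layer names resembling a torchvision ResNet:
params = {n: _Param() for n in ["conv1.weight", "layer1.0.conv1.weight", "fc.weight"]}
frozen = freeze_layers_regex(r"^(conv1|layer1)", params.items())
# conv1.* and layer1.* are frozen; fc.weight stays trainable
```

Because the technique only relies on `named_parameters()` and `requires_grad`, the same sketch applies unchanged to an LLM's parameter names (e.g. attention blocks) as to a vision model's.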
ludwig/utils/trainer_utils.py
Outdated
```python
def freeze_layers_regex(config: "BaseTrainerConfig", model: ECD) -> None:
    """Freezes layers based on provided regular expression."""
```
Let's add comments documenting the inputs/outputs as well.
@ethanreidel I think that if you put `from __future__ import annotations` as the very first line in the module, you would not need to quote the types. Would you like to give it a try and see if it works? Thanks!
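A quick illustration of this suggestion: with the future import, annotations are stored as strings and never evaluated at definition time, so unquoted forward references work even for names (like `BaseTrainerConfig` and `ECD` below) that are deliberately left undefined here.

```python
from __future__ import annotations


# BaseTrainerConfig and ECD are not defined in this module, yet the
# unquoted annotations are fine because they are never evaluated:
def freeze_layers_regex(config: BaseTrainerConfig, model: ECD) -> None:
    """Freezes layers based on provided regular expression."""


# The annotations are kept as plain strings:
print(freeze_layers_regex.__annotations__["config"])  # prints BaseTrainerConfig
```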
I tested the annotations import, and it worked, but the git pre-commit hook was forcing changes (e.g., converting all uppercase `Dict`s to lowercase `dict`s) that I didn't like.
@ethanreidel I think that makes sense.
Are you able to expand the docstring itself for this function? Also, if it also supports LLM, can we make `model` a union of `ECD` and `LLM`?
ludwig/schema/trainer.py
Outdated
```python
layers_to_freeze_regex: str = schema_utils.String(
    default=None,
    allow_none=True,
    description=(
        "Freeze specific layers based on provided regex. Freezing specific layers can improve a "
        "pretrained model's performance in a number of ways. At a basic level, freezing early layers can "
        "prevent overfitting by retaining more general features (beneficial for small datasets). Also can "
        "reduce computational resource use and lower overall training time due to less gradient calculations. "
    ),
)
```
Instead of putting this in the base trainer config, what if we put it in the `ECDTrainerConfig`? For now, that will be good enough to ensure that this only works for ECD-supported models and is not a valid argument/parameter for GBMs and LLMs. If it also works for LLMs, then we can also duplicate it in `FineTuneTrainerConfig`, which is used by LLMs. It also means we don't have to modify any other trainers for now.
ludwig/utils/pretrained_summary.py
Outdated
```python
model = encoder_class()

for name, _ in model.named_parameters():
    print(name)
```
We generally don't like to use prints in Ludwig code. Can we use `logger.info()` instead?

```python
import logging

logger = logging.getLogger(__name__)
logger.info("message")
```
ludwig/utils/trainer_utils.py
Outdated
```python
    pattern = re.compile(config.layers_to_freeze_regex)
except re.error:
    logger.error(f"Invalid regex input: {config.layers_to_freeze_regex}")
    exit()
```
Instead of `exit()`, let's raise a `RuntimeError` with the same message.
In fact, here's a thought: we can move this check earlier in the code path, to config validation time. Specifically, you can create a `__post_init__()` hook for `ECDTrainerConfig` and `FineTuneTrainerConfig` that tries `re.compile()` and, if it fails, throws a `ConfigValidationError` with the error message. That way, we don't have to wait for all of preprocessing etc. to be done before catching this error.
Here's an example explaining the same idea in a different part of the Ludwig codepath: https://github.com/ludwig-ai/ludwig/blob/master/ludwig/schema/llms/peft.py#L443
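A rough sketch of the validation-time check being suggested. This uses a plain dataclass and a stand-in `ConfigValidationError` rather than Ludwig's actual schema classes, both of which are assumptions for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


class ConfigValidationError(ValueError):
    """Stand-in for Ludwig's ConfigValidationError."""


@dataclass
class TrainerConfigSketch:
    layers_to_freeze_regex: Optional[str] = None

    def __post_init__(self):
        # Fail fast at config-validation time instead of after preprocessing.
        if self.layers_to_freeze_regex is not None:
            try:
                re.compile(self.layers_to_freeze_regex)
            except re.error as e:
                raise ConfigValidationError(
                    f"Invalid regex input: {self.layers_to_freeze_regex}"
                ) from e


TrainerConfigSketch(layers_to_freeze_regex=r"^layer1")  # valid regex: constructs fine
# TrainerConfigSketch(layers_to_freeze_regex="(")       # raises ConfigValidationError
```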
ludwig/utils/trainer_utils.py
Outdated
```python
matched = False
for name, p in model.named_parameters():
    if re.search(pattern, str(name)):
        p.requires_grad = False
        matched = True
if not matched:
    logger.error(f"No regex match for {config.layers_to_freeze_regex}! Check layer names and regex syntax.")
```
Two thoughts here:
- Instead of `logger.error`, perhaps we can do a `logger.warning()`? I think it's okay if there are no matches, but we want to surface that as a warning so users can notice it, as opposed to calling it an error (which it sort of is).
- One thing that could be very useful here is to also create a set of all the layers where the regex search actually returns true and `requires_grad` gets set to false, and then log that full list! Observability can be super helpful.
Overall, really nice work and clean implementation @ethanreidel! I left a few suggestions that might help simplify edge cases as well as considerations to add more observability into which layers are frozen.
I would also recommend installing
@ethanreidel Let us know when this is ready for re-review!
Hi @arnavgarg1! Thanks for checking in. The PR is good for re-review. 👍
LGTM! @ethanreidel -- thank you for this amazing and very useful contribution!
Allows the user to input a regular expression in the YAML config which freezes specific layers of a pretrained model. Adds a new CLI option, `pretrained_summary`, to let users access string representations of model layers for freezing via regex. Currently all pretrained torchvision models are accessible.
```yaml
trainer:
  layers_to_freeze_regex: (regex here)
```

```shell
ludwig pretrained_summary -m (model name here)
```
(I am aware that the `collect_summary` CLI command is similar; however, it only accepts a preexisting directory, so I thought creating a separate command that strictly outputs layer names was appropriate for this feature.)
Closes #3733
Future plans: expand this capability to implement gradual unfreezing.
Test: `pytest tests/ludwig/modules/test_regex_freezing.py`