
enable param group configuration in llm-foundry #760

Merged: 11 commits into mosaicml:main on Nov 29, 2023

Conversation

@vchiley (Contributor) commented Nov 22, 2023

This PR enables param group configuration in llm-foundry.

The optimizer_config defines the optimizer args.
This PR allows the user to additionally include the key disable_grad, which is a string or list of strings. If a string matches a parameter name, that parameter has requires_grad set to False. This is useful for freezing parameters.
This PR also allows the user to specify the key param_groups, which is a list of dicts. In each dict, the key param_str_match defines a string; if a parameter name contains this string, the parameter is placed in that parameter group. This is useful for grouping parameters together. The dict can also contain any other key that is a valid optimizer arg.
Note: to handle name-overlap conflicts, parameters are assigned to parameter groups, and added to param_groups, in the order that the param_str_match entries appear in param_groups.

Param name comparisons are done using RegEx search.
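
For illustration, the configured string is treated as a regex pattern and tested against each parameter name with a plain regex search, so "norm" matches any name containing that substring (the parameter names below are illustrative, not taken from a specific model):

    import re

    # "norm" is used as a regex pattern; re.search looks for it anywhere in the name.
    print(re.search('norm', 'transformer.blocks.0.norm_1.weight'))    # <re.Match ...>
    print(re.search('norm', 'transformer.blocks.0.attn.Wqkv.weight')) # None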

Usage
To disable gradients for all parameters whose names contain the string "norm" or "bias":

    optimizer_config: {
        "name": "decoupled_lionw",
        "lr": 1e-3,
        "weight_decay": 1e-2,
        "betas": [0.9, 0.999],
        "eps": 1e-8,
        "disable_grad": ["norm", "bias"]
    }

or in YAML as:

optimizer:
  name: decoupled_lionw
  lr: 1e-3
  weight_decay: 1e-2
  betas:
  - 0.9
  - 0.999
  eps: 1e-8
  disable_grad:
  - norm
  - bias
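
The effect of disable_grad is to freeze every matching parameter before the optimizer is built. A minimal sketch of that behavior (not the actual llm-foundry builder code; apply_disable_grad is a hypothetical helper and model is any torch.nn.Module):

    import re
    import torch.nn as nn

    def apply_disable_grad(model: nn.Module, disable_grad) -> None:
        # disable_grad may be a single string or a list of strings.
        patterns = [disable_grad] if isinstance(disable_grad, str) else disable_grad
        for name, param in model.named_parameters():
            # Freeze any parameter whose name matches one of the patterns.
            if any(re.search(p, name) for p in patterns):
                param.requires_grad = False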

To modify the optimizer settings separately for all parameters whose names contain the string "norm":

    optimizer_config: {
        "name": "decoupled_lionw",
        "lr": 1e-3,
        "weight_decay": 1e-2,
        "betas": [0.9, 0.999],
        "eps": 1e-8,
        "param_groups": [
            {
                "param_str_match": "norm",
                "lr": 1e-4,
                "weight_decay": 0.0,
            },
        ],
    }

or in YAML form:

optimizer:
  name: decoupled_lionw
  lr: 1e-3
  weight_decay: 1e-2
  betas:
  - 0.9
  - 0.999
  eps: 1e-8
  param_groups:
  - param_str_match: norm
    lr: 1e-4
    weight_decay: 0
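
Under the hood this maps onto PyTorch's standard per-group optimizer overrides: parameters whose names match "norm" land in their own group with the overridden lr and weight_decay, and everything else falls back to the top-level optimizer args. A rough sketch of the grouping logic (split_param_groups is a hypothetical helper, not the llm-foundry implementation, and AdamW stands in for decoupled_lionw):

    import re

    def split_param_groups(model, param_groups_cfg):
        # Carve out groups in config order so that earlier param_str_match
        # entries win when a parameter name would match more than one group.
        remaining = dict(model.named_parameters())
        groups = []
        for cfg in param_groups_cfg:
            cfg = dict(cfg)                      # copy so we can pop the match key
            pattern = cfg.pop('param_str_match')
            matched = [n for n in list(remaining) if re.search(pattern, n)]
            groups.append({'params': [remaining.pop(n) for n in matched], **cfg})
        # Unmatched parameters use the top-level optimizer args.
        return [{'params': list(remaining.values())}] + groups

    # The result can be passed to any torch optimizer, e.g.:
    # torch.optim.AdamW(
    #     split_param_groups(model, [{'param_str_match': 'norm', 'lr': 1e-4, 'weight_decay': 0.0}]),
    #     lr=1e-3, weight_decay=1e-2)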

@vchiley requested review from dakinggg and removed the request for dakinggg on November 22, 2023
@vchiley (Contributor, Author) commented Nov 24, 2023

potential users: @sashaDoubov @samhavens @b-chu @bcui19 @ShashankMosaicML

@vchiley marked this pull request as ready for review on November 24, 2023
@j316chuck (Contributor) left a comment:

Nice ✅. Left a comment, but thanks for the clean implementation.

Review threads (resolved): llmfoundry/utils/builders.py, llmfoundry/optim/lion8b.py, tests/test_builders.py
@vchiley merged commit 5f21855 into mosaicml:main on Nov 29, 2023. 10 checks passed.