Remove lr scheduler in DeepSpeed config to avoid conflict #909

Haoxiang-Wang · 2023-12-04T04:38:52Z

This PR removes the following scheduler block from the DeepSpeed config files.

{
  "scheduler": {
    "type": "WarmupDecayLR",
    "params": {
      "warmup_min_lr": "auto",
      "warmup_max_lr": "auto",
      "warmup_num_steps": "auto",
      "warmup_type": "linear",
      "total_num_steps": "auto"
    }
  }
}

The scheduler block in the DeepSpeed configurations is unnecessary because the axolotl package defines its own learning rate scheduler and passes it to the HuggingFace trainer. In fact, the DeepSpeed scheduler can sometimes cause conflicts. For instance, when lr_scheduler: constant_with_warmup is set in the training YAML file, DeepSpeed's WarmupDecayLR overrides it, so training does not run with HuggingFace's constant_with_warmup as requested. Removing scheduler from the DeepSpeed configurations can resolve this issue.
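For anyone maintaining their own DeepSpeed config files, the same change can be applied with a small script. The following is a minimal sketch, not part of this PR: the helper names strip_scheduler and strip_scheduler_file are hypothetical, and it assumes each config is a plain JSON file with scheduler as a top-level key.

```python
import json
from pathlib import Path


def strip_scheduler(config: dict) -> dict:
    """Return a copy of a DeepSpeed config dict without its top-level "scheduler" key.

    Hypothetical helper for illustration; not an axolotl or DeepSpeed API.
    """
    return {key: value for key, value in config.items() if key != "scheduler"}


def strip_scheduler_file(path: Path) -> bool:
    """Rewrite the JSON config at `path` without "scheduler".

    Returns True if the file was changed, False if it had no scheduler block.
    Assumes the file is plain JSON (no comments).
    """
    config = json.loads(path.read_text())
    if "scheduler" not in config:
        return False
    path.write_text(json.dumps(strip_scheduler(config), indent=2) + "\n")
    return True
```

Calling strip_scheduler_file(Path("my_ds_config.json")) on a local config (the filename here is only an example) leaves every other key untouched, so the scheduler chosen through lr_scheduler in the training YAML is the only one in play.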

Remove learning rate scheduler in deepspeed config to avoid conflict

66d348c

winglian approved these changes Dec 4, 2023

winglian merged commit 476a205 into axolotl-ai-cloud:main Dec 4, 2023

mkeoliya pushed a commit to mkeoliya/axolotl that referenced this pull request Dec 15, 2023

Remove learning rate scheduler in deepspeed config to avoid conflict (a…

0701102

…xolotl-ai-cloud#909)
