
WarmupDecayLR.warmup_num_steps must not be 0 or 1 #771


Description

@stas00

When using WarmupDecayLR, either

  • the config checker must ensure that warmup_num_steps is not 0 or 1, because the scheduler computes 1 / math.log(warmup_num_steps): math.log(0) raises a math domain error, and math.log(1) is 0, which leads to a division by zero;
  • or the code needs to be smart enough to handle these 2 cases internally, e.g. warmup_num_steps = max(2, warmup_num_steps) (see the sketch below).
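A minimal sketch of the second option, based only on the traceback below; the helper name safe_inverse_log_warmup is hypothetical, not part of DeepSpeed:

    import math

    def safe_inverse_log_warmup(warmup_num_steps: int) -> float:
        # math.log(0) raises "ValueError: math domain error", and
        # math.log(1) is 0.0, so 1.0 / math.log(...) would divide by zero;
        # clamping the step count to at least 2 avoids both cases.
        warmup_num_steps = max(2, warmup_num_steps)
        return 1.0 / math.log(warmup_num_steps)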

Thank you.

Fix: #772

Error log from the test:

    def __init__(self,
                 optimizer: Optimizer,
                 warmup_min_lr: float = 0.0,
                 warmup_max_lr: float = 0.001,
                 warmup_num_steps: int = 1000,
                 last_batch_iteration: int = -1):
    
        self.optimizer = get_torch_optimizer(optimizer)
    
        self.min_lrs = self._format_param(self.optimizer, warmup_min_lr, "min_lr")
        self.max_lrs = self._format_param(self.optimizer, warmup_max_lr, "max_lr")
        self.delta_lrs = [big - small for big, small in zip(self.max_lrs, self.min_lrs)]
        self.warmup_num_steps = warmup_num_steps
>       self.inverse_log_warm_up = 1.0 / math.log(warmup_num_steps)
E       ZeroDivisionError: float division by zero

DeepSpeed/deepspeed/runtime/lr_schedules.py:710: ZeroDivisionError
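For completeness, a minimal reproduction; this assumes WarmupDecayLR is importable from deepspeed.runtime.lr_schedules (the path in the traceback) and accepts a total_num_steps argument, with total_num_steps=10 as an arbitrary placeholder:

    import torch
    from deepspeed.runtime.lr_schedules import WarmupDecayLR

    model = torch.nn.Linear(2, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001)

    # warmup_num_steps=1 makes math.log(warmup_num_steps) == 0.0, so the
    # scheduler's 1.0 / math.log(warmup_num_steps) raises ZeroDivisionError;
    # warmup_num_steps=0 would raise "ValueError: math domain error" instead.
    scheduler = WarmupDecayLR(optimizer, total_num_steps=10, warmup_num_steps=1)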
