espnet2.schedulers.warmup_step_lr.WarmupStepLR
class espnet2.schedulers.warmup_step_lr.WarmupStepLR(optimizer: Optimizer, warmup_steps: int | float = 25000, steps_per_epoch: int = 10000, step_size: int = 1, gamma: float = 0.1, last_epoch: int = -1)
Bases: _LRScheduler, AbsBatchStepScheduler
Step (with Warm up) learning rate scheduler module.
The WarmupStepLR scheduler combines the functionalities of WarmupLR and StepLR:

WarmupLR:
    lr = optimizer.lr * warmup_step ** 0.5 * min(step ** -0.5, step * warmup_step ** -1.5)

WarmupStepLR:
    if step <= warmup_step:
        lr = optimizer.lr * warmup_step ** 0.5 * min(step ** -0.5, step * warmup_step ** -1.5)
    else:
        lr = optimizer.lr * gamma ** (epoch // step_size)
NOTE
The maximum learning rate equals optimizer.lr in this scheduler.
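As a quick sanity check of this note, the warmup expression reduces exactly to the base learning rate at step == warmup_step. The snippet below is a standalone sketch of that formula in plain Python, not a call into the ESPnet class:

>>> base_lr, warmup_step = 0.1, 25000
>>> step = warmup_step
>>> lr = base_lr * warmup_step ** 0.5 * min(step ** -0.5, step * warmup_step ** -1.5)
>>> abs(lr - base_lr) < 1e-12   # lr peaks at optimizer.lr
True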
warmup_steps
The number of steps for the warmup phase.
- Type: Union[int, float]
steps_per_epoch
The number of steps per epoch.
- Type: int
step_size
The number of epochs between each learning rate decay.
- Type: int
gamma
The factor by which the learning rate is multiplied after the warmup phase.
- Type: float
last_epoch
The index of the last epoch. Default is -1.
- Type: int
Parameters:
- optimizer (torch.optim.Optimizer) – The optimizer for which to schedule the learning rate.
- warmup_steps (Union[int, float]) – The number of steps for the warmup phase (default is 25000).
- steps_per_epoch (int) – The number of steps per epoch (default is 10000).
- step_size (int) – The number of epochs between each learning rate decay (default is 1).
- gamma (float) – The factor by which the learning rate is multiplied after the warmup phase (default is 0.1).
- last_epoch (int) – The index of the last epoch (default is -1).
Returns: A list of updated learning rates for each parameter group.
Return type: List[float]
####### Examples
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
>>> scheduler = WarmupStepLR(optimizer, warmup_steps=1000,
...                          steps_per_epoch=100, step_size=10, gamma=0.1)
>>> for epoch in range(50):
...     for batch in data_loader:
...         optimizer.zero_grad()
...         loss = model(batch)   # placeholder forward pass computing the loss
...         loss.backward()
...         optimizer.step()
...         scheduler.step()      # batch-step scheduler: called once per batch
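For inspection, the documented schedule can also be written out as a plain function. This is a minimal sketch of the formulas above, not the class implementation; it assumes step counting starts at 1 and that the epoch index is derived as step // steps_per_epoch (an assumption, not taken from the source):

>>> def scheduled_lr(step, base_lr=0.1, warmup_steps=1000,
...                  steps_per_epoch=100, step_size=10, gamma=0.1):
...     if step <= warmup_steps:
...         # warmup branch of the documented formula
...         return base_lr * warmup_steps ** 0.5 * min(step ** -0.5,
...                                                    step * warmup_steps ** -1.5)
...     # step-decay branch; epoch index assumed to be step // steps_per_epoch
...     epoch = step // steps_per_epoch
...     return base_lr * gamma ** (epoch // step_size)
>>> round(scheduled_lr(1000), 6)   # end of warmup: lr peaks at the base lr
0.1
>>> round(scheduled_lr(2000), 6)   # assumed epoch 20 -> decay exponent 2
0.001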
get_lr()
Retrieves the learning rate for the current training step.
The learning rate is adjusted based on the warmup period and the step schedule. During the warmup phase, the learning rate increases based on the formula:
lr = optimizer.lr * warmup_step ** 0.5 * min(step ** -0.5, step * warmup_step ** -1.5)
After the warmup period, the learning rate is decreased by a factor of gamma every step_size epochs:
lr = optimizer.lr * (gamma ** (epoch // step_size))
step_num
The current step number.
- Type: int
epoch_num
The current epoch number.
- Type: int
warmup_steps
Number of warmup steps.
- Type: Union[int, float]
steps_per_epoch
Number of steps per epoch.
- Type: int
step_size
Number of epochs to wait before decreasing the learning rate.
- Type: int
gamma
Factor by which the learning rate is multiplied.
- Type: float
Returns: The updated learning rates for the optimizer.
Return type: List[float]
####### Examples
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
>>> scheduler = WarmupStepLR(optimizer, warmup_steps=1000,
... steps_per_epoch=100, step_size=10,
... gamma=0.1)
>>> for step in range(2000):
... scheduler.step()
... print(scheduler.get_lr())
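With the hyperparameters above, note that the two branches meet with a jump: at step 1000 the warmup branch returns the full base learning rate (0.1), while the decay branch that takes over afterwards already has a non-zero exponent. The arithmetic below is a plain sketch of that boundary; the mapping epoch = step // steps_per_epoch is an assumption for illustration, not taken from the source:

>>> base_lr, gamma, steps_per_epoch, step_size = 0.1, 0.1, 100, 10
>>> step = 1001                       # first step after the 1000-step warmup
>>> epoch = step // steps_per_epoch   # assumed mapping: epoch 10
>>> round(base_lr * gamma ** (epoch // step_size), 6)
0.01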
NOTE
The maximum learning rate equals optimizer.lr in this scheduler.