espnet2.schedulers.warmup_lr.WarmupLR
class espnet2.schedulers.warmup_lr.WarmupLR(optimizer: Optimizer, warmup_steps: int | float = 25000, last_epoch: int = -1)
Bases: _LRScheduler, AbsBatchStepScheduler
WarmupLR is a learning rate scheduler that gradually increases the learning
rate during a warmup period and then decays it as training proceeds.
This scheduler is similar to the NoamLR scheduler but differs in how the learning rate is calculated. The WarmupLR scheduler computes the learning rate as:
lr = optimizer.lr * warmup_steps ** 0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
In contrast, the NoamLR scheduler computes the learning rate as:
lr = optimizer.lr * model_size ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
It is important to note that the maximum learning rate is equal to optimizer.lr in this scheduler.
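For intuition, the schedule can be reproduced in a few lines of plain Python. The warmup_lr function below is an illustrative re-implementation of the formula above (base_lr stands in for optimizer.lr), not the scheduler's actual code:
>>> def warmup_lr(base_lr, step, warmup_steps=25000):
...     # Same formula as above: linear ramp while step < warmup_steps, then step ** -0.5 decay
...     return base_lr * warmup_steps ** 0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
>>> warmup_lr(0.001, 100)     # ~4.0e-06, still ramping up
>>> warmup_lr(0.001, 25000)   # ~0.001, the peak: equals base_lr at step == warmup_steps
>>> warmup_lr(0.001, 100000)  # ~5.0e-04, decaying as step ** -0.5
During warmup the expression simplifies to base_lr * step / warmup_steps, a linear ramp that reaches the base learning rate exactly at step == warmup_steps.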
warmup_steps
The number of warmup steps for the learning rate schedule.
Type: Union[int, float]
Parameters:
- optimizer (torch.optim.Optimizer) – The optimizer for which to schedule the learning rate.
- warmup_steps (Union[int, float], optional) – The number of warmup steps (default is 25000).
- last_epoch (int, optional) – The index of the last epoch (default is -1).
####### Examples
>>> import torch
>>> from torch.optim import Adam
>>> from espnet2.schedulers.warmup_lr import WarmupLR
>>> optimizer = Adam(params=[torch.randn(2, 2)], lr=0.001)
>>> scheduler = WarmupLR(optimizer, warmup_steps=1000)
>>> for epoch in range(2000):
...     scheduler.step()
...     print(scheduler.get_lr())
- Raises: ValueError – If warmup_steps is not positive.
get_lr()
Calculate the learning rate based on the warmup schedule.
This method computes the learning rate for each parameter group in the optimizer, applying a warmup strategy that scales the learning rate based on the current training step. The learning rate is adjusted according to the formula:
lr = optimizer.lr * warmup_steps ** 0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
where step is the current training step.
- Returns: A list of learning rates for each parameter group in the optimizer.
- Return type: list
####### Examples
>>> import torch
>>> from espnet2.schedulers.warmup_lr import WarmupLR
>>> model = torch.nn.Linear(2, 2)
>>> optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
>>> scheduler = WarmupLR(optimizer, warmup_steps=1000)
>>> for epoch in range(2000):
...     scheduler.step()
...     print(scheduler.get_lr())  # prints the adjusted learning rates
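Since get_lr() computes one value per parameter group, an optimizer configured with several groups yields a list of that length. The snippet below is a minimal sketch; the tensors and per-group learning rates are arbitrary placeholders:
>>> import torch
>>> from espnet2.schedulers.warmup_lr import WarmupLR
>>> optimizer = torch.optim.Adam(
...     [{"params": [torch.randn(2, 2)], "lr": 0.001},
...      {"params": [torch.randn(2, 2)], "lr": 0.01}]
... )
>>> scheduler = WarmupLR(optimizer, warmup_steps=1000)
>>> scheduler.step()
>>> len(scheduler.get_lr())  # 2: one learning rate per parameter group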
NOTE
The maximum learning rate equals the base learning rate specified in the optimizer.
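This can be checked by running the scheduler past its warmup and inspecting the largest learning rate applied to the optimizer; the sketch below reuses the hypothetical setup from the examples above.
>>> import torch
>>> from espnet2.schedulers.warmup_lr import WarmupLR
>>> optimizer = torch.optim.Adam([torch.randn(2, 2)], lr=0.001)
>>> scheduler = WarmupLR(optimizer, warmup_steps=1000)
>>> lrs = []
>>> for _ in range(2000):
...     scheduler.step()
...     lrs.append(optimizer.param_groups[0]["lr"])
>>> max(lrs)  # ~0.001, the base learning rate given to the optimizer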