espnet2.schedulers.warmup_lr.WarmupLR
class espnet2.schedulers.warmup_lr.WarmupLR(optimizer: Optimizer, warmup_steps: int | float = 25000, last_epoch: int = -1)
Bases: _LRScheduler, AbsBatchStepScheduler
WarmupLR is a learning rate scheduler that gradually increases the learning
rate during a warmup period and then decays it as training proceeds.
This scheduler is similar to the NoamLR scheduler but differs in how the learning rate is calculated. The WarmupLR scheduler computes the learning rate as:
lr = optimizer.lr * warmup_steps ** 0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
In contrast, the NoamLR scheduler computes the learning rate as:
lr = optimizer.lr * model_size ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
It is important to note that the maximum learning rate is equal to optimizer.lr in this scheduler.
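For intuition, the schedule can be reproduced in a few lines of plain Python. The warmup_lr function below is an illustrative re-implementation of the formula above (base_lr stands in for optimizer.lr), not the scheduler's actual code:
>>> def warmup_lr(base_lr, step, warmup_steps=25000):
...     # Same formula as above: linear ramp while step < warmup_steps, then step ** -0.5 decay
...     return base_lr * warmup_steps ** 0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
>>> warmup_lr(0.001, 100)     # ~4.0e-06, still ramping up
>>> warmup_lr(0.001, 25000)   # ~0.001, the peak: equals base_lr at step == warmup_steps
>>> warmup_lr(0.001, 100000)  # ~5.0e-04, decaying as step ** -0.5
During warmup the expression simplifies to base_lr * step / warmup_steps, a linear ramp that reaches the base learning rate exactly at step == warmup_steps.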
warmup_steps
The number of warmup steps for the learning rate schedule.
Type: Union[int, float]
Parameters:
- optimizer (torch.optim.Optimizer) – The optimizer for which to schedule the learning rate.
- warmup_steps (Union[int, float], optional) – The number of warmup steps (default is 25000).
- last_epoch (int, optional) – The index of the last epoch (default is -1).
####### Examples
>>> import torch
>>> from torch.optim import Adam
>>> from espnet2.schedulers.warmup_lr import WarmupLR
>>> optimizer = Adam(params=[torch.randn(2, 2)], lr=0.001)
>>> scheduler = WarmupLR(optimizer, warmup_steps=1000)
>>> for epoch in range(2000):
...     scheduler.step()
...     print(scheduler.get_lr())
- Raises: ValueError – If warmup_steps is not positive.
get_lr()
Calculate the learning rate based on the warmup schedule.
This method computes the learning rate for each parameter group in the optimizer, applying a warmup strategy that scales the learning rate based on the current training step. The learning rate is adjusted according to the formula:
lr = optimizer.lr * warmup_steps ** 0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
where step is the current training step.
- Returns: A list of learning rates for each parameter group in the optimizer.
- Return type: list
####### Examples
>>> import torch
>>> from espnet2.schedulers.warmup_lr import WarmupLR
>>> model = torch.nn.Linear(2, 2)
>>> optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
>>> scheduler = WarmupLR(optimizer, warmup_steps=1000)
>>> for epoch in range(2000):
...     scheduler.step()
...     print(scheduler.get_lr())  # prints the adjusted learning rates
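Since get_lr() computes one value per parameter group, an optimizer configured with several groups yields a list of that length. The snippet below is a minimal sketch; the tensors and per-group learning rates are arbitrary placeholders:
>>> import torch
>>> from espnet2.schedulers.warmup_lr import WarmupLR
>>> optimizer = torch.optim.Adam(
...     [{"params": [torch.randn(2, 2)], "lr": 0.001},
...      {"params": [torch.randn(2, 2)], "lr": 0.01}]
... )
>>> scheduler = WarmupLR(optimizer, warmup_steps=1000)
>>> scheduler.step()
>>> len(scheduler.get_lr())  # 2: one learning rate per parameter group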
NOTE
The maximum learning rate equals the base learning rate specified in the optimizer.
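This can be checked by running the scheduler past its warmup and inspecting the largest learning rate applied to the optimizer; the sketch below reuses the hypothetical setup from the examples above.
>>> import torch
>>> from espnet2.schedulers.warmup_lr import WarmupLR
>>> optimizer = torch.optim.Adam([torch.randn(2, 2)], lr=0.001)
>>> scheduler = WarmupLR(optimizer, warmup_steps=1000)
>>> lrs = []
>>> for _ in range(2000):
...     scheduler.step()
...     lrs.append(optimizer.param_groups[0]["lr"])
>>> max(lrs)  # ~0.001, the base learning rate given to the optimizer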