espnet2.schedulers.warmup_step_lr.WarmupStepLR
class espnet2.schedulers.warmup_step_lr.WarmupStepLR(optimizer: Optimizer, warmup_steps: int | float = 25000, steps_per_epoch: int = 10000, step_size: int = 1, gamma: float = 0.1, last_epoch: int = -1)
Bases: _LRScheduler, AbsBatchStepScheduler
Step (with Warm up) learning rate scheduler module.
The WarmupStepLR scheduler combines the functionalities of WarmupLR and StepLR:

WarmupLR:
    lr = optimizer.lr * warmup_step ** 0.5 * min(step ** -0.5, step * warmup_step ** -1.5)

WarmupStepLR:
    if step <= warmup_step:
        lr = optimizer.lr * warmup_step ** 0.5 * min(step ** -0.5, step * warmup_step ** -1.5)
    else:
        lr = optimizer.lr * gamma ** (epoch // step_size)
NOTE
The maximum learning rate equals optimizer.lr in this scheduler.
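As a quick sanity check of this note, the warmup expression reduces exactly to the base learning rate at step == warmup_step. The snippet below is a standalone sketch of that formula in plain Python, not a call into the ESPnet class:

>>> base_lr, warmup_step = 0.1, 25000
>>> step = warmup_step
>>> lr = base_lr * warmup_step ** 0.5 * min(step ** -0.5, step * warmup_step ** -1.5)
>>> abs(lr - base_lr) < 1e-12   # lr peaks at optimizer.lr
True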
warmup_steps
The number of steps for the warmup phase.
- Type: Union[int, float]
steps_per_epoch
The number of steps per epoch.
- Type: int
step_size
The number of epochs between each learning rate decay.
- Type: int
gamma
The factor by which the learning rate is multiplied after the warmup phase.
- Type: float
last_epoch
The index of the last epoch. Default is -1.
- Type: int
Parameters:
- optimizer (torch.optim.Optimizer) – The optimizer for which to schedule the learning rate.
- warmup_steps (Union[int, float]) – The number of steps for the warmup phase (default is 25000).
- steps_per_epoch (int) – The number of steps per epoch (default is 10000).
- step_size (int) – The number of epochs between each learning rate decay (default is 1).
- gamma (float) – The factor by which the learning rate is multiplied after the warmup phase (default is 0.1).
- last_epoch (int) – The index of the last epoch (default is -1).
Returns: A list of updated learning rates for each parameter group.
Return type: List[float]
####### Examples
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
>>> scheduler = WarmupStepLR(optimizer, warmup_steps=1000,
...                          steps_per_epoch=100, step_size=10, gamma=0.1)
>>> for epoch in range(50):
...     for batch in data_loader:
...         optimizer.zero_grad()
...         loss = model(batch)   # placeholder forward pass computing the loss
...         loss.backward()
...         optimizer.step()
...         scheduler.step()      # batch-step scheduler: called once per batch
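For inspection, the documented schedule can also be written out as a plain function. This is a minimal sketch of the formulas above, not the class implementation; it assumes step counting starts at 1 and that the epoch index is derived as step // steps_per_epoch (an assumption, not taken from the source):

>>> def scheduled_lr(step, base_lr=0.1, warmup_steps=1000,
...                  steps_per_epoch=100, step_size=10, gamma=0.1):
...     if step <= warmup_steps:
...         # warmup branch of the documented formula
...         return base_lr * warmup_steps ** 0.5 * min(step ** -0.5,
...                                                    step * warmup_steps ** -1.5)
...     # step-decay branch; epoch index assumed to be step // steps_per_epoch
...     epoch = step // steps_per_epoch
...     return base_lr * gamma ** (epoch // step_size)
>>> round(scheduled_lr(1000), 6)   # end of warmup: lr peaks at the base lr
0.1
>>> round(scheduled_lr(2000), 6)   # assumed epoch 20 -> decay exponent 2
0.001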
get_lr()
Retrieves the learning rate for the current training step.
The learning rate is adjusted based on the warmup period and the step schedule. During the warmup phase, the learning rate increases based on the formula:
lr = optimizer.lr * warmup_step ** 0.5 * min(step ** -0.5, step * warmup_step ** -1.5)
After the warmup period, the learning rate is decreased by a factor of gamma every step_size epochs:
lr = optimizer.lr * (gamma ** (epoch // step_size))
step_num
The current step number.
- Type: int
epoch_num
The current epoch number.
- Type: int
warmup_steps
Number of warmup steps.
- Type: Union[int, float]
steps_per_epoch
Number of steps per epoch.
- Type: int
step_size
Number of epochs to wait before decreasing the learning rate.
- Type: int
gamma
Factor by which the learning rate is multiplied.
- Type: float
Returns: The updated learning rates for the optimizer.
Return type: List[float]
####### Examples
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
>>> scheduler = WarmupStepLR(optimizer, warmup_steps=1000,
... steps_per_epoch=100, step_size=10,
... gamma=0.1)
>>> for step in range(2000):
... scheduler.step()
... print(scheduler.get_lr())
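With the hyperparameters above, note that the two branches meet with a jump: at step 1000 the warmup branch returns the full base learning rate (0.1), while the decay branch that takes over afterwards already has a non-zero exponent. The arithmetic below is a plain sketch of that boundary; the mapping epoch = step // steps_per_epoch is an assumption for illustration, not taken from the source:

>>> base_lr, gamma, steps_per_epoch, step_size = 0.1, 0.1, 100, 10
>>> step = 1001                       # first step after the 1000-step warmup
>>> epoch = step // steps_per_epoch   # assumed mapping: epoch 10
>>> round(base_lr * gamma ** (epoch // step_size), 6)
0.01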
NOTE
The maximum learning rate equals optimizer.lr in this scheduler.