# 🧮 Multiple Optimizers and Schedulers in ESPnet3
ESPnet3 lets you configure one optimizer/scheduler pair or multiple optimizer and scheduler groups. This guide explains the rules enforced by `LitESPnetModel.configure_optimizers`, how parameter selection works, and how to debug common errors.
## ✅ What you set up here
| Goal | YAML fields to touch | Notes |
|---|---|---|
| Single optimizer | `optim`, optional `scheduler` | Easiest setup; the whole model shares the same settings |
| Different LR per module | Multiple entries under `optims` | Use `params` to select parameter subsets |
| Different schedulers | Matching entries under `schedulers` | Order must match `optims` one-to-one |
## Single optimizer + scheduler (baseline)
Use when all trainable parameters share one optimizer and scheduler:
```yaml
optim:
  _target_: torch.optim.AdamW
  lr: 0.001
  weight_decay: 1.0e-2

scheduler:
  _target_: torch.optim.lr_scheduler.CosineAnnealingLR
  T_max: 100000
```

- Do not define `optims`/`schedulers` in this mode.
- The scheduler receives the instantiated optimizer automatically.
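Conceptually, this config resolves to ordinary PyTorch objects. Below is a minimal sketch of the equivalent plain-PyTorch setup; the `model` stand-in is invented for illustration, and in ESPnet3 the objects are instantiated from the `_target_` entries, so you never write this by hand:

```python
import torch

# Hypothetical stand-in; in practice this is your instantiated ESPnet3 model.
model = torch.nn.Linear(80, 256)

# optim: torch.optim.AdamW with lr=0.001, weight_decay=1.0e-2
optimizer = torch.optim.AdamW(model.parameters(), lr=0.001, weight_decay=1.0e-2)

# scheduler: CosineAnnealingLR with T_max=100000, bound to the optimizer above
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100000)
```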
## Multiple optimizers + schedulers
Enable this mode when different parameter groups need distinct optimizers or learning rate schedules (e.g., encoder vs. decoder, or GAN-style training):
```yaml
optims:
  - params: encoder   # substring match against parameter names
    optim:
      _target_: torch.optim.Adam
      lr: 5.0e-4
      weight_decay: 1.0e-2
  - params: decoder
    optim:
      _target_: torch.optim.Adam
      lr: 1.0e-3

schedulers:
  - scheduler:
      _target_: torch.optim.lr_scheduler.StepLR
      step_size: 5
      gamma: 0.5
  - scheduler:
      _target_: torch.optim.lr_scheduler.StepLR
      step_size: 10
      gamma: 0.1
```

### How grouping works
- Each `optims` entry must include `params` and `optim`. The `params` value is matched as a substring against parameter names; every matching parameter goes to that optimizer (see the sketch after this list).
- Every trainable parameter must match exactly one optimizer. ESPnet3 raises an error if a parameter is unmatched or matched more than once.
- The number of `schedulers` must equal the number of `optims`; they are paired by position (the first scheduler controls the first optimizer, and so on).
- Internally, ESPnet3 wraps the lists in `MultipleOptim` and `MultipleScheduler`, so Lightning still sees a single optimizer with step-level scheduling.
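The selection and pairing rules can be pictured with a small stand-alone sketch. This is only an illustration of the behavior described above, not ESPnet3's implementation; the toy `model`, the `groups` list, and the factory lambdas are invented for the example, and the `MultipleOptim`/`MultipleScheduler` wrappers are omitted:

```python
import torch

# Toy model with encoder/decoder submodules, standing in for a real ESPnet3 model.
model = torch.nn.ModuleDict({
    "encoder": torch.nn.Linear(80, 256),
    "decoder": torch.nn.Linear(256, 100),
})

# Mirrors the YAML above: one (params substring, optimizer factory) entry per group.
groups = [
    ("encoder", lambda p: torch.optim.Adam(p, lr=5.0e-4, weight_decay=1.0e-2)),
    ("decoder", lambda p: torch.optim.Adam(p, lr=1.0e-3)),
]
scheduler_factories = [
    lambda opt: torch.optim.lr_scheduler.StepLR(opt, step_size=5, gamma=0.5),
    lambda opt: torch.optim.lr_scheduler.StepLR(opt, step_size=10, gamma=0.1),
]

optimizers, schedulers = [], []
for (substring, make_optim), make_sched in zip(groups, scheduler_factories):
    # Substring match against parameter names selects this group's parameters.
    params = [p for name, p in model.named_parameters() if substring in name]
    opt = make_optim(params)
    optimizers.append(opt)
    # Schedulers are paired with optimizers by position.
    schedulers.append(make_sched(opt))
```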
## Common pitfalls and fixes
- **No trainable parameters found for a substring**: check the `params` strings against `model.named_parameters()`. Consider using a more specific prefix (e.g., `encoder.`).
- **Parameter is assigned to multiple optimizers**: ensure substrings do not overlap unintentionally (e.g., `enc` and `encoder` both matching). Make the substrings mutually exclusive.
- **Unused parameters reported**: add another `optims` entry or adjust the substrings so every trainable parameter is covered.
- **Mixed modes**: do not mix `optim` with `optims`, or `scheduler` with `schedulers`, in the same config.
## Debugging tip
To see which parameters fall into each group, insert a quick check before training:
```python
# Print every parameter name that each substring would capture.
for name, _ in model.named_parameters():
    if "encoder" in name:
        print("encoder group:", name)
    if "decoder" in name:
        print("decoder group:", name)
```

Use this to refine the `params` substrings until the grouping matches your intent.
