ESPnet3 Train Stage
The train stage runs model training using:
- espnet3.systems.base.training.train
- espnet3.components.modeling.lightning_module.ESPnetLightningModule
- espnet3.components.trainers.trainer.ESPnet3LightningTrainer
Quick usage
Run

    python run.py --stages train --training_config conf/training.yaml

Configure (in training.yaml)
Keep the core settings in training.yaml.
| Section | Description |
|---|---|
| task | task entrypoint for ESPnet2-style models |
| model | model definition and normalization settings |
| dataset | train and valid splits |
| dataloader | collate and iterator settings |
| trainer | Lightning trainer configuration |
| optimizer, scheduler | single-optimizer training path |
| optimizers, schedulers | named multi-optimizer path |
| exp_dir | training output directory |
| stats_dir | stats output directory used by collect_stats |
Main config sections
Training uses training.yaml.
Typical sections are:
- task or model
- dataset
- dataloader
- optimizer / scheduler or optimizers / schedulers
- trainer
- exp_dir
- stats_dir
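
The section layout above can be sanity-checked before launching a run. The following is a minimal sketch, not part of ESPnet3: `check_sections` and `EXPECTED` are hypothetical names, the config is assumed to be already loaded into a plain dict, and the rules it enforces (every typical section is present, at least one of task/model exists, and the single- and multi-optimizer keys are not mixed) are assumptions read from the list, not documented validation behavior.

```python
from typing import Any, Mapping

# Section names taken from the list above; treating all of them as
# required is an assumption made for this illustration.
EXPECTED = ("dataset", "dataloader", "trainer", "exp_dir", "stats_dir")


def check_sections(cfg: Mapping[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the layout looks fine."""
    present = set(cfg)
    problems = [f"missing section: {name}" for name in EXPECTED if name not in present]

    # "task or model": assumed to mean at least one of the two is given.
    if not present & {"task", "model"}:
        problems.append("need at least one of: task, model")

    # Single-optimizer path vs. named multi-optimizer path.
    single = present & {"optimizer", "scheduler"}
    multi = present & {"optimizers", "schedulers"}
    if single and multi:
        # Assumption: the two paths are alternatives and should not be combined.
        problems.append(
            "mixes single- and multi-optimizer keys: " + ", ".join(sorted(single | multi))
        )
    elif not single and not multi:
        problems.append("no optimizer section found")
    return problems


if __name__ == "__main__":
    example = {
        "model": {}, "dataset": {}, "dataloader": {}, "trainer": {},
        "optimizer": {}, "scheduler": {},
        "exp_dir": "exp/train", "stats_dir": "exp/stats",
    }
    print(check_sections(example))
```

In practice the dict would come from whatever loader the recipe uses for training.yaml; the check itself only looks at top-level keys.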
Outputs
Training writes under exp_dir, including:
- checkpoints
- logs
- TensorBoard output if configured
collect_stats writes under stats_dir.
Typical outputs are written under:
- exp_dir: checkpoints, logs, TensorBoard files, saved configs
- stats_dir: feature shapes and normalization stats from collect_stats
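
To see what a run actually produced, the output directory can be scanned with a few glob patterns. This is a rough sketch under stated assumptions: `summarize_outputs` and `PATTERNS` are hypothetical names, `.ckpt` is Lightning's default checkpoint extension, `events.out.tfevents` is the usual TensorBoard event-file prefix, and saved configs are assumed to be YAML files. The exact layout ESPnet3 writes under exp_dir may differ.

```python
from pathlib import Path

# Glob patterns per output kind; all three are assumptions about file naming.
PATTERNS = {
    "checkpoints": "*.ckpt",                # Lightning's default checkpoint extension
    "tensorboard": "events.out.tfevents*",  # usual TensorBoard event-file prefix
    "configs": "*.yaml",                    # assumed format of saved configs
}


def summarize_outputs(exp_dir: str) -> dict[str, list[str]]:
    """Map each output kind to the sorted relative paths found under exp_dir."""
    root = Path(exp_dir)
    return {
        kind: sorted(p.relative_to(root).as_posix() for p in root.rglob(pattern))
        for kind, pattern in PATTERNS.items()
    }


if __name__ == "__main__":
    # "exp/train" is a placeholder; pass the exp_dir value from training.yaml.
    for kind, files in summarize_outputs("exp/train").items():
        print(f"{kind}: {len(files)} file(s)")
```

The search is recursive, so it does not depend on which subdirectory the trainer uses for checkpoints or logs.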
Key ideas
Dataset
Current dataset definitions are based on DataOrganizer plus dataset reference entries using data_src and data_src_args.
Example:

    dataset:
      _target_: espnet3.components.data.data_organizer.DataOrganizer
      recipe_dir: ${recipe_dir}
      train:
        - name: train
          data_src: mini_an4/asr
          data_src_args:
            split: train
            data_path: ${dataset_dir}
      valid:
        - name: valid
          data_src: mini_an4/asr
          data_src_args:
            split: valid
            data_path: ${dataset_dir}

Details:
Dataloader
| Topic | Summary | Details |
|---|---|---|
| dataloader | supports ESPnet iterator mode and plain PyTorch DataLoader mode | Dataloader + Collate |
| trainer | uses the ESPnet3 Lightning trainer wrapper, not raw Lightning directly | Trainer |
| optimizer, scheduler, optimizers, schedulers | supports both single-optimizer and named multi-optimizer training | Optimizer + Scheduler, Multiple optimizers and schedulers |
| model | supports both task-backed ESPnet2 model construction and direct Hydra instantiation | Model |
| callbacks | handles logging, checkpointing, and metric reporting such as MetricsLogger | Callbacks |
Details by topic
The train stage is intentionally thin: most customization happens in one of those component configs rather than in a stage-specific CLI path.
