Model and system
If you already know PyTorch, the good news is:
- your model can still just be a PyTorch module
- ESPnet3 does not force you to hide everything behind one fixed system class
The two main things ESPnet3 adds are:
- config-driven model instantiation
- a System class that owns stage flow
The simple mental model
Think of the split like this:
- src/model.py -> your model code
- training.yaml -> how the model is instantiated
- System -> which stages run, and in what order
So if you come from PyTorch, ESPnet3 is not replacing your model. It is wrapping the workflow around it.
Custom model: src/model.py
The most direct pattern is to create:
    egs3/<recipe>/<system>/src/model.py

and put your normal PyTorch model there.
Minimal example:
    import torch


    class MyModel(torch.nn.Module):
        def __init__(self, hidden_size: int, vocab_size: int):
            super().__init__()
            self.encoder = torch.nn.Linear(80, hidden_size)
            self.head = torch.nn.Linear(hidden_size, vocab_size)

        def forward(self, speech, **batch):
            hidden = self.encoder(speech)
            return self.head(hidden)

That is just ordinary PyTorch.
How to point config at the model
Use model._target_ in training.yaml.
Example:
    task:
    model:
      _target_: src.model.MyModel
      hidden_size: 256
      vocab_size: 500

This is the key point:
- leave task unset if you want direct model instantiation
- put the import path under model._target_
That is the clean ESPnet3 path for a custom model.
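To make the _target_ idea concrete, here is a minimal sketch of how a dotted import path plus keyword arguments can be turned into an object. It only illustrates the general mechanism and is not ESPnet3's actual loader. The helper name instantiate_from_target is hypothetical, and a standard-library class stands in for src.model.MyModel so the snippet runs outside a recipe.

```python
import importlib


def instantiate_from_target(config: dict):
    """Build an object from a dict holding `_target_` plus constructor kwargs.

    Hypothetical helper for illustration only; not part of ESPnet3.
    """
    kwargs = dict(config)  # copy so the caller's config is not mutated
    module_path, _, attr_name = kwargs.pop("_target_").rpartition(".")
    module = importlib.import_module(module_path)
    cls = getattr(module, attr_name)
    return cls(**kwargs)


# Stand-in target: a standard-library class instead of src.model.MyModel.
config = {"_target_": "fractions.Fraction", "numerator": 1, "denominator": 4}
obj = instantiate_from_target(config)
print(type(obj).__name__, obj)
```

In a recipe, the same shape of config would name src.model.MyModel, with hidden_size and vocab_size as the remaining keys.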
See also:
- Model Components: See the task bridge path and direct model._target_ path.
- Training Config: See where task, model, optimizer, and trainer settings live.
- Custom Model: See the finetuning-oriented recipe-local model guide.
What this replaces
If you stay on the old task bridge, model: is interpreted by the task-side builder.
If you switch to model._target_, the recipe directly owns model instantiation.
So the practical choice is:
- task bridge for old task-style models
- model._target_ for recipe-local or custom models
Why System exists
If you come from PyTorch, it is tempting to think only about the model.
But recipes often need workflow logic too:
- dataset preparation
- training
- inference
- metrics
- publication
- special recipe-local stages
That is what System owns.
So:
- model = computation
- system = workflow
See also:
- System and Stages: See how Systems own stage methods and config wiring.
- Stages: See the built-in stage entrypoints and their config inputs.
- Recipe Structure: See where src, conf, data, and exp files live in a recipe.
When a custom system is useful
A custom system becomes useful when the recipe needs behavior that is not just "one standard train stage".
Examples:
- special preprocessing stage
- export stage
- multi-phase training
- curriculum training
- any workflow that needs multiple training passes with different configs
Example: curriculum training with two training stages
Suppose you want:
- easy training first
- full training second
Then a clean ESPnet3 pattern is:
- create two training configs
- expose both configs in run.py
- add two system-specific stages
Step 1: create two configs
For example:
    conf/
      training_easy.yaml
      training_full.yaml

The easy config might use:
- smaller subset
- shorter utterances
- easier curriculum
- fewer epochs
The full config then uses the full training setup.
Step 2: extend run.py
You can add two CLI flags:
    parser.add_argument("--training_easy_config", type=Path, default=None)
    parser.add_argument("--training_full_config", type=Path, default=None)

Then load both configs and pass them into your system.
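As a rough sketch of that wiring, the snippet below parses the two flags and collects them into keyword arguments for the system constructor. The function names build_parser and collect_system_kwargs are hypothetical, the nargs setting on --stages is an assumption, and the config paths are passed through unloaded so the example runs without any YAML files.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Recipe entrypoint (sketch)")
    # Assumption: --stages accepts one or more stage names.
    parser.add_argument("--stages", nargs="+", default=[])
    parser.add_argument("--training_easy_config", type=Path, default=None)
    parser.add_argument("--training_full_config", type=Path, default=None)
    return parser


def collect_system_kwargs(args: argparse.Namespace) -> dict:
    """Map CLI flags to constructor keyword arguments for the custom system.

    A real run.py would load each YAML file here; this sketch only passes
    the paths through so it stays runnable without config files.
    """
    return {
        "training_easy_config": args.training_easy_config,
        "training_full_config": args.training_full_config,
    }


args = build_parser().parse_args(
    ["--stages", "train_easy", "--training_easy_config", "conf/training_easy.yaml"]
)
system_kwargs = collect_system_kwargs(args)
print(args.stages, system_kwargs)
```

A flag that is not given stays None, so the system can tell which training configs were actually supplied.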
Step 3: define a custom system
Example:
    from espnet3.systems.asr.system import ASRSystem
    from espnet3.systems.base.training import train


    class CEMOESSystem(ASRSystem):
        def __init__(
            self,
            training_easy_config=None,
            training_full_config=None,
            inference_config=None,
            metrics_config=None,
            publication_config=None,
            demo_config=None,
        ):
            super().__init__(
                training_config=training_full_config,
                inference_config=inference_config,
                metrics_config=metrics_config,
                publication_config=publication_config,
                demo_config=demo_config,
            )
            self.training_easy_config = training_easy_config
            self.training_full_config = training_full_config

        def train_easy(self):
            original = self.training_config
            self.training_config = self.training_easy_config
            try:
                return train(self.training_config)
            finally:
                self.training_config = original

        def train_full(self):
            original = self.training_config
            self.training_config = self.training_full_config
            try:
                return train(self.training_config)
            finally:
                self.training_config = original

The idea is simple:
- train_easy() uses the easy config
- train_full() uses the full config
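Both stage methods repeat the same swap-then-restore steps around the training call. One possible way to factor that out, shown here only as a sketch, is a small context manager. StubSystem and its _train method are stand-ins invented for this example so that it runs without ESPnet3; they are not the real ASRSystem or train.

```python
from contextlib import contextmanager


@contextmanager
def swapped_training_config(system, config):
    """Temporarily point system.training_config at config, then restore it."""
    original = system.training_config
    system.training_config = config
    try:
        yield system
    finally:
        system.training_config = original


class StubSystem:
    """Stand-in for the custom system; training is replaced by a recorder."""

    def __init__(self, training_easy_config, training_full_config):
        self.training_config = training_full_config
        self.training_easy_config = training_easy_config
        self.training_full_config = training_full_config
        self.calls = []

    def _train(self, config):
        # Placeholder for the real training entrypoint.
        self.calls.append(config)
        return config

    def train_easy(self):
        with swapped_training_config(self, self.training_easy_config):
            return self._train(self.training_config)


system = StubSystem(training_easy_config="easy", training_full_config="full")
result = system.train_easy()
print(result, system.training_config)
```

The restore runs even if training raises, which is the same guarantee the try/finally blocks give in the class above.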
Step 4: expose new stages
In run.py, add the new stage names:
    ALL_STAGES = [
        *DEFAULT_STAGES,
        "train_easy",
        "train_full",
    ]

Then run them like ordinary stages:
    python run.py \
      --stages train_easy \
      --training_easy_config conf/training_easy.yaml

and later:
    python run.py \
      --stages train_full \
      --training_full_config conf/training_full.yaml

That is the kind of recipe-local workflow logic that System is for.
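For intuition about why adding a name to ALL_STAGES is enough to make a method runnable, here is a toy dispatcher. It assumes each stage name resolves to a same-named method on the system and that stages run in the order of the stage list. Both are assumptions made for this sketch, and ToySystem and run_stages are hypothetical names, not ESPnet3 APIs.

```python
class ToySystem:
    """Hypothetical system whose stages are plain methods."""

    def train_easy(self):
        return "ran train_easy"

    def train_full(self):
        return "ran train_full"


def run_stages(system, requested, all_stages):
    """Run the requested stages in the order given by all_stages."""
    unknown = [name for name in requested if name not in all_stages]
    if unknown:
        raise ValueError(f"Unknown stages: {unknown}")
    results = []
    for name in all_stages:
        if name in requested:
            # Look up the stage method by name and call it.
            results.append(getattr(system, name)())
    return results


ALL_STAGES = ["train_easy", "train_full"]
outputs = run_stages(ToySystem(), ["train_full", "train_easy"], ALL_STAGES)
print(outputs)
```

In this sketch the easy stage runs first even though it was requested second, because the stage list, not the command line, fixes the order.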
See also:
- Adding a Stage: See how to expose recipe-local stages from a custom System.
- Training Loop: See when to use config, LightningModule, trainer, or System code.
- Config Overview: See how stage-specific configs are loaded and passed to Systems.
Why this is better than ad-hoc script logic
If you came from plain PyTorch, you might otherwise write:
- one Python script for easy training
- another Python script for full training
But in ESPnet3, it is often cleaner to keep one recipe and make the stage flow explicit through the system.
That keeps:
- config
- logs
- stages
- output directories
under one consistent recipe structure.
Good rule of thumb
Use:
- src/model.py when the model is custom
- a custom System when the workflow is custom
Do not use a custom system just because the model is custom. Use it when the stage flow itself needs to change.
Related pages
- Custom model: See the finetuning-oriented version of recipe-local model customization.
- Adding a stage: See how to add recipe-local stages to a system.
- Training loop: See where to customize LightningModule, trainer, or stage flow.
- Training Config: See how `model`, `optimizer`, `dataloader`, and `trainer` are configured.
- System and stages: Read the architecture-level explanation of `System` and stage dispatch.
