From Hugging Face

If you come from Hugging Face, the main difference is scope.

Hugging Face often starts from a model and a `Trainer`.
ESPnet3 often starts from a recipe, a `System`, and stage configs.

Do not try to map every class one-to-one first. Map the workflow first.

Rough mental mapping

Hugging Face	ESPnet3
model class	`model` in `training.yaml`, often `src/model.py`
`Trainer`	`train` stage + Lightning trainer wrapper
dataset object	`dataset.py` or `builder.py` + `dataset:` config
preprocessing function	dataset transform, builder step, or collate function
generation config	`inference.yaml`
evaluation script	`metrics.yaml` + `measure` stage
model repo artifact	publication bundle

What usually feels familiar

These parts are usually easy to understand:

normal PyTorch model code still works
normal `torch.utils.data.Dataset` still works
optimizer and scheduler config is explicit
distributed training uses familiar backend settings under `trainer:`
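
As a minimal sketch of the dataset point, the class below is a plain map-style dataset that implements only `__len__` and `__getitem__`, which is the protocol a PyTorch `DataLoader` expects from a map-style dataset. The class name `ToyUtteranceDataset` and the `speech` / `text` field names are hypothetical and chosen for illustration; they are not a documented ESPnet3 contract.

```python
class ToyUtteranceDataset:
    """Map-style dataset: implements only __len__ and __getitem__."""

    def __init__(self, items):
        # items: list of (audio_samples, transcript) pairs (hypothetical layout)
        self.items = list(items)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        samples, text = self.items[idx]
        # Field names are illustrative; use whatever your recipe's model expects.
        return {"speech": samples, "text": text}


dataset = ToyUtteranceDataset(
    [([0.0, 0.1, -0.1], "hello"), ([0.2, 0.0], "world")]
)
```

If your existing code already subclasses `torch.utils.data.Dataset`, the same two methods are all that matter here, so it should carry over with little change.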

What usually feels different

These are the main shifts:

config is split by stage, not by one big training script
inference is a first-class stage, not just `generate()` calls
recipes are expected to own data preparation too
publication and demo flows are part of the system design
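
One way to picture the stage split is as a lookup from stage name to config file. The sketch below is not ESPnet3 code. The `train` and `measure` stage names and the three YAML file names come from this page, while the `infer` stage name, the `conf` directory, and both helper functions are assumptions made for illustration.

```python
from pathlib import Path

# "train" and "measure" are named in the mapping table above;
# "infer" is a hypothetical name for the inference stage.
STAGE_CONFIGS = {
    "train": "training.yaml",
    "infer": "inference.yaml",
    "measure": "metrics.yaml",
}


def config_for_stage(stage, conf_dir="conf"):
    """Return the config path a stage would read (conf_dir is an assumed layout)."""
    if stage not in STAGE_CONFIGS:
        raise KeyError(f"unknown stage: {stage!r}")
    return Path(conf_dir) / STAGE_CONFIGS[stage]


def missing_configs(stages, conf_dir="conf"):
    """List the stages whose config file does not exist yet."""
    return [s for s in stages if not config_for_stage(s, conf_dir).exists()]
```

A helper like `missing_configs(["train", "infer", "measure"])` can serve as a quick checklist while migrating: each stage stays blocked until its own config file exists, which mirrors the one-config-per-stage split described above.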

A practical migration strategy

Move in this order:

get your dataset loading working
get your model instantiated from `training.yaml`
run one train stage locally
add `inference.yaml`
add `metrics.yaml` only after inference output looks right
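
The second step above, instantiating the model from config, can be smoke-tested before any training code runs. The sketch below assumes the `model` section names its class with a Hydra-style `_target_` key; that key, the `instantiate` helper, and the keyword arguments are assumptions for illustration, not ESPnet3's documented API. A standard-library class stands in for a real model so the snippet runs on its own.

```python
import importlib


def instantiate(spec):
    """Build an object from {"_target_": "pkg.module.Class", **kwargs}."""
    kwargs = {k: v for k, v in spec.items() if k != "_target_"}
    module_name, _, attr = spec["_target_"].rpartition(".")
    cls = getattr(importlib.import_module(module_name), attr)
    return cls(**kwargs)


# Stand-in for the parsed `model` section of training.yaml.
# A real recipe would point at a class of its own, for example one
# defined in src/model.py; types.SimpleNamespace is only a placeholder.
model_cfg = {
    "_target_": "types.SimpleNamespace",
    "hidden_size": 256,
    "num_layers": 4,
}

model = instantiate(model_cfg)
```

Running a check like this first separates config mistakes (wrong class path, misspelled argument) from training problems, which is the reason the list puts model instantiation ahead of the first train stage.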

Do not start by porting every helper utility.

When to keep code recipe-local

Keep code under `src/` when it is specific to one recipe:

`src/model.py`
`src/system.py`
`src/dataset.py`
`src/trainer.py`
`src/lightning_module.py`

Move code into `espnet3/` only when it is reusable across recipes.

Good pages to read next

What is a recipe?

See how ESPnet3 organizes one experiment as one recipe directory.

Coming from PyTorch

See the closest mental model if you already understand raw PyTorch training code.

Config overview

See how `training.yaml`, `inference.yaml`, and `metrics.yaml` split the workflow.

Custom dataset

See how to plug in your own dataset without adopting a special dataset base class.

Customize the model

See how to use `src/model.py` and switch away from the task bridge when needed.