# Getting Started with ESPnet3
This guide provides the fastest way to start using ESPnet3.
Choose the workflow that fits your environment and follow the examples below.
## Quick Start (ASR Example)

### 1. Install ESPnet3

ESPnet3 is distributed under the same pip package name, `espnet`. For more installation options (uv, pixi, source), see ESPnet3 Installation.
```bash
pip install espnet
```

Install from source (recommended for development):

```bash
git clone https://github.com/espnet/espnet.git
cd espnet/tools
# Recommended: setup_uv.sh
# Installs pixi + uv and sets up all dependencies much faster than conda.
. setup_uv.sh
```

### 2. Install system-specific dependencies
ESPnet3 introduces the concept of a system (ASR, TTS, ST, ENH, etc.). Each system may require additional packages not used by others.
Install system extras using:
```bash
pip install "espnet[asr]"
```

Other examples:

```bash
pip install "espnet[tts]"
pip install "espnet[st]"
pip install "espnet[enh]"
```

If installed from a cloned repository:

```bash
pip install -e ".[asr]"
# or using uv:
uv pip install -e ".[asr]"
```

### 3. Run a recipe without cloning the repository (import-based execution)
ESPnet3 recipes are fully importable. Create config files locally and run:
```python
from argparse import Namespace
from pathlib import Path

from egs3.TEMPLATE.asr.run import main
from espnet3.systems.asr.system import ASRSystem

stages = ["create_dataset", "collect_stats", "train", "infer", "measure"]
args = Namespace(
    stages=stages,
    train_config=Path("/path/to/train_config.yaml"),
    infer_config=Path("/path/to/infer_config.yaml"),
    measure_config=Path("/path/to/measure_config.yaml"),
    publish_config=None,
    demo_config=None,
    dry_run=False,
    write_requirements=False,
)

main(args=args, system_cls=ASRSystem, stages=stages)
```

This is useful for programmatic pipelines or MLOps workflows.
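If you first want to check that the configs and stage list are wired up correctly, the same entry point can be reused. A minimal sketch, assuming the `dry_run` flag shown above validates the pipeline without actually executing it (inferred from the flag name, not confirmed):

```python
# Reuse the args built in the example above, but enable dry_run.
# Assumption: dry_run skips real execution and only validates the setup;
# check run.py's help text for the flag's exact semantics.
args.dry_run = True
main(args=args, system_cls=ASRSystem, stages=stages)
```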
### 4. Run a recipe with a cloned repository

All configs and scripts live inside `egs3/`.

Example: LibriSpeech 100h ASR:
```bash
cd egs3/librispeech_100/asr

python run.py \
    --stages all \
    --train_config conf/train.yaml \
    --infer_config conf/infer.yaml \
    --measure_config conf/measure.yaml
```

## Understanding Stages
The default stage order is defined in `egs3/TEMPLATE/<system>/run.py`.

Typical ASR pipeline:
- `create_dataset` (download/prepare raw data)
- `collect_stats` (compute CMVN/statistics)
- `train` (fit the model)
- `infer` (generate hypotheses)
- `measure` (compute metrics)
- `pack_model` / `upload_model` (package + upload artifacts)
You can run selected stages:
```bash
python run.py \
    --stages train infer measure \
    --train_config conf/train.yaml \
    --infer_config conf/infer.yaml \
    --measure_config conf/measure.yaml
```

### Stage-specific arguments
Stages do not accept arbitrary CLI arguments. Keep all stage settings in the YAML configs and pass the configs via `--train_config`, `--infer_config`, and `--measure_config`.
No code changes inside the system class are needed.
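Because every setting lives in a YAML file, scripted experiments edit the config files themselves. A minimal sketch using OmegaConf (Hydra's config library); the `beam_size` key is a hypothetical example, not ESPnet3's actual schema:

```python
from omegaconf import OmegaConf

# Load a stage config, change a setting, and write it back;
# then pass the file via --infer_config as usual.
# "beam_size" is an illustrative key -- use the real keys from your config.
cfg = OmegaConf.load("conf/infer.yaml")
cfg.beam_size = 20
OmegaConf.save(cfg, "conf/infer.yaml")
```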
## Implementing `src/` for your recipe
Each recipe may define custom logic inside `egs3/<recipe>/<task>/src/`.

Typical files:
- `create_dataset.py` - dataset preparation functions
- `dataset.py` - dataset builder or transform classes
- `custom_model.py` - user-defined modules referenced by Hydra configs

`run.py` automatically adds this directory to `PYTHONPATH`.
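As an illustration of the `custom_model.py` pattern, here is a hypothetical module (class name and arguments invented for the example). Because `src/` is on `PYTHONPATH`, a Hydra config can reference it with a plain import path such as `_target_: custom_model.MySubsampler`:

```python
# egs3/<recipe>/<task>/src/custom_model.py (hypothetical example)
import torch


class MySubsampler(torch.nn.Module):
    """Toy frontend projection; purely illustrative."""

    def __init__(self, in_dim: int = 80, out_dim: int = 256):
        super().__init__()
        self.proj = torch.nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, in_dim) -> (batch, time, out_dim)
        return self.proj(x)
```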
## Configurations (Hydra)

All hyperparameters live in `conf/*.yaml`.

**Important:** ESPnet3 disables CLI overrides (`--key=value`) because it relies on hierarchical config merging that conflicts with Hydra's runtime override semantics. All overrides must be written inside YAML files.
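To see why YAML-side merging composes cleanly, here is a minimal sketch of hierarchical merging using OmegaConf directly; ESPnet3's own merge logic may differ, this only illustrates the per-key semantics:

```python
from omegaconf import OmegaConf

# Two config layers, as if loaded from two YAML files.
base = OmegaConf.create({"optim": {"name": "adam", "lr": 1e-3}})
override = OmegaConf.create({"optim": {"lr": 5e-4}})

# Later layers win per key; keys they do not touch survive from earlier layers.
merged = OmegaConf.merge(base, override)
print(merged.optim.name, merged.optim.lr)  # adam 0.0005
```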
## Putting Everything Together (cloned repository workflow)
Start from `egs3/TEMPLATE/asr/run.py`. Replace `ASRSystem` if you define a custom system. Then:
```bash
cd egs3/<your_recipe>/<task>

# Dataset preparation
python run.py --stages create_dataset --train_config conf/train.yaml

# (Optional) collect_stats + training
python run.py --stages collect_stats train --train_config conf/train.yaml

# Evaluation
python run.py --stages infer measure --infer_config conf/infer.yaml --measure_config conf/measure.yaml
```

Outputs go to:

- `exp/`: training logs + checkpoints
- `infer_dir/`: inference outputs + `measures.json`
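After an evaluation run, the metrics can be inspected programmatically. A small sketch; the exact schema of `measures.json` is an assumption here, so adapt the keys to what your run actually produces:

```python
import json
from pathlib import Path

# Assumed layout: a flat mapping of metric names to values, e.g. {"wer": ...}.
measures = json.loads(Path("infer_dir/measures.json").read_text())
for name, value in measures.items():
    print(f"{name}: {value}")
```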
## Additional ESPnet3 Documentation

### Cheat sheet: what you touch vs. what's provided
| Goal | You mainly edit / run | Read next |
|---|---|---|
| Define datasets and loaders | `conf/dataset*.yaml`, DataOrganizer config | DataOrganizer and dataset pipeline |
| Configure training | `conf/train.yaml` (model, trainer, optim) | Optimizer configuration, Callbacks |
| Run multi-GPU / cluster | `conf/train.yaml` + parallel blocks | Multi-GPU / multi-node, Train config |
| Set up evaluation | `conf/infer.yaml` + `conf/measure.yaml` | Inference, Measure, Provider / Runner |
### Execution Framework

- Provider / Runner
- Parallel configuration: Parallel

### Data & Datasets

- Data preparation examples: Data preparation
- Dataset classes & sharding: DataOrganizer and dataset pipeline

### Training

- Callbacks
- Optimizers: Optimizer configuration
- Multiple optimizers/schedulers: Multiple optimizers & schedulers
- Multi-GPU & multi-node: Multi-GPU / multi-node

### Inference & Evaluation

- Runner-based decoding: Provider / Runner
- Inference pipeline: Inference
- Measurement pipeline: Measure

### Recipe Structure

- Recipe directory layout
- Systems
## Tips for Working With Recipes
- Keep configs modular: dataset / model / trainer / parallel blocks.
- Decide early whether execution is local or SLURM/cluster.
- Use import-based execution for MLOps pipelines.
- Reuse ESPnet2 model configs where possible.
