ESPnet3

Speech Foundation Model

Research Platform

Train · Evaluate · Scale

Package Manager

$ pip install espnet

9.4k GitHub stars

140+ languages

Apache 2.0 license

Recipes

Clone a recipe, run in minutes

Pick a built-in recipe and clone it locally — configs, data pipeline, and training script all included.

$ espnet3 clone librispeech/asr --project my_project

Creates my_project/ with the full recipe — edit & run immediately.

Guides

What is a recipe

See how one ESPnet3 recipe holds configs, stages, dataset code, and outputs in one project.

Finetuning

Adapt a cloned recipe to your own dataset, model, and training loop.

Migrating from ESPnet2

Move an ESPnet2 recipe to ESPnet3 and understand the structure, config, and parallel changes.

From other toolkits

Read the high-level migration notes if your reference point is Hugging Face, NeMo, or SpeechBrain.

Research Pipeline

Stages

Click a stage to see the details.

Creating Dataset

Data loading & preprocessing

LibriSpeech · Custom

dataset:
  train:
    data_src: librispeech/asr
    data_src_args:
      split: train-clean-100
  valid:
    data_src: librispeech/asr
    data_src_args:
      split: dev-clean
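
As a rough illustration of how a `data_src` name and its `data_src_args` could be turned into dataset objects, the sketch below uses a small registry. `BUILDERS`, `register`, `build_librispeech`, and `build_datasets` are names invented for this example and are not ESPnet3 APIs; the config is written as a plain dict that mirrors the YAML above.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry: maps a `data_src` string to a builder function.
BUILDERS: Dict[str, Callable[..., List[dict]]] = {}


def register(name: str):
    """Decorator that stores a builder under the given `data_src` name."""
    def wrap(fn):
        BUILDERS[name] = fn
        return fn
    return wrap


@register("librispeech/asr")
def build_librispeech(split: str) -> List[dict]:
    # Stand-in only: a real builder would read audio and transcripts.
    return [{"utt_id": f"{split}-0001", "split": split}]


def build_datasets(config: Dict[str, Any]) -> Dict[str, List[dict]]:
    """Call the registered builder for every entry under `dataset`."""
    datasets = {}
    for name, entry in config["dataset"].items():
        builder = BUILDERS[entry["data_src"]]
        datasets[name] = builder(**entry.get("data_src_args", {}))
    return datasets


# Same structure as the YAML above, expressed as a dict.
config = {
    "dataset": {
        "train": {
            "data_src": "librispeech/asr",
            "data_src_args": {"split": "train-clean-100"},
        },
        "valid": {
            "data_src": "librispeech/asr",
            "data_src_args": {"split": "dev-clean"},
        },
    }
}

datasets = build_datasets(config)
```

Keeping the source name separate from its arguments means a different split can be selected by editing `data_src_args` alone.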

Collect Stats

Feature statistics & shape collection

No configuration required
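
The stage needs no configuration, but it may help to see what kind of values such a pass could produce. The following is a minimal sketch, assuming the stage records each utterance's feature shape and accumulates a global per-dimension mean and variance. `collect_stats` and its input layout are hypothetical and not taken from ESPnet3.

```python
from typing import Dict, List, Tuple

Frames = List[List[float]]  # one utterance: frames x feature dimensions


def collect_stats(
    features: Dict[str, Frames],
) -> Tuple[Dict[str, Tuple[int, int]], List[float], List[float]]:
    """Return per-utterance shapes plus global mean and variance per dimension."""
    shapes: Dict[str, Tuple[int, int]] = {}
    count = 0
    total: List[float] = []
    total_sq: List[float] = []

    for utt_id, frames in features.items():
        dim = len(frames[0])
        shapes[utt_id] = (len(frames), dim)
        if not total:
            total = [0.0] * dim
            total_sq = [0.0] * dim
        for frame in frames:
            count += 1
            for i, x in enumerate(frame):
                total[i] += x
                total_sq[i] += x * x

    if count == 0:
        raise ValueError("no frames to collect statistics from")

    mean = [s / count for s in total]
    # Population variance: E[x^2] - E[x]^2
    var = [sq / count - m * m for sq, m in zip(total_sq, mean)]
    return shapes, mean, var


# Toy features: two utterances with 2-dimensional frames.
toy = {"utt_a": [[1.0, 2.0], [3.0, 4.0]], "utt_b": [[5.0, 6.0]]}
shapes, mean, var = collect_stats(toy)
```

Accumulating sums rather than storing all frames keeps memory use independent of dataset size.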

Training

Distributed model training

model · optimizers · trainer · dataloader

model:
  _target_: espnet3.systems.asr.task.ASRTask
  token_type: bpe
  ctc_weight: 0.3
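
The `_target_` key resembles the Hydra-style convention in which a dotted path names the class to construct and the remaining keys become keyword arguments. Assuming ESPnet3 resolves it along those lines, the idea can be sketched with the standard library. `instantiate` here is a simplified stand-in written for illustration, not Hydra's or ESPnet3's implementation, and `datetime.timedelta` is only a placeholder target so the example runs without ESPnet3 installed.

```python
import importlib
from typing import Any, Mapping


def instantiate(config: Mapping[str, Any]) -> Any:
    """Import the class named by `_target_` and call it with the other keys."""
    kwargs = dict(config)
    target = kwargs.pop("_target_")
    module_name, _, attr = target.rpartition(".")
    cls = getattr(importlib.import_module(module_name), attr)
    return cls(**kwargs)


# Placeholder target from the standard library, standing in for a model class.
delta = instantiate({"_target_": "datetime.timedelta", "hours": 1, "minutes": 30})
```

With this pattern, swapping the model would amount to changing the dotted path and its arguments in YAML, without touching the training script.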

Inference

Decoding & beam search

dataset (single) · dataset (multi) · model

dataset:
  test:
    eval1:
      data_src: librispeech/asr
      data_src_args:
        split: test-clean

Metrics

Metrics & benchmark scoring

WER · Custom

metrics:
  - _target_: espnet3.systems.asr.metrics.wer.WER
    normalize: true
    remove_whitespace: false
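
For readers new to the metric, word error rate is the word-level edit distance between reference and hypothesis divided by the number of reference words. The sketch below is a generic implementation, not the code behind `espnet3.systems.asr.metrics.wer.WER`. The `normalize` flag is assumed here to mean lowercasing and stripping punctuation, and the `remove_whitespace` option is not modeled.

```python
import re
from typing import List, Sequence


def normalize_text(text: str) -> str:
    """Assumed normalization: lowercase, drop punctuation, collapse spaces."""
    text = re.sub(r"[^\w\s']", "", text.lower())
    return " ".join(text.split())


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance over tokens, using a rolling row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def wer(refs: List[str], hyps: List[str], normalize: bool = True) -> float:
    """Corpus-level WER: total word errors over total reference words."""
    errors = 0
    words = 0
    for ref, hyp in zip(refs, hyps):
        if normalize:
            ref, hyp = normalize_text(ref), normalize_text(hyp)
        ref_tokens, hyp_tokens = ref.split(), hyp.split()
        errors += edit_distance(ref_tokens, hyp_tokens)
        words += len(ref_tokens)
    if words == 0:
        raise ValueError("references contain no words")
    return errors / words


# One deleted word out of six reference words.
score = wer(["the cat sat on the mat"], ["The cat sat on mat."])
```

Summing errors and words across the corpus before dividing weights long utterances more than short ones, which is the usual convention for reported WER.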

Publication

Pack & upload model and results

pack-model · upload-model

pack_model:
  out_dir: ${exp_dir}/model_pack
  include:
    - ${recipe_dir}/src
  exclude:
    - last.ckpt
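
One way to read this config: `${...}` placeholders are filled in first, then every file under the `include` paths is collected unless its name appears in `exclude`. The sketch below follows that reading and is an assumption about the semantics, not ESPnet3's packing code. `resolve` and `select_files` are hypothetical helpers, and matching `exclude` by exact file name is a simplification.

```python
import re
from pathlib import Path
from typing import Dict, Iterable, List


def resolve(value: str, variables: Dict[str, str]) -> str:
    """Replace each ${name} placeholder with its value from `variables`."""
    return re.sub(r"\$\{(\w+)\}", lambda m: variables[m.group(1)], value)


def select_files(
    include: Iterable[str],
    exclude: Iterable[str],
    variables: Dict[str, str],
) -> List[Path]:
    """Collect files under the include paths, skipping excluded file names."""
    excluded = set(exclude)
    seen = set()
    selected: List[Path] = []
    for entry in include:
        root = Path(resolve(entry, variables))
        candidates = [root] if root.is_file() else sorted(root.rglob("*"))
        for path in candidates:
            if not path.is_file() or path.name in excluded or path in seen:
                continue
            seen.add(path)
            selected.append(path)
    return selected
```

A call such as `select_files(["${recipe_dir}/src"], ["last.ckpt"], {"recipe_dir": "my_project"})` would then list the source files while leaving the checkpoint out of the pack.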

Demo

Pack & upload interactive demo

inputs · outputs

inputs:
  - name: speech
    type: audio
  - name: prompt
    type: text

Research at Scale

Key features

Built for the foundation model era — distributed, multilingual, reproducible.

Parallel

See providers, runners, and Dask-backed execution patterns.

Config

Understand the YAML surface for training, inference, metrics, demo, and parallel.

Datasets

Read how recipe-local datasets, builders, and DataOrganizer fit together.

Demo

Follow the packaged demo flow from UI definition to runtime and packing.

Components

Read the reusable data, trainer, model, metrics, and optimizer layers.

Stages

Start from the stage map for training, inference, metrics, and publication.

Community

Contribute to ESPnet3

Add a recipe, extend the framework, or improve the docs — contributions of all sizes are welcome.

Dev setup

Get your local environment ready — editable install, linting, and test runner.

CI & pull requests

Understand the CI pipeline and what reviewers look for in a PR.

Add a stage

Extend the research pipeline with a new processing or evaluation stage.

Core overview

Read the architecture docs before changing Systems, components, parallel, or config behavior.

Writing docs

Edit Markdown, use Vue components, and understand the current VuePress setup.

Docstring guide

Write docs that are consistent, searchable, and auto-rendered.
