espnet3.systems.asr.system.ASRSystem
class espnet3.systems.asr.system.ASRSystem(training_config: DictConfig | None = None, inference_config: DictConfig | None = None, metrics_config: DictConfig | None = None, publication_config: DictConfig | None = None, stage_log_mapping: dict | None = None, demo_config: DictConfig | None = None)
Bases: BaseSystem
ASR-specific system.
This system adds:
- Tokenizer training inside train()
Additional stage log paths:
- train_tokenizer -> training_config.tokenizer.save_path
Initialize the ASR system with optional stage configs.
- Parameters:
- training_config – Training configuration.
- inference_config – Inference configuration.
- metrics_config – Measurement configuration.
- publication_config – Publication configuration for model packing and upload stages.
- stage_log_mapping – Optional per-stage log directory overrides.
- demo_config – Demo configuration for demo packing and upload stages.
train(*args, **kwargs)
Train the model, training the tokenizer first if needed.
This stage checks for a cached tokenizer model and runs tokenizer training before delegating to the base training routine.
- Raises: RuntimeError – If neither dataset references nor dataset_dir exist.
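The tokenizer-first flow described for train() can be pictured with a small stand-alone sketch. It does not use ESPnet code: `SketchBaseSystem` and `SketchASRSystem` are hypothetical stand-ins, the config is a plain dict rather than a DictConfig, and the cached model file name `tokenizer.model` is an assumption made only for illustration.

```python
from pathlib import Path


class SketchBaseSystem:
    """Hypothetical stand-in for BaseSystem; it only records that training ran."""

    def __init__(self, training_config=None):
        self.training_config = training_config
        self.calls = []

    def train(self, *args, **kwargs):
        self.calls.append("train")


class SketchASRSystem(SketchBaseSystem):
    """Hypothetical stand-in for ASRSystem showing the tokenizer-first flow."""

    def _tokenizer_model(self):
        # Assumption: the cached model is a file named "tokenizer.model"
        # under training_config["tokenizer"]["save_path"].
        save_path = self.training_config["tokenizer"]["save_path"]
        return Path(save_path) / "tokenizer.model"

    def train_tokenizer(self, *args, **kwargs):
        model = self._tokenizer_model()
        model.parent.mkdir(parents=True, exist_ok=True)
        # A real trainer would write the model here; this is a placeholder.
        model.write_text("placeholder model")
        self.calls.append("train_tokenizer")

    def train(self, *args, **kwargs):
        # Train the tokenizer only when no cached model is found.
        if not self._tokenizer_model().exists():
            self.train_tokenizer()
        # Then delegate to the base training routine.
        return super().train(*args, **kwargs)
```

Calling `train()` twice on the same save path runs the tokenizer step only on the first call in this sketch, since the second call finds the cached file.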
train_tokenizer(*args, **kwargs)
Train a SentencePiece tokenizer based on configured text.
The text builder configured in training_config.tokenizer.text_builder is used to generate training text, which is then saved and consumed by the SentencePiece trainer.
- Raises: RuntimeError – If the required tokenizer config is missing or invalid.
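As a rough illustration of the steps above (build text, save it, hand it to the SentencePiece trainer), here is a minimal sketch under several assumptions: the config is a plain dict, `text_builder` is treated as a callable returning lines of text, the file name `train.txt`, the model prefix `tokenizer`, and the `vocab_size` and `model_type` keys are all hypothetical. Only `sentencepiece.SentencePieceTrainer.train` is a real library call; how ESPnet actually wires these pieces together may differ.

```python
from pathlib import Path


def dump_training_text(tokenizer_cfg):
    """Write text from the configured builder to disk and return the file path."""
    for key in ("text_builder", "save_path"):
        if key not in tokenizer_cfg:
            raise RuntimeError(f"tokenizer config is missing '{key}'")
    save_dir = Path(tokenizer_cfg["save_path"])
    save_dir.mkdir(parents=True, exist_ok=True)
    text_file = save_dir / "train.txt"  # hypothetical file name
    # Assumption: text_builder is a callable that yields lines of text.
    lines = [line.strip() for line in tokenizer_cfg["text_builder"]()]
    lines = [line for line in lines if line]  # drop blank lines
    if not lines:
        raise RuntimeError("text builder produced no training text")
    text_file.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return text_file


def sentencepiece_kwargs(tokenizer_cfg, text_file):
    """Collect keyword arguments for the SentencePiece trainer."""
    save_dir = Path(tokenizer_cfg["save_path"])
    return {
        "input": str(text_file),
        "model_prefix": str(save_dir / "tokenizer"),  # hypothetical prefix
        "vocab_size": tokenizer_cfg.get("vocab_size", 5000),  # assumed key
        "model_type": tokenizer_cfg.get("model_type", "bpe"),  # assumed key
    }


def train_sentencepiece(kwargs):
    """Run the trainer; requires the sentencepiece package to be installed."""
    import sentencepiece as spm

    spm.SentencePieceTrainer.train(**kwargs)
```

A caller would chain the three functions: `text_file = dump_training_text(cfg)`, then `train_sentencepiece(sentencepiece_kwargs(cfg, text_file))`. Raising RuntimeError for a missing key mirrors the documented failure mode, though the exact checks in the library are not specified here.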
