espnet3.systems.asr.system.ASRSystem
class espnet3.systems.asr.system.ASRSystem(, train_config: DictConfig | None = None, infer_config: DictConfig | None = None, metric_config: DictConfig | None = None, publish_config: DictConfig | None = None, demo_config: DictConfig | None = None, demo_config_path: Path | None = None)
Bases: BaseSystem
ASR-specific system.
This system adds:
- Tokenizer training inside train()
Initialize the system with optional stage configs.
create_dataset(*args, **kwargs)
Create datasets using the configured helper function.
The callable is resolved from train_config.create_dataset.func and invoked with the remaining configuration values.
- Raises: RuntimeError – If the configuration does not specify a function.
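The resolve-and-invoke pattern described above can be sketched as follows. This is a minimal illustration, not the ESPnet3 implementation: it assumes `func` holds a dotted import path and uses a plain dict in place of a DictConfig. The helper names `resolve_callable` and `run_create_dataset` are hypothetical, and `json.dumps` stands in for a real dataset helper.

```python
import importlib
from typing import Any, Callable, Mapping


def resolve_callable(dotted_path: str) -> Callable[..., Any]:
    """Import 'package.module.attr' and return the attribute (hypothetical helper)."""
    module_name, _, attr = dotted_path.rpartition(".")
    if not module_name:
        raise RuntimeError(f"Not a dotted path: {dotted_path!r}")
    return getattr(importlib.import_module(module_name), attr)


def run_create_dataset(create_dataset_cfg: Mapping) -> Any:
    """Call the function named by 'func', passing the remaining keys as kwargs."""
    func_path = create_dataset_cfg.get("func")
    if func_path is None:
        # Mirrors the documented RuntimeError when no function is configured.
        raise RuntimeError("create_dataset.func is not set")
    kwargs = {k: v for k, v in create_dataset_cfg.items() if k != "func"}
    return resolve_callable(func_path)(**kwargs)


# Stand-in config: json.dumps plays the role of the dataset helper.
result = run_create_dataset(
    {"func": "json.dumps", "obj": {"split": "train"}, "sort_keys": True}
)
print(result)
```

In this sketch, every key other than `func` becomes a keyword argument, so the helper's signature has to match the configuration keys.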
get_stage_log_dir(stage: str) → Path
Return stage-specific log directories when configured.
The ASR system routes logs to artifact directories when available:
- create_dataset: train_config.create_dataset.dataset_dir or train_config.dataset_dir or train_config.data_dir.
- train_tokenizer: train_config.tokenizer.save_path.
- collect_stats: train_config.stats_dir.
- train/publish: train_config.exp_dir.
- infer: infer_config.infer_dir.
- measure: metric_config.infer_dir or infer_config.infer_dir.
If none of the stage-specific paths are configured, it falls back to BaseSystem.get_stage_log_dir (train_config.exp_dir or <cwd>/logs).
- Parameters: stage (str) – Stage name being executed.
- Returns: Directory where the stage log should be placed.
- Return type: Path
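The lookup order described for this method can be sketched as a priority list per stage, with a shared fallback. This is an illustration under assumptions, not the actual method: plain dicts stand in for the DictConfig objects, and `lookup`, `TRAIN_CANDIDATES`, and `stage_log_dir` are hypothetical names.

```python
from collections.abc import Mapping
from pathlib import Path
from typing import Any, Optional


def lookup(cfg: Optional[Mapping], dotted_key: str) -> Any:
    """Walk a nested mapping by a dotted key; return None if any part is missing."""
    node: Any = cfg
    for part in dotted_key.split("."):
        if not isinstance(node, Mapping) or node.get(part) is None:
            return None
        node = node[part]
    return node


# Keys read from train_config, in the priority order listed in the text.
TRAIN_CANDIDATES = {
    "create_dataset": ["create_dataset.dataset_dir", "dataset_dir", "data_dir"],
    "train_tokenizer": ["tokenizer.save_path"],
    "collect_stats": ["stats_dir"],
    "train": ["exp_dir"],
    "publish": ["exp_dir"],
}


def stage_log_dir(stage, train_cfg=None, infer_cfg=None, metric_cfg=None) -> Path:
    """Return the first configured directory for a stage, else the fallback."""
    if stage == "infer":
        candidates = [(infer_cfg, "infer_dir")]
    elif stage == "measure":
        candidates = [(metric_cfg, "infer_dir"), (infer_cfg, "infer_dir")]
    else:
        candidates = [(train_cfg, key) for key in TRAIN_CANDIDATES.get(stage, [])]
    for cfg, key in candidates:
        value = lookup(cfg, key)
        if value is not None:
            return Path(value)
    # Fallback described for the base class: exp_dir, else <cwd>/logs.
    fallback = lookup(train_cfg, "exp_dir")
    return Path(fallback) if fallback is not None else Path.cwd() / "logs"


train_cfg = {"create_dataset": {}, "data_dir": "data", "exp_dir": "exp"}
print(stage_log_dir("create_dataset", train_cfg))
print(stage_log_dir("measure", train_cfg, infer_cfg={"infer_dir": "inf"}))
```

Keeping the candidates in a table makes the precedence explicit, which matters for stages such as create_dataset and measure that list more than one source.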
pack_model(*args, **kwargs)
Pack model artifacts into an espnet3 bundle.
train(*args, **kwargs)
Train the model, training the tokenizer first if needed.
This stage checks for a cached tokenizer model and runs tokenizer training before delegating to the base training routine.
- Raises: RuntimeError – If train_config.dataset_dir is not set.
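The control flow described for train() can be sketched as a cache check followed by delegation. This is a sketch under stated assumptions rather than the real method: the `tokenizer_model` path and the two callables are hypothetical stand-ins for the tokenizer stage and the base training routine, and the order of the checks is assumed.

```python
from pathlib import Path
from typing import Callable, Mapping


def train_with_tokenizer(
    train_cfg: Mapping,
    tokenizer_model: Path,
    train_tokenizer: Callable[[], None],
    base_train: Callable[[], None],
) -> None:
    """Train the tokenizer only if no cached model exists, then train the model."""
    if train_cfg.get("dataset_dir") is None:
        # Mirrors the documented RuntimeError for a missing dataset_dir.
        raise RuntimeError("train_config.dataset_dir is not set")
    if not tokenizer_model.exists():
        train_tokenizer()  # No cached tokenizer model: build it first.
    base_train()  # Delegate to the base training routine.
```

With this shape, a second call finds the tokenizer model on disk and goes straight to model training.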
train_tokenizer(*args, **kwargs)
Train a SentencePiece tokenizer based on configured text.
The text builder configured in train_config.tokenizer.text_builder is used to generate training text, which is then saved and consumed by the SentencePiece trainer.
- Raises: RuntimeError – If required tokenizer config is missing or invalid.
