Naming Conventions
Naming matters a lot in ESPnet3.
Good names make configs easier to read, stage flows easier to follow, and code search much more reliable for both humans and coding agents.
This page summarizes the naming patterns currently used in the ESPnet3 codebase and docs.
Important
Prefer existing names over inventing new ones. If the repository already has a stable term for a concept, reuse that term across code, configs, docs, and file paths.
General rule
A good ESPnet3 name should be:
- consistent with existing code
- easy to search
- specific about its role
- stable across code, config, and docs
If a new name would create a synonym for an existing concept, do not add it.
Python naming
Follow normal Python naming first:
- functions and variables: lower_with_under
- classes: CapWords
- constants: CAPS_WITH_UNDER
- module files: lower_with_under.py
For internal helpers, use a leading underscore:
- _normalize_stage_name
- _load_inference_config
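The conventions above can be checked mechanically. The following is a minimal sketch, not an ESPnet3 API: the helper names `classify_name` and `is_internal` and the regular expressions are assumptions chosen for illustration.

```python
import re

# Illustrative patterns (assumptions, not taken from the ESPnet3 codebase).
_SNAKE = re.compile(r"_?[a-z][a-z0-9]*(?:_[a-z0-9]+)*")
_CONSTANT = re.compile(r"_?[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*")
_CAP_WORDS = re.compile(r"_?[A-Z][A-Za-z0-9]*")


def classify_name(name: str) -> str:
    """Return the naming style a Python identifier appears to follow."""
    if _SNAKE.fullmatch(name):
        return "function_or_variable"  # lower_with_under
    if _CONSTANT.fullmatch(name):
        return "constant"  # CAPS_WITH_UNDER
    if _CAP_WORDS.fullmatch(name):
        return "class"  # CapWords
    return "unknown"


def is_internal(name: str) -> bool:
    """Internal helpers are marked with a leading underscore."""
    return name.startswith("_")


for example in ("load_config_with_defaults", "ESPnetLightningModule",
                "CAPS_WITH_UNDER", "_normalize_stage_name", "loadConfig"):
    print(example, "->", classify_name(example))
```

A mixed-style name such as `loadConfig` matches none of the patterns and is reported as `unknown`, which is the case a reviewer would want to look at.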
Function names should read like actions
In practice, ESPnet3 function names usually start with a verb or an action-oriented prefix.
Good examples:
- load_config_with_defaults
- resolve_stages
- run_stages
- collect_stats
- prepare_sentences
- upload_model
Avoid vague noun-like function names such as:
- stage_config
- dataset_module
- model_state
If the function does something, the name should sound like that action.
Note
Common ESPnet3 action prefixes include load, build, run, collect, validate, prepare, infer, measure, pack, and upload.
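As a rough illustration, the prefix list in the note can be turned into a quick check. This is a sketch under assumptions: `starts_with_action` is a hypothetical helper, and because the note lists only common prefixes, a valid name such as `resolve_stages` would not be recognized. Treat a negative result as a prompt to look closer, not as an error.

```python
# Prefixes copied from the note above; the list is not exhaustive.
ACTION_PREFIXES = (
    "load", "build", "run", "collect", "validate",
    "prepare", "infer", "measure", "pack", "upload",
)


def starts_with_action(function_name: str) -> bool:
    """Check whether the first word of a snake_case name is a known action."""
    # Ignore the leading underscore used for internal helpers.
    first_word = function_name.lstrip("_").split("_", 1)[0]
    return first_word in ACTION_PREFIXES


print(starts_with_action("load_config_with_defaults"))  # action-style name
print(starts_with_action("stage_config"))               # noun-like name
```

The comparison is on the whole first word, so a noun such as `inference_config` is not mistaken for the `infer` prefix.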
Class names should read like entities
Class names should look like nouns, roles, or objects.
Good examples:
- BaseSystem
- InferenceRunner
- DatasetBuilder
- ESPnetLightningModule
Avoid class names that read like commands:
- LoadDataset
- RunInference
- BuildTrainer
If the code represents a thing, the name should read like a thing.
File names should also read like entities
Python file names should also look like nouns or role names, not verb phrases.
Good examples:
- system.py
- inference_runner.py
- dataset_builder.py
- config_utils.py
Avoid names like:
- run_stage.py
- load_config.py
- prepare_dataset.py
When a file holds a role or domain concept, name it after that concept.
Prefer inference, not decode
For new ESPnet3 code, do not introduce decode when the concept is inference output or inference-time behavior.
Historically, decode was a natural name for ASR-style pipelines. But ESPnet3 now covers more generative systems, and many of those workflows are not best described as "decoding". inference is the broader and more stable term, so new naming should prefer that direction.
Prefer names such as:
- inference
- inference_dir
- inference_config
- InferenceProvider
Avoid new names such as:
- decode_dir
- decode_config
- decode_output
This rule applies to:
- config keys
- path names
- stage-facing field names
- publication settings
- docs and examples
Warning
Older code may still contain decode in some places. Do not copy that naming into new ESPnet3 work unless you are intentionally working inside a legacy compatibility boundary.
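A small helper can suggest the preferred spelling for a snake_case name. This is a sketch only: `suggest_inference_name` is a hypothetical function, and it deliberately rewrites only the exact word `decode`, so unrelated terms such as `decoder` are left alone. Whether a given occurrence sits inside a legacy compatibility boundary still needs human judgment.

```python
def suggest_inference_name(name: str) -> str:
    """Replace the snake_case word 'decode' with 'inference'."""
    parts = name.split("_")
    # Only an exact 'decode' component is rewritten; 'decoder' is untouched.
    return "_".join("inference" if part == "decode" else part for part in parts)


for old_name in ("decode_dir", "decode_config", "decode_output", "decoder_layers"):
    print(old_name, "->", suggest_inference_name(old_name))
```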
Use snake_case for stage names and config keys
Stage names should be short verb-style snake_case names:
- create_dataset
- train_tokenizer
- collect_stats
- prepare_labels
- pack_model
Config keys and path-related fields should also use snake_case:
- training_config
- inference_dir
- output_keys
- stage_log_mapping
Do not mix styles such as camelCase or kebab-case into Python-facing config.
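One way to enforce this is to walk a loaded config and report keys that are not snake_case. The sketch below assumes the config has already been converted to plain nested dictionaries; `find_non_snake_case_keys` and the sample keys `outputKeys` and `stage-log-mapping` are made up for illustration. Keys with a leading underscore would also be flagged by this pattern.

```python
import re
from typing import Any, Dict, Iterator

_SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def find_non_snake_case_keys(config: Dict[str, Any], prefix: str = "") -> Iterator[str]:
    """Yield dotted paths of keys that are not snake_case."""
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if not _SNAKE_CASE.fullmatch(str(key)):
            yield path
        if isinstance(value, dict):
            # Recurse into nested sections, keeping the dotted path.
            yield from find_non_snake_case_keys(value, path)


sample_config = {
    "training_config": {"outputKeys": ["text"], "inference_dir": "exp/run"},
    "stage-log-mapping": {},
}
print(list(find_non_snake_case_keys(sample_config)))
```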
Imports under egs3/
Do not use relative imports inside egs3/.
Use explicit absolute module paths:
    from egs3.mini_an4.asr.dataset.builder import MiniAn4Builder

Avoid:

    from .builder import MiniAn4Builder

Absolute imports are easier to search, easier to move, and easier for coding agents to reason about.
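Relative imports can be detected with the standard library `ast` module, where an `ImportFrom` node with `level > 0` marks a relative import. The following is a minimal sketch; `find_relative_imports` is a hypothetical helper, not an existing ESPnet3 tool.

```python
import ast
from typing import List, Tuple


def find_relative_imports(source: str) -> List[Tuple[int, str]]:
    """Return (line number, module) pairs for relative imports in source code."""
    found = []
    for node in ast.walk(ast.parse(source)):
        # level counts the leading dots: 0 means an absolute import.
        if isinstance(node, ast.ImportFrom) and node.level > 0:
            found.append((node.lineno, "." * node.level + (node.module or "")))
    return sorted(found)


example_source = (
    "from egs3.mini_an4.asr.dataset.builder import MiniAn4Builder\n"
    "from .builder import MiniAn4Builder\n"
)
print(find_relative_imports(example_source))
```

Running a check like this over the files under egs3/ would list each offending line with the module it refers to.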
Name by boundary
Use names that match the boundary where the code lives.
In shared espnet3/ code:
- prefer reusable, system-level names
- avoid recipe-specific corpus names
In recipe-local egs3/<recipe>/<system>/ code:
- recipe-specific names are fine
- names should still stay consistent with the shared API
Examples:
- shared: InferenceRunner, DataOrganizer, BaseMetric
- recipe-local: MiniAn4Builder, LibriSpeechDataset
Do not create unnecessary synonyms
Naming drift usually starts when two words mean the same thing.
Examples of bad drift:
- infer vs decode
- metric vs measure used inconsistently for the same concept
- dataset_dir vs data_root vs corpus_dir for the same path
Before adding a new name, search the repository and reuse the dominant term if the meaning is the same.
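The search step can be approximated with a short script that counts whole-word occurrences of each candidate term. This is a sketch under assumptions: `count_terms` is a hypothetical helper, and the file suffixes it scans are an arbitrary choice.

```python
import re
from collections import Counter
from pathlib import Path
from typing import Iterable


def count_terms(root: str, terms: Iterable[str],
                suffixes: tuple = (".py", ".yaml", ".md")) -> Counter:
    """Count whole-word occurrences of each term in files under root."""
    terms = list(terms)
    counts = Counter({term: 0 for term in terms})
    patterns = {term: re.compile(rf"\b{re.escape(term)}\b") for term in terms}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            for term, pattern in patterns.items():
                counts[term] += len(pattern.findall(text))
    return counts


if __name__ == "__main__":
    # Example: compare three candidate names in the current directory.
    result = count_terms(".", ["dataset_dir", "data_root", "corpus_dir"])
    print(result.most_common())
```

The term with the highest count is the likely dominant name, although a quick read of the matches is still needed to confirm that the meaning is the same.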
Avoid unnecessary abbreviations
Prefer full words unless the abbreviated form is already widely understood in the field or already well established in ESPnet3.
Good:
- collect_stats
- inference_dir
- config_utils
Usually avoid:
- prep_data
- cfg_loader
- infer_out
Prefer:
- prepare_data
- config_loader
- inference_output
Short forms are fine when they are already common and unambiguous. stats is a good example. Ad-hoc shortening is not.
Note
If a shortened name saves only a few characters but makes the meaning less obvious, use the full word.
A practical naming workflow
Before introducing a new name:
- Search for the concept in espnet3/ and egs3/.
- Reuse an existing name if one already matches.
- Check whether the name matches the expected Python style.
- Check whether the name is specific to shared code or recipe-local code.
- Verify the actual file or directory name before writing links or imports.
Examples
Good:
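    # Imports needed by the annotations below.
    from pathlib import Path
    from omegaconf import DictConfig

    def load_inference_config(path: Path) -> DictConfig:
        ...

    class InferenceProvider:
        ...

Good:

    stage_log_mapping = {
        "infer": "inference_config.inference_dir",
        "measure": "metrics_config.inference_dir",
    }

Bad:

    # Noun-like function name and command-like class name.
    def config_loader(path: Path) -> DictConfig:
        ...

    class RunInference:
        ...

Bad:

    # New decode naming and a relative import.
    decode_dir = "exp/my_run/decode"
    from .builder import MyBuilder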