espnet3.components.data.dataset_module.instantiate_dataset_reference
espnet3.components.data.dataset_module.instantiate_dataset_reference(config: Mapping[str, Any] | DictConfig, recipe_dir: str | Path | None = None)
Instantiate a dataset class from a dataset entry config.
- Parameters:
  - config – Dataset entry mapping. Expected keys are listed below.
    - data_src: optional dataset source reference.
    - data_src_args: optional constructor kwargs for Dataset.
  - recipe_dir – Recipe root used when resolving local modules.
- Returns: Instantiated dataset object from module.Dataset(**data_src_args).
- Raises:
  - AttributeError – If the target module does not expose Dataset.
  - ModuleNotFoundError – If dataset module resolution fails.
  - TypeError – If data_src_args is incompatible with the dataset constructor.
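The three exception types listed above correspond to ordinary Python failure modes. The sketch below reproduces each one with standard-library stand-ins only. It does not call espnet3, and the `Dataset` class, the `failure_name` helper, and the module name `no_such_dataset_module_xyz` are hypothetical names introduced for illustration; how `instantiate_dataset_reference` resolves modules internally is not shown here.

```python
import importlib
from types import SimpleNamespace


class Dataset:
    """Hypothetical dataset that accepts only a `split` argument."""

    def __init__(self, split: str = "train"):
        self.split = split


def failure_name(action) -> str:
    """Run `action`; return the raised exception's class name, or "ok"."""
    try:
        action()
    except (AttributeError, ModuleNotFoundError, TypeError) as exc:
        return type(exc).__name__
    return "ok"


# A module-like object that does not define Dataset.
no_dataset = failure_name(lambda: SimpleNamespace().Dataset)

# A module name that cannot be imported (made-up name).
missing_module = failure_name(
    lambda: importlib.import_module("no_such_dataset_module_xyz")
)

# Constructor kwargs the dataset does not accept (misspelled key).
bad_kwargs = failure_name(lambda: Dataset(**{"splt": "train"}))

# Valid kwargs construct the object without error.
good_kwargs = failure_name(lambda: Dataset(**{"split": "train"}))
```

A caller that wants to distinguish a misconfigured `data_src` from bad `data_src_args` could catch these exception types separately around the real call.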
Notes
Only data_src_args is forwarded to the dataset constructor. Top-level config fields other than data_src_args are not passed through.
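The forwarding rule in this note can be illustrated with a minimal sketch. This is not the espnet3 implementation: `instantiate_sketch`, the stand-in `Dataset` class, `fake_module`, and the extra `name` key are all hypothetical, and the module is passed in directly instead of being resolved from `data_src` and `recipe_dir`.

```python
from types import SimpleNamespace
from typing import Any, Mapping


class Dataset:
    """Hypothetical stand-in for a recipe's Dataset class."""

    def __init__(self, split: str = "train"):
        self.split = split


# Hypothetical stand-in for the already-resolved dataset module.
fake_module = SimpleNamespace(Dataset=Dataset)


def instantiate_sketch(config: Mapping[str, Any], module: Any) -> Any:
    """Illustrative only: forward data_src_args and nothing else."""
    # Missing or None data_src_args is treated as "no kwargs" (an assumption).
    kwargs = dict(config.get("data_src_args") or {})
    return module.Dataset(**kwargs)


cfg = {
    "data_src": "mini_an4/asr",
    "name": "not forwarded",  # top-level key, never reaches the constructor
    "data_src_args": {"split": "test"},
}
ds = instantiate_sketch(cfg, fake_module)
```

Because top-level keys such as `name` are never passed through, anything the dataset constructor needs has to be placed under `data_src_args`.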
Examples
>>> cfg = {"data_src": "mini_an4/asr", "data_src_args": {"split": "train"}}
>>> ds = instantiate_dataset_reference(cfg, recipe_dir="egs3/mini_an4/asr")
>>> hasattr(ds, "__len__")
True

Local dataset module loading.

>>> cfg = {"data_src_args": {"split": "test"}}
>>> _ = instantiate_dataset_reference(cfg, recipe_dir="egs3/mini_an4/asr")
