espnet3.components.data.dataset_module.parse_dataset_reference_config
espnet3.components.data.dataset_module.parse_dataset_reference_config(config: Mapping[str, Any] | DictConfig)
Extract dataset source fields from one dataset config entry.
- Parameters: config – Dataset entry mapping. Expected keys are listed below.
  - data_src: optional dataset source reference.
  - data_src_args: optional constructor kwargs for Dataset.
- Returns: Tuple of normalized (data_src, data_src_args).
- Raises:
  - TypeError – If data_src is neither a string nor None.
  - ValueError – If data_src_args is incompatible with dict(...).
Notes
Only data_src_args is forwarded to the dataset constructor. Top-level config fields such as name and transform are ignored here.
Examples
>>> parse_dataset_reference_config(
... {"data_src": "mini_an4/asr", "data_src_args": {"split": "train"}}
... )
('mini_an4/asr', {'split': 'train'})
>>> parse_dataset_reference_config({"data_src_args": {"split": "test"}})
(None, {'split': 'test'})
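
The contract described above can be approximated by the following minimal sketch. It is not the ESPnet3 implementation: the name parse_dataset_reference_sketch is hypothetical, and two behaviors are assumptions rather than documented facts: a missing data_src_args is normalized to an empty dict, and any failure of dict(...) is re-raised as ValueError.

```python
from collections.abc import Mapping
from typing import Any, Optional


def parse_dataset_reference_sketch(
    config: Mapping[str, Any],
) -> tuple[Optional[str], dict[str, Any]]:
    """Hypothetical stand-in that mirrors the documented contract."""
    # data_src is optional, but when present it must be a string.
    data_src = config.get("data_src")
    if data_src is not None and not isinstance(data_src, str):
        raise TypeError(
            f"data_src must be a string or None, got {type(data_src).__name__}"
        )

    raw_args = config.get("data_src_args")
    if raw_args is None:
        # Assumption: absent kwargs are normalized to an empty dict.
        return data_src, {}

    try:
        # Copy into a plain dict so the caller's mapping is not shared.
        data_src_args = dict(raw_args)
    except (TypeError, ValueError) as exc:
        # Assumption: every dict(...) failure surfaces as ValueError.
        raise ValueError("data_src_args is incompatible with dict(...)") from exc

    # Other top-level keys (e.g. name, transform) are deliberately ignored.
    return data_src, data_src_args
```

In this sketch, an entry such as {"data_src": "mini_an4/asr", "name": "train_set"} yields ('mini_an4/asr', {}), since name is not read and no constructor kwargs were given.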