espnet3.systems.base.scp_utils.load_scp_fields
espnet3.systems.base.scp_utils.load_scp_fields(decode_dir: Path, test_name: str, inputs: List[str] | Dict[str, str], file_suffix: str = '.scp') → Dict[str, List[str]]
Load and align SCP files into a field-wise dictionary for evaluation.
This function reads all required SCP files from the given test set, validates their consistency, and returns a dictionary with:
- “utt_id”: list of sorted utterance IDs
- <alias>: list of aligned values for each key (e.g., “ref”, “hyp”)
Parameters:
- decode_dir (Path) – Root directory of decode results.
- test_name (str) – Subdirectory name of the test set (e.g., “test-clean”).
- inputs (Union[List[str], Dict[str, str]]) – SCP keys to read.
  - List[str]: alias and filename are the same.
  - Dict[str, str]: alias → filename.
- file_suffix (str, optional) – SCP file extension. Default: “.scp”
Returns: {“utt_id”: […], “ref”: […], “hyp”: […], …}
Return type: Dict[str, List[str]]
Raises: AssertionError – If SCP files are missing or utterance IDs are inconsistent.
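The loading and alignment behavior described above can be sketched as follows. This is a minimal illustration, not the ESPnet implementation: the function names `read_scp` and `load_scp_fields_sketch` are hypothetical, and it assumes each file lives at `decode_dir/test_name/<filename><file_suffix>` with one `<utt_id> <value>` pair per line.

```python
from pathlib import Path
from typing import Dict, List, Union


def read_scp(path: Path) -> Dict[str, str]:
    """Parse one SCP file (assumed format: '<utt_id> <value>' per line)."""
    entries: Dict[str, str] = {}
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line.strip():
                continue  # skip blank lines
            # Split on the first space only, so values may contain spaces.
            utt_id, _, value = line.partition(" ")
            entries[utt_id] = value.strip()
    return entries


def load_scp_fields_sketch(
    decode_dir: Path,
    test_name: str,
    inputs: Union[List[str], Dict[str, str]],
    file_suffix: str = ".scp",
) -> Dict[str, List[str]]:
    """Hypothetical re-implementation of the documented behavior."""
    # A list means alias and filename are the same.
    if isinstance(inputs, list):
        inputs = {name: name for name in inputs}

    fields: Dict[str, Dict[str, str]] = {}
    for alias, filename in inputs.items():
        # Assumed layout: <decode_dir>/<test_name>/<filename><file_suffix>
        path = Path(decode_dir) / test_name / f"{filename}{file_suffix}"
        assert path.exists(), f"Missing SCP file: {path}"
        fields[alias] = read_scp(path)

    # Every file must cover exactly the same set of utterance IDs.
    id_sets = [set(entries) for entries in fields.values()]
    assert all(ids == id_sets[0] for ids in id_sets), "Utterance IDs are inconsistent"

    # Sort IDs once, then read every field in that same order.
    utt_ids = sorted(id_sets[0]) if id_sets else []
    result: Dict[str, List[str]] = {"utt_id": utt_ids}
    for alias, entries in fields.items():
        result[alias] = [entries[utt_id] for utt_id in utt_ids]
    return result
```

Sorting the IDs once and indexing every field with the same sorted list is what guarantees that position `i` refers to the same utterance in every returned list.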
Example
>>> load_scp_fields(
... Path("decode"),
... "test-other",
... inputs={"ref": "text", "hyp": "hypothesis"},
... )
{
"utt_id": ["utt1", "utt2"],
"ref": ["the cat", "a dog"],
"hyp": ["the bat", "a log"]
}
Notes
- Utterance IDs are sorted to ensure consistent alignment.
- Useful as direct input to AbsMetrics.__call__().
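To illustrate why the aligned, field-wise layout is convenient for evaluation, the sketch below consumes a dictionary of the shape returned above. The function `exact_match_rate` is a hypothetical stand-in for a metric and is not part of ESPnet; it only assumes that all lists share the same order and length.

```python
from typing import Dict, List


def exact_match_rate(
    data: Dict[str, List[str]], ref_key: str = "ref", hyp_key: str = "hyp"
) -> float:
    """Fraction of utterances whose hypothesis equals the reference exactly."""
    refs, hyps = data[ref_key], data[hyp_key]
    # Alignment means index i in every list belongs to the same utterance.
    assert len(refs) == len(hyps) == len(data["utt_id"])
    if not refs:
        return 0.0
    matches = sum(ref == hyp for ref, hyp in zip(refs, hyps))
    return matches / len(refs)


# Same shape as the dictionary shown in the example above.
fields = {
    "utt_id": ["utt1", "utt2"],
    "ref": ["the cat", "a dog"],
    "hyp": ["the bat", "a log"],
}
score = exact_match_rate(fields)
```

Because the fields are already aligned, the metric can simply zip the lists together without looking up utterance IDs.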
