espnet2.fileio.read_text.read_multi_columns_text
espnet2.fileio.read_text.read_multi_columns_text(path: Path | str, return_unsplit: bool = False) → Tuple[Dict[str, List[str]], Dict[str, str] | None]
Read a text file with two or more columns into a dict object.
This function reads a text file where each line contains a key followed by one or more values separated by whitespace. The function returns a dictionary where the keys are the first column values, and the values are lists of strings representing the remaining columns. Optionally, it can also return the unsplit raw values.
- Parameters:
- path (Union[Path, str]) – The path to the text file to be read.
- return_unsplit (bool) – If True, return a second dictionary with unsplit values (default is False).
- Returns: A tuple containing:
- A dictionary where keys are the first column values and values are lists of strings for the remaining columns.
- An optional dictionary with the unsplit raw values if return_unsplit is True; otherwise, None.
- Return type: Tuple[Dict[str, List[str]], Optional[Dict[str, str]]]
- Raises: RuntimeError – If a key is duplicated in the input file.
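The behaviour described above can be sketched as a small standalone function. This is an illustrative stand-in written from the description on this page, not ESPnet's actual implementation: the name `read_multi_columns_sketch` is hypothetical, and the handling of blank lines (skipped) and of lines that contain only a key (mapped to an empty list) are assumptions that the page does not specify.

```python
from pathlib import Path
from typing import Dict, List, Optional, Tuple, Union


def read_multi_columns_sketch(
    path: Union[Path, str], return_unsplit: bool = False
) -> Tuple[Dict[str, List[str]], Optional[Dict[str, str]]]:
    """Hypothetical re-creation of the documented behaviour (not ESPnet code)."""
    data: Dict[str, List[str]] = {}
    # The second element of the result is a dict only when requested.
    unsplit: Optional[Dict[str, str]] = {} if return_unsplit else None

    with Path(path).open("r", encoding="utf-8") as f:
        for linenum, line in enumerate(f, 1):
            # Split off the key; everything after it is the raw value string.
            parts = line.strip().split(maxsplit=1)
            if not parts:
                continue  # assumption: blank lines are skipped
            key = parts[0]
            raw = parts[1] if len(parts) == 2 else ""  # assumption: key-only line
            if key in data:
                raise RuntimeError(f"{key} is duplicated ({path}:{linenum})")
            data[key] = raw.split()  # remaining columns, split on whitespace
            if unsplit is not None:
                unsplit[key] = raw
    return data, unsplit


if __name__ == "__main__":
    import tempfile

    # Write a small two-line file and read it back with both settings.
    with tempfile.TemporaryDirectory() as tmp:
        scp = Path(tmp) / "wav.scp"
        scp.write_text(
            "key1 /some/path/a1.wav /some/path/a2.wav\n"
            "key3 /some/path/c1.wav\n",
            encoding="utf-8",
        )
        print(read_multi_columns_sketch(scp))
        print(read_multi_columns_sketch(scp, return_unsplit=True))
```

Because the result is always a two-element tuple, callers that only need the split values can unpack it as `data, _ = read_multi_columns_sketch(path)`.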
Examples
Given a file 'wav.scp' with the following content:

    key1 /some/path/a1.wav /some/path/a2.wav
    key2 /some/path/b1.wav /some/path/b2.wav /some/path/b3.wav
    key3 /some/path/c1.wav

>>> read_multi_columns_text('wav.scp')
({'key1': ['/some/path/a1.wav', '/some/path/a2.wav'],
  'key2': ['/some/path/b1.wav', '/some/path/b2.wav',
           '/some/path/b3.wav'],
  'key3': ['/some/path/c1.wav']},
 None)
If return_unsplit is True:
>>> read_multi_columns_text('wav.scp', return_unsplit=True)
({'key1': ['/some/path/a1.wav', '/some/path/a2.wav'],
  'key2': ['/some/path/b1.wav', '/some/path/b2.wav',
           '/some/path/b3.wav'],
  'key3': ['/some/path/c1.wav']},
 {'key1': '/some/path/a1.wav /some/path/a2.wav',
  'key2': '/some/path/b1.wav /some/path/b2.wav /some/path/b3.wav',
  'key3': '/some/path/c1.wav'})