espnet2.fileio.sound_scp.SoundScpReader
class espnet2.fileio.sound_scp.SoundScpReader(fname, dtype=None, always_2d: bool = False, multi_columns: bool = False, concat_axis=1)
Bases: Mapping
Reader class for ‘wav.scp’.
This class reads a ‘wav.scp’ file which contains mappings of keys to audio file paths. It can handle both single and multi-column entries for audio files. The multi-column option allows for concatenating multiple audio files associated with a single key.
fname
The path to the ‘wav.scp’ file.
- Type: str
dtype
Data type for the audio data.
- Type: Optional[str]
always_2d
If True, ensures the output is always 2-dimensional.
- Type: bool
multi_columns
If True, enables reading of multi-column entries.
- Type: bool
concat_axis
Axis along which to concatenate audio data.
- Type: int
Parameters:
- fname (str) – The path to the ‘wav.scp’ file.
- dtype (Optional[str]) – Data type for audio data (default: None).
- always_2d (bool) – Ensure output is always 2D (default: False).
- multi_columns (bool) – Enable reading of multi-column entries (default: False).
- concat_axis (int) – Axis for concatenation of audio data (default: 1).
######### Examples
wav.scp is a text file that looks like the following:
key1 /some/path/a.wav
key2 /some/path/b.wav
key3 /some/path/c.wav
key4 /some/path/d.wav
>>> reader = SoundScpReader('wav.scp')
>>> rate, array = reader['key1']
If multi_columns=True is given and a line lists multiple files separated by spaces, the arrays read from those files are concatenated along the channel direction:
key1 /some/path/a.wav /some/path/a2.wav
key2 /some/path/b.wav /some/path/b2.wav
>>> reader = SoundScpReader('wav.scp', multi_columns=True)
>>> rate, array = reader['key1']
In the above case, a.wav and a2.wav are concatenated.
Note that with multi_columns=True, SoundScpReader still supports a normal wav.scp, i.e., one wav file per line. The option is disabled by default because it requires keeping a dict[str, list[str]] object, which increases memory usage.
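To make the two file layouts concrete, the following is a minimal sketch of how such a file could be parsed into a mapping. It is not ESPnet's implementation: `parse_wav_scp` is a hypothetical helper written for illustration, and it only handles paths (no audio is read). It shows why the multi-column form keeps a list per key, which is the extra memory cost mentioned above.

```python
from typing import Dict, List, Union


def parse_wav_scp(
    text: str, multi_columns: bool = False
) -> Dict[str, Union[str, List[str]]]:
    """Hypothetical helper: map each key to a path, or to a list of paths."""
    table: Dict[str, Union[str, List[str]]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if multi_columns:
            # A key followed by one or more space-separated paths.
            key, *paths = line.split()
            if not paths:
                raise ValueError(f"no path given for key {key!r}")
            table[key] = paths
        else:
            # A key followed by a single path (the rest of the line).
            key, path = line.split(maxsplit=1)
            table[key] = path
    return table


single = parse_wav_scp("key1 /some/path/a.wav\nkey2 /some/path/b.wav\n")
multi = parse_wav_scp(
    "key1 /some/path/a.wav /some/path/a2.wav\nkey2 /some/path/b.wav\n",
    multi_columns=True,
)
print(single["key1"])  # a plain string
print(multi["key1"])   # a list with two paths
print(multi["key2"])   # a list with one path
```

In this sketch, every value in the multi-column mapping is a list, even for lines with a single file; whether the real reader does the same is an assumption here.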
get_path(key)
Retrieve the file path associated with a given key.
This method accesses the internal data structure of the SoundScpReader class to return the file path corresponding to the specified key.
- Parameters: key (str) – The key for which to retrieve the associated file path.
- Returns: The file path associated with the key, or a list of file paths if the key corresponds to multiple files.
- Return type: Union[str, List[str]]
- Raises: KeyError – If the key is not found in the data.
######### Examples
>>> reader = SoundScpReader('wav.scp')
>>> path = reader.get_path('key1')
>>> print(path)
/some/path/a.wav
>>> reader_multi = SoundScpReader('multi_wav.scp', multi_columns=True)
>>> paths = reader_multi.get_path('key1')
>>> print(paths)
['/some/path/a.wav', '/some/path/a2.wav']
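Because get_path may return either a string or a list and raises KeyError for unknown keys, calling code often wants a single uniform shape. The sketch below shows one way to do that. `paths_for` and `FakeReader` are hypothetical names introduced for illustration; `FakeReader` only stands in for a reader so the example runs without audio files, and the only method assumed is `get_path(key)` as documented above.

```python
from typing import Dict, List, Union


class FakeReader:
    """Stand-in for a reader object; not part of ESPnet."""

    def __init__(self, data: Dict[str, Union[str, List[str]]]):
        self._data = data

    def get_path(self, key: str) -> Union[str, List[str]]:
        # Raises KeyError for unknown keys, like the documented method.
        return self._data[key]


def paths_for(reader, key: str) -> List[str]:
    """Hypothetical helper: always return a list of paths ([] if key is missing)."""
    try:
        path = reader.get_path(key)
    except KeyError:
        return []
    # Wrap a single path so callers can always iterate.
    return [path] if isinstance(path, str) else list(path)


reader = FakeReader(
    {
        "key1": "/some/path/a.wav",
        "key2": ["/some/path/b.wav", "/some/path/b2.wav"],
    }
)
print(paths_for(reader, "key1"))
print(paths_for(reader, "key2"))
print(paths_for(reader, "missing"))
```

Returning an empty list for a missing key is a design choice for this sketch; code that should fail loudly can let the KeyError propagate instead.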
keys()
Return the keys of the entries in the ‘wav.scp’ file.