espnet2.fileio.sound_scp.SoundScpReader

class espnet2.fileio.sound_scp.SoundScpReader(fname, dtype=None, always_2d: bool = False, multi_columns: bool = False, concat_axis=1)

Bases: Mapping

Reader class for ‘wav.scp’.

This class reads a ‘wav.scp’ file which contains mappings of keys to audio file paths. It can handle both single and multi-column entries for audio files. The multi-column option allows for concatenating multiple audio files associated with a single key.

fname

The path to the ‘wav.scp’ file.

Type: str

dtype

Data type for the audio data.

Type: optional

always_2d

If True, ensures the output is always 2-dimensional.

Type: bool

multi_columns

If True, enables reading of multi-column entries.

Type: bool

concat_axis

Axis along which to concatenate audio data.

Type: int

Parameters:
- fname (str) – The path to the ‘wav.scp’ file.
- dtype (optional) – Data type for audio data (default: None).
- always_2d (bool) – Ensure output is always 2D (default: False).
- multi_columns (bool) – Enable reading of multi-column entries (default: False).
- concat_axis (int) – Axis for concatenation of audio data (default: 1).

######### Examples

wav.scp is a text file that looks like the following:

key1 /some/path/a.wav
key2 /some/path/b.wav
key3 /some/path/c.wav
key4 /some/path/d.wav

>>> reader = SoundScpReader('wav.scp')
>>> rate, array = reader['key1']

If multi_columns=True is given and a line lists multiple files separated by spaces, the output array is concatenated along the channel direction:

key1 /some/path/a.wav /some/path/a2.wav
key2 /some/path/b.wav /some/path/b2.wav

>>> reader = SoundScpReader('wav.scp', multi_columns=True)
>>> rate, array = reader['key1']

In the above case, a.wav and a2.wav are concatenated.
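What "concatenated along the channel direction" means can be sketched with NumPy alone. The example assumes arrays shaped (time, channel), which is an assumption about the layout rather than something stated above, and it uses synthetic data in place of a.wav and a2.wav. It is not ESPnet's loading code.

```python
import numpy as np

# Synthetic stand-ins for two mono recordings of equal length,
# each shaped (time, channel) = (4, 1). This layout is an assumption.
a = np.arange(4, dtype=np.float32).reshape(-1, 1)
a2 = (10 + np.arange(4, dtype=np.float32)).reshape(-1, 1)

# Concatenating along axis 1 stacks the files side by side as channels.
stacked = np.concatenate([a, a2], axis=1)
print(stacked.shape)  # (4, 2)

# Concatenating along axis 0 would instead join them end to end in time.
joined = np.concatenate([a, a2], axis=0)
print(joined.shape)  # (8, 1)
```

Under this assumed layout, the default concat_axis=1 corresponds to the first case, where each input file becomes one channel of the result.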

Note that with multi_columns=True, SoundScpReader still supports a normal wav.scp, i.e., one wav file per line. The option is disabled by default because it requires keeping a dict[str, list[str]] object in memory, which increases memory usage.
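To make the two file layouts concrete, the following is a minimal sketch of how such a file could be parsed into a plain dictionary. It is not ESPnet's implementation: the helper name parse_wav_scp is hypothetical, and only the standard library is used. It also shows the difference between a table of single paths and a table of path lists.

```python
from typing import Dict, List, Union


def parse_wav_scp(
    text: str, multi_columns: bool = False
) -> Dict[str, Union[str, List[str]]]:
    """Parse wav.scp-style text into a key -> path(s) table (illustrative only)."""
    table: Dict[str, Union[str, List[str]]] = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected 'key path', got {line!r}")
        key, rest = parts
        if key in table:
            raise ValueError(f"line {lineno}: duplicated key {key!r}")
        # Single-column mode keeps the rest of the line as one path;
        # multi-column mode splits it on whitespace into a list of paths.
        table[key] = rest.split() if multi_columns else rest
    return table


single = parse_wav_scp("key1 /some/path/a.wav\nkey2 /some/path/b.wav\n")
multi = parse_wav_scp(
    "key1 /some/path/a.wav /some/path/a2.wav\nkey2 /some/path/b.wav\n",
    multi_columns=True,
)
print(single["key1"])  # /some/path/a.wav
print(multi["key1"])   # ['/some/path/a.wav', '/some/path/a2.wav']
print(multi["key2"])   # ['/some/path/b.wav']
```

In this sketch the multi-column table stores a list for every key, even when a line has only one path, which is the extra per-entry overhead the note above refers to.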

get_path(key)

Retrieve the file path associated with a given key.

This method accesses the internal data structure of the SoundScpReader class to return the file path corresponding to the specified key.

Parameters: key (str) – The key for which to retrieve the associated file path.
Returns: The file path associated with the key, or a list of file paths if the key corresponds to multiple files.
Return type: Union[str, List[str]]
Raises: KeyError – If the key is not found in the data.

######### Examples

>>> reader = SoundScpReader('wav.scp')
>>> path = reader.get_path('key1')
>>> print(path)
/some/path/a.wav

>>> reader_multi = SoundScpReader('multi_wav.scp', multi_columns=True)
>>> paths = reader_multi.get_path('key1')
>>> print(paths)
['/some/path/a.wav', '/some/path/a2.wav']
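Because get_path may return either a single string or a list of strings, calling code may want to normalize the result before use. A minimal sketch follows; as_path_list is a hypothetical helper and not part of ESPnet.

```python
from typing import List, Union


def as_path_list(path_or_paths: Union[str, List[str]]) -> List[str]:
    """Return a list of paths whether one path or several were given."""
    if isinstance(path_or_paths, str):
        return [path_or_paths]
    return list(path_or_paths)


print(as_path_list("/some/path/a.wav"))
# ['/some/path/a.wav']
print(as_path_list(["/some/path/a.wav", "/some/path/a2.wav"]))
# ['/some/path/a.wav', '/some/path/a2.wav']
```

The result of reader.get_path(key) could be passed through such a helper so that downstream code handles only lists.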

keys()

Return the keys defined in the ‘wav.scp’ file.

Each key is the first column of a line in the file. Because SoundScpReader is a Mapping, the keys can be iterated over or used to index the reader.
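Since the class derives from Mapping, the usual read-only dictionary protocol applies, including keys(). The sketch below uses a hypothetical stand-in class, PathTable, rather than SoundScpReader itself, to show which operations a Mapping subclass gains once it defines __getitem__, __iter__, and __len__. It makes no claim about how ESPnet implements these methods.

```python
from collections.abc import Mapping


class PathTable(Mapping):
    """Hypothetical stand-in for a read-only key -> path table."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


table = PathTable({"key1": "/some/path/a.wav", "key2": "/some/path/b.wav"})
print(list(table.keys()))  # ['key1', 'key2']
print("key1" in table)     # True
print(len(table))          # 2
```

The same pattern of iterating over keys() and indexing by key would apply to a reader object, where indexing returns the loaded audio instead of a path.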