espnet2.fileio.vad_scp.VADScpReader
class espnet2.fileio.vad_scp.VADScpReader(fname, dtype=<class 'numpy.float32'>)
Bases: Mapping
Reader class for ‘vad.scp’.
This class provides functionality to read a ‘vad.scp’ file, which focuses on utterance-level voice activity detection (VAD) segments. Unlike the segments file, which encompasses entire sessions, the vad.scp file is designed to guide silence trimming for UASR (Unsupervised Automatic Speech Recognition).
fname
The file name of the ‘vad.scp’ file to read.
- Type: str
dtype
The data type for the VAD segments, defaulting to np.float32.
- Type: numpy.dtype
data
A dictionary mapping keys to VAD segments.
- Type: dict
Parameters:
- fname (str) – Path to the ‘vad.scp’ file.
- dtype (numpy.dtype , optional) – Data type for VAD segments.
Returns: A mapping of keys to their respective VAD segments.
Return type: dict
####### Examples
>>> reader = VADScpReader('vad.scp')
>>> array = reader['key1']
# array will contain the VAD segments for 'key1' as a list of tuples.
Raises:
- KeyError – If the key is not found in the ‘vad.scp’ file.
- ValueError – If the VAD segment format is invalid.
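The class description says the VAD segments guide silence trimming. A minimal sketch of what that could look like is shown here. It is not ESPnet's implementation: the helper names `intervals_to_sample_ranges` and `trim_silence` are hypothetical, and the sketch assumes the interval bounds are in seconds, which this page does not state.

```python
from typing import List, Sequence, Tuple


def intervals_to_sample_ranges(
    intervals: Sequence[Tuple[float, float]],
    sample_rate: int,
    num_samples: int,
) -> List[Tuple[int, int]]:
    """Convert (start, end) intervals to clipped sample index ranges.

    Assumption: interval bounds are expressed in seconds.
    """
    ranges = []
    for start_sec, end_sec in intervals:
        start = max(0, int(round(start_sec * sample_rate)))
        end = min(num_samples, int(round(end_sec * sample_rate)))
        if end > start:  # skip empty or fully out-of-range intervals
            ranges.append((start, end))
    return ranges


def trim_silence(
    samples: Sequence[float],
    intervals: Sequence[Tuple[float, float]],
    sample_rate: int,
) -> list:
    """Keep only the samples that fall inside the given intervals."""
    kept = []
    for start, end in intervals_to_sample_ranges(
        intervals, sample_rate, len(samples)
    ):
        kept.extend(samples[start:end])
    return kept
```

With the reader from the example above, the intervals returned by `reader['key1']` could be passed as `intervals`, together with the matching waveform and its sample rate.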
keys()
Reader class for ‘vad.scp’.
This class reads a VAD (Voice Activity Detection) script file, which is used to guide the silence trimming for UASR (Unsupervised Automatic Speech Recognition). Unlike the segments, which focus on whole sessions, the vad.scp file focuses on utterance-level information.
fname
The filename of the VAD script file.
- Type: str
dtype
The data type for the VAD values.
- Type: numpy.dtype
data
A dictionary containing the VAD data parsed from the file.
- Type: dict
Parameters:
- fname (str) – Path to the ‘vad.scp’ file to read.
- dtype (numpy.dtype , optional) – The data type for VAD values. Defaults to np.float32.
Returns: A list of tuples representing the VAD intervals for a given key.
Return type: List[Tuple[float, float]]
####### Examples
key1 0:1.2000
key2 3.0000:4.5000 7.0000:9.0000
…
>>> reader = VADScpReader('vad.scp')
>>> array = reader['key1']
Raises: KeyError – If the specified key does not exist in the VAD data.
NOTE
The vad.scp file format expects each line to contain a key followed by one or more time intervals, separated by spaces. Each interval is represented as start:end.
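The line format described in the note can be illustrated with a small standalone parser. This is a sketch under the format stated above, not the code ESPnet uses internally. The function names `parse_vad_line` and `parse_vad_text` are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]


def parse_vad_line(line: str) -> Tuple[str, List[Interval]]:
    """Split one vad.scp line into its key and (start, end) intervals."""
    key, *fields = line.split()
    if not fields:
        raise ValueError(f"no intervals found for key {key!r}")
    intervals = []
    for field in fields:
        # Each field must look like "start:end".
        start, sep, end = field.partition(":")
        if not sep:
            raise ValueError(f"expected 'start:end', got {field!r}")
        # float() raises ValueError on malformed numbers.
        intervals.append((float(start), float(end)))
    return key, intervals


def parse_vad_text(text: str) -> Dict[str, List[Interval]]:
    """Parse several vad.scp lines, skipping blank ones."""
    result = {}
    for line in text.splitlines():
        if line.strip():
            key, intervals = parse_vad_line(line)
            result[key] = intervals
    return result
```

For example, `parse_vad_line("key2 3.0000:4.5000 7.0000:9.0000")` yields the key `"key2"` with two intervals, matching the `List[Tuple[float, float]]` return type documented above.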