espnet2.fileio.rttm.load_rttm_text
espnet2.fileio.rttm.load_rttm_text(path: Path | str) → Dict[str, List[Tuple[str, float, float]]]
Read an RTTM (Rich Transcription Time Marked) file and extract speaker information.
This function reads an RTTM file and organizes the speaker annotations into a dictionary. The dictionary maps each utterance ID to a list of tuples, each holding a speaker ID and the corresponding start and end times.
Note: This function currently only supports speaker information.
- Parameters: path (Union[Path, str]) – The path to the RTTM file to read.
- Returns: A dictionary where the keys are utterance IDs and the values are lists of tuples. Each tuple contains the speaker ID, start time, and end time of one speaker segment in the utterance.
- Return type: Dict[str, List[Tuple[str, float, float]]]
- Raises:
- AssertionError – If a line in the RTTM file does not contain exactly 9 fields, or if the label type is not "SPEAKER" or "END".
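The checks listed under Raises can be illustrated with a minimal sketch. This is not the ESPnet implementation: `parse_rttm_lines` is a hypothetical helper, the sample lines are made up, and the field order (label type, utterance ID, channel, start, end, two placeholders, speaker ID, placeholder) and the skipping of END records are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def parse_rttm_lines(
    lines: Iterable[str],
) -> Dict[str, List[Tuple[str, float, float]]]:
    """Hypothetical helper (not part of ESPnet) that groups segments by utterance."""
    segments = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Each record must have exactly 9 whitespace-separated fields.
        assert len(fields) == 9, f"expected 9 fields, got {len(fields)}"
        # Assumed field order; the documentation above does not spell it out.
        label_type, utt_id, _channel, start, end, _, _, spk_id, _ = fields
        # Only speaker-related label types are accepted.
        assert label_type in ("SPEAKER", "END"), f"unsupported label: {label_type}"
        if label_type == "END":
            # Assumption: END records carry no speaker segment, so skip them.
            continue
        segments[utt_id].append((spk_id, float(start), float(end)))
    return dict(segments)


# Made-up sample records, each with 9 fields.
sample = [
    "SPEAKER file1 1 0 1023 <NA> <NA> spk1 <NA>",
    "SPEAKER file1 1 500 4023 <NA> <NA> spk1 <NA>",
    "END file1 1 0 4023 <NA> <NA> <NA> <NA>",
]
parsed = parse_rttm_lines(sample)
print(parsed)
```

A line with a different field count, or with a label type other than "SPEAKER" or "END", makes the sketch fail with an AssertionError, mirroring the behavior described above.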
Examples
>>> rttm_data = load_rttm_text('path/to/rttm/file.rttm')
>>> print(rttm_data)
{
    'file1': [
        ('spk1', 0, 1023),
        ('spk2', 4000, 3023),
        ('spk1', 500, 4023)
    ]
}
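As a rough illustration of consuming the returned structure, the sketch below lists the distinct speakers of each utterance in order of first appearance. It assumes the documented return shape and uses the example output above as a literal instead of reading a file. `speakers_per_utterance` is a hypothetical helper, not an ESPnet function.

```python
from typing import Dict, List, Tuple

# Literal copy of the example output above, standing in for load_rttm_text().
rttm_data: Dict[str, List[Tuple[str, float, float]]] = {
    "file1": [("spk1", 0, 1023), ("spk2", 4000, 3023), ("spk1", 500, 4023)],
}


def speakers_per_utterance(
    data: Dict[str, List[Tuple[str, float, float]]],
) -> Dict[str, List[str]]:
    """Hypothetical helper: distinct speaker IDs per utterance, in first-seen order."""
    result: Dict[str, List[str]] = {}
    for utt_id, segments in data.items():
        seen: List[str] = []
        for spk_id, _start, _end in segments:
            if spk_id not in seen:
                seen.append(spk_id)
        result[utt_id] = seen
    return result


speakers = speakers_per_utterance(rttm_data)
print(speakers)  # {'file1': ['spk1', 'spk2']}
```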