espnet2.fileio.rttm.load_rttm_text

espnet2.fileio.rttm.load_rttm_text(path: Path | str) → Dict[str, List[Tuple[str, float, float]]]

Read an RTTM (Rich Transcription Time Marked) file and extract speaker information.

This function reads an RTTM file and organizes its speaker annotations into a dictionary. The dictionary maps each utterance ID to a list of tuples, each holding a speaker ID with the corresponding start and end times.

Note: This function currently only supports speaker information.
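
The behavior described above can be sketched without ESPnet installed. This is a minimal illustration, not the library's implementation. `parse_rttm_lines` is a hypothetical helper and the sample lines are made up. The field order (label type, utterance ID, channel, start, end, two unused fields, speaker ID, one unused field) and the choice to skip "END" lines are assumptions inferred from the 9-field layout and the return type documented on this page.

```python
import re
from typing import Dict, Iterable, List, Tuple


def parse_rttm_lines(lines: Iterable[str]) -> Dict[str, List[Tuple[str, float, float]]]:
    """Group speaker segments by utterance ID (hypothetical helper, not ESPnet code)."""
    data: Dict[str, List[Tuple[str, float, float]]] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        fields = re.split(r"\s+", line)
        # Documented contract: exactly 9 whitespace-separated fields per line.
        assert len(fields) == 9, f"expected 9 fields, got {len(fields)}: {line!r}"
        # Assumed field order; the sixth, seventh and ninth fields are unused here.
        label_type, utt_id, _channel, start, end, _, _, spk_id, _ = fields
        # Documented contract: only SPEAKER and END labels are accepted.
        assert label_type in ("SPEAKER", "END"), f"unsupported label: {label_type}"
        if label_type == "END":
            continue  # assumption: END lines carry no speaker segment
        data.setdefault(utt_id, []).append((spk_id, float(start), float(end)))
    return data


# Made-up sample lines in the assumed layout.
sample = [
    "SPEAKER file1 1 0 1023 <NA> <NA> spk1 <NA>",
    "SPEAKER file1 1 500 4023 <NA> <NA> spk2 <NA>",
    "END file1 <NA> <NA> 4023 <NA> <NA> <NA> <NA>",
]
segments = parse_rttm_lines(sample)
print(segments)
```

Using `setdefault` keeps segments in file order for each utterance, so the result has the same list-of-tuples shape as the documented return type.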

Parameters: path (Union[Path, str]) – The path to the RTTM file to read.
Returns: A dictionary whose keys are utterance IDs and whose values are lists of tuples. Each tuple contains the speaker ID, start time, and end time of one speaker segment in the utterance.
Return type: Dict[str, List[Tuple[str, float, float]]]
Raises: AssertionError – If a line in the RTTM file does not contain exactly 9 fields, or if the label type is not "SPEAKER" or "END".

Examples

>>> rttm_data = load_rttm_text('path/to/rttm/file.rttm')
>>> print(rttm_data)
{
    'file1': [
        ('spk1', 0, 1023),
        ('spk2', 4000, 3023),
        ('spk1', 500, 4023)
    ]
}
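
The returned dictionary can be processed like any nested Python structure. As a hedged illustration, the sketch below sums the segment lengths per speaker. `speaking_time` is a hypothetical helper, the input is hand-written sample data in the documented return shape rather than real library output, and treating `end - start` as a duration assumes both values use the same time unit.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Segments = Dict[str, List[Tuple[str, float, float]]]


def speaking_time(rttm_data: Segments) -> Dict[str, Dict[str, float]]:
    """Sum (end - start) per speaker within each utterance (hypothetical helper)."""
    totals: Dict[str, Dict[str, float]] = {}
    for utt_id, segments in rttm_data.items():
        per_speaker = defaultdict(float)
        for spk_id, start, end in segments:
            per_speaker[spk_id] += end - start
        totals[utt_id] = dict(per_speaker)
    return totals


# Hand-written sample in the documented return shape, not real library output.
rttm_data = {
    "file1": [
        ("spk1", 0.0, 1023.0),
        ("spk2", 1500.0, 3023.0),
        ("spk1", 3500.0, 4023.0),
    ]
}
totals = speaking_time(rttm_data)
print(totals)
```

Overlapping segments are counted separately, so the per-speaker totals can add up to more than the utterance length.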