espnet2.speechlm.dataloader.multimodal_loader.audio_loader.LhotseAudioReader


Bases: object

Dict-like lazy audio reader using Lhotse manifests.

This reader supports both single-channel and multi-channel audio data. The output shape is consistent regardless of the input type: the reader always returns a 2D array with shape [num_channels, num_samples], so single-channel audio has num_channels = 1.
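
The shape contract described above can be illustrated with a small NumPy sketch. The helper name `to_channels_first_2d` is hypothetical and is not part of ESPnet or Lhotse; it only shows what "always 2D, [num_channels, num_samples]" means for mono and multi-channel input, assuming multi-channel input is already channels-first.

```python
import numpy as np


def to_channels_first_2d(audio: np.ndarray) -> np.ndarray:
    """Hypothetical helper: return audio as [num_channels, num_samples]."""
    audio = np.asarray(audio)
    if audio.ndim == 1:
        # Mono signal: add a leading channel axis of size 1.
        return audio[np.newaxis, :]
    if audio.ndim == 2:
        # Assumed to already be [num_channels, num_samples].
        return audio
    raise ValueError(f"Expected 1D or 2D audio, got {audio.ndim}D")


mono = np.zeros(16000, dtype=np.float32)
stereo = np.zeros((2, 16000), dtype=np.float32)

print(to_channels_first_2d(mono).shape)    # (1, 16000)
print(to_channels_first_2d(stereo).shape)  # (2, 16000)
```

Downstream code can then index the channel axis unconditionally instead of branching on the number of dimensions.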

Parameters:
- manifest_dir – Directory containing the Lhotse manifest files (recordings.jsonl.gz and, optionally, cuts.jsonl.gz)
- valid_ids – Optional list of IDs to keep; if None, all IDs are kept

items()

Return iterator over (id, item) pairs.

keys()

Return iterator over IDs.

values()

Return iterator over items.
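
The dict-like, lazy behaviour and the valid_ids filter can be sketched with a stand-in class built on `collections.abc.Mapping`. Everything here is an assumption made for illustration: `LazyReader`, `fake_load`, and the file names are hypothetical, and the real `LhotseAudioReader` reads Lhotse manifests rather than a plain dict. The sketch only shows the access pattern: IDs are filtered up front, and items are loaded when they are accessed.

```python
from collections.abc import Mapping
from typing import Callable, Dict, Iterable, Iterator, List, Optional


class LazyReader(Mapping):
    """Hypothetical stand-in: maps IDs to items that are loaded on access."""

    def __init__(
        self,
        sources: Dict[str, str],
        load_fn: Callable[[str], List[float]],
        valid_ids: Optional[Iterable[str]] = None,
    ) -> None:
        if valid_ids is not None:
            # Keep only the requested IDs; None keeps everything.
            keep = set(valid_ids)
            sources = {k: v for k, v in sources.items() if k in keep}
        self._sources = sources
        self._load_fn = load_fn

    def __getitem__(self, key: str) -> List[float]:
        # Loading happens here, not in __init__.
        return self._load_fn(self._sources[key])

    def __iter__(self) -> Iterator[str]:
        return iter(self._sources)

    def __len__(self) -> int:
        return len(self._sources)


loads: List[str] = []  # records which paths were actually loaded


def fake_load(path: str) -> List[float]:
    loads.append(path)
    return [0.0, 0.0]


reader = LazyReader(
    {"utt1": "a.wav", "utt2": "b.wav", "utt3": "c.wav"},
    fake_load,
    valid_ids=["utt1", "utt3"],
)

print(list(reader.keys()))  # ['utt1', 'utt3']
print(loads)                # [] (listing IDs loads nothing)

for utt_id, item in reader.items():
    print(utt_id, len(item))
```

Because `Mapping` derives `keys()`, `values()`, and `items()` from `__getitem__`, `__iter__`, and `__len__`, the three methods listed above come for free in this sketch, and `values()` and `items()` trigger a load for each entry as it is iterated.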