espnet2.samplers.unsorted_batch_sampler.UnsortedBatchSampler
class espnet2.samplers.unsorted_batch_sampler.UnsortedBatchSampler(batch_size: int, key_file: str, drop_last: bool = False, utt2category_file: str | None = None)
Bases: AbsSampler
UnsortedBatchSampler is a BatchSampler that generates batches of a constant size without performing any sorting. It is particularly useful in decoding mode or for tasks that do not involve sequence-to-sequence learning, such as classification.
This class does not require length information as it directly uses the keys from the provided key file to create batches.
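As an illustration of batching without length information, the following is a minimal pure-Python sketch that chunks keys in file order. The helper name `chunk_keys` is hypothetical and not part of ESPnet, and the way the real sampler handles a leftover partial batch when `drop_last` is False may differ from this simplified version.

```python
from typing import List, Sequence, Tuple


def chunk_keys(
    keys: Sequence[str], batch_size: int, drop_last: bool = False
) -> List[Tuple[str, ...]]:
    """Split keys into consecutive batches, keeping their original order.

    Hypothetical helper for illustration only; not ESPnet's implementation.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be greater than 0")

    # Slice the key list at fixed strides; no sorting and no length lookup.
    batches = [
        tuple(keys[start : start + batch_size])
        for start in range(0, len(keys), batch_size)
    ]

    # Optionally discard a trailing batch that is smaller than batch_size.
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


example_keys = ["utt1", "utt2", "utt3", "utt4", "utt5"]
all_batches = chunk_keys(example_keys, batch_size=2)
full_batches = chunk_keys(example_keys, batch_size=2, drop_last=True)
```

In this sketch, `all_batches` keeps the trailing single-key batch, while `full_batches` contains only the complete batches of two keys.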
batch_size
The size of each batch.
- Type: int
key_file
The path to the key file listing the utterance keys.
- Type: str
drop_last
Whether to drop the last incomplete batch.
- Type: bool
batch_list
A list of batches created from the key file.
- Type: list
Parameters:
- batch_size (int) – The size of each batch. Must be greater than 0.
- key_file (str) – The path to the key file.
- drop_last (bool , optional) – If True, drop the last incomplete batch. Defaults to False.
- utt2category_file (str , optional) – An optional file mapping utterances to categories. If provided, its keys must match the keys in the key file.
Raises: RuntimeError – If the key file is empty, or if the keys in the key file and the keys in utt2category_file do not match.
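The two error conditions described above can be sketched in plain Python as follows. This is an assumption-laden illustration, not ESPnet's code: the helpers `read_keyed_lines` and `load_keys` are hypothetical, and the file layout (first whitespace-separated token is the utterance key) is assumed for the example.

```python
import os
import tempfile
from typing import Dict, List, Optional


def read_keyed_lines(path: str) -> Dict[str, str]:
    """Map the first token of each non-empty line to the rest of the line."""
    table: Dict[str, str] = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split(maxsplit=1)
            if not parts:
                continue  # skip blank lines
            table[parts[0]] = parts[1] if len(parts) > 1 else ""
    return table


def load_keys(key_file: str, utt2category_file: Optional[str] = None) -> List[str]:
    """Return keys in file order, raising RuntimeError on the documented failures."""
    keys = list(read_keyed_lines(key_file))
    if not keys:
        raise RuntimeError(f"{key_file} is empty")
    if utt2category_file is not None:
        categories = read_keyed_lines(utt2category_file)
        if set(categories) != set(keys):
            raise RuntimeError(
                f"keys are mismatched between {key_file} and {utt2category_file}"
            )
    return keys


# Demonstration with temporary files (file names are illustrative).
with tempfile.TemporaryDirectory() as tmp:
    key_path = os.path.join(tmp, "keys.txt")
    cat_path = os.path.join(tmp, "utt2category")
    with open(key_path, "w", encoding="utf-8") as f:
        f.write("utt1\nutt2\nutt3\n")
    with open(cat_path, "w", encoding="utf-8") as f:
        f.write("utt1 speech\nutt2 music\n")  # utt3 is missing on purpose

    keys = load_keys(key_path)
    try:
        load_keys(key_path, cat_path)
        mismatch_detected = False
    except RuntimeError:
        mismatch_detected = True
```

Here the category file omits `utt3`, so the second call raises `RuntimeError` and `mismatch_detected` ends up True.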
Examples
>>> from espnet2.samplers.unsorted_batch_sampler import UnsortedBatchSampler
>>> sampler = UnsortedBatchSampler(batch_size=2, key_file='keys.txt')
>>> for batch in sampler:
...     print(batch)
NOTE
Each line of the key file should contain one utterance key. If a category file is provided, it must contain the same keys as the key file so that every utterance can be mapped to a category.