espnet2.asr_transducer.utils.make_chunk_mask
espnet2.asr_transducer.utils.make_chunk_mask(size: int, chunk_size: int, num_left_chunks: int = 0, device: device | None = None) → Tensor
Create a chunk mask for the subsequent steps.
This function generates a boolean mask tensor indicating which frames can be attended to based on the chunking strategy defined by the chunk_size and num_left_chunks. The resulting mask tensor has a shape of (size, size), where size is the total number of frames.
Reference: https://github.com/k2-fsa/icefall/blob/master/icefall/utils.py
- Parameters:
- size – Size of the source mask.
- chunk_size – Number of frames in each chunk.
- num_left_chunks – Number of left chunks that the attention module can see. A value of zero or less means full left context.
- device – The device for the mask tensor (e.g., 'cpu' or 'cuda').
- Returns: A boolean tensor of shape (size, size) representing the chunk mask, where True indicates frames that can be attended to and False indicates masked frames.
- Return type: Tensor
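The chunking rule described by these parameters can be sketched in plain Python. This is an illustrative reimplementation under the conventions stated on this page (True marks a frame that can be attended to), not the ESPnet source. The name `chunk_mask_sketch` is hypothetical, and the function returns nested lists rather than a tensor so that it runs without dependencies. Verify the boolean polarity against your installed ESPnet version before relying on it.

```python
def chunk_mask_sketch(size, chunk_size, num_left_chunks=0):
    """Illustrative chunk mask as nested lists.

    Assumed convention (taken from the text above): True = may attend.
    """
    mask = []
    for i in range(size):
        chunk_idx = i // chunk_size  # chunk that query frame i belongs to
        if num_left_chunks <= 0:
            start = 0  # zero or negative: no limit on left context
        else:
            start = max((chunk_idx - num_left_chunks) * chunk_size, 0)
        # A frame sees up to the end of its own chunk, never beyond `size`.
        end = min((chunk_idx + 1) * chunk_size, size)
        mask.append([start <= j < end for j in range(size)])
    return mask


rows = chunk_mask_sketch(size=10, chunk_size=3, num_left_chunks=1)
for row in rows:
    print("".join("T" if allowed else "." for allowed in row))
# TTT.......   (frames 0-2: own chunk only, no chunk to the left)
# TTTTTT....   (frames 3-5: own chunk plus one left chunk)
# ...TTTTTT.   (frames 6-8)
# ......TTTT   (frame 9: last, partial chunk plus one left chunk)
```

Frames in the same chunk share an identical row, so the printed pattern repeats in blocks of `chunk_size` rows.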
Examples
>>> from espnet2.asr_transducer.utils import make_chunk_mask
>>> mask = make_chunk_mask(size=10, chunk_size=3, num_left_chunks=1)
>>> mask.shape
torch.Size([10, 10])
>>> mask.dtype
torch.bool
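Whether one frame can see another depends only on the chunk each frame falls in, so the effect of num_left_chunks can be checked with a small predicate. The sketch below is a hypothetical helper, not part of ESPnet: `can_attend` and `visible_counts` are made-up names, and the treatment of zero or negative num_left_chunks as unlimited left context is an assumption taken from the parameter description. It counts how many frames each query frame can see with a limited and an unlimited left context.

```python
def can_attend(i, j, chunk_size, num_left_chunks=0):
    """Hypothetical predicate: may query frame i attend to key frame j?"""
    query_chunk, key_chunk = i // chunk_size, j // chunk_size
    if key_chunk > query_chunk:
        return False  # no look-ahead past the query frame's own chunk
    if num_left_chunks <= 0:
        return True  # assumed: zero or negative means unlimited left context
    return key_chunk >= query_chunk - num_left_chunks


def visible_counts(size, chunk_size, num_left_chunks):
    """Number of attendable frames for each query frame."""
    return [
        sum(can_attend(i, j, chunk_size, num_left_chunks) for j in range(size))
        for i in range(size)
    ]


limited = visible_counts(10, 3, num_left_chunks=1)
full_left = visible_counts(10, 3, num_left_chunks=0)
print(limited)    # [3, 3, 3, 6, 6, 6, 6, 6, 6, 4]
print(full_left)  # [3, 3, 3, 6, 6, 6, 9, 9, 9, 10]
```

With one left chunk the visible window stops growing at two chunks, while with unlimited left context it grows by one chunk at every chunk boundary.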