espnet2.speechlm.net_utils.length_mask

espnet2.speechlm.net_utils.length_mask(lengths: Tensor, maxlen: int | None = None) → Tensor

Creates a length mask tensor based on input lengths.

The length mask is a binary tensor that indicates which positions in a sequence are valid based on the given lengths. This is useful for masking out padding tokens in batch processing of variable-length sequences.

Parameters:
- lengths (torch.Tensor) – A 1D tensor containing the lengths of each sequence in the batch.
- maxlen (int, optional) – The maximum sequence length, which sets the width of the mask. If not provided, it defaults to the maximum value in lengths.
Returns: A 2D binary tensor of shape (num_sequences, maxlen), where each row contains 1s for valid positions and 0s for positions beyond the length of the corresponding sequence.
Return type: torch.Tensor

Examples

>>> import torch
>>> from espnet2.speechlm.net_utils import length_mask
>>> lengths = torch.tensor([3, 5, 2])
>>> mask = length_mask(lengths)
>>> print(mask)
tensor([[1, 1, 1, 0, 0],
        [1, 1, 1, 1, 1],
        [1, 1, 0, 0, 0]])
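
A typical use of such a mask is to exclude padding when pooling over time. The sketch below is illustrative only: the `features` tensor and its shape are made up for the example, and the mask is written out by hand to match the output shown above, so the snippet runs without ESPnet installed.

```python
import torch

# Hypothetical padded batch: 3 sequences padded to length 5, feature dim 2.
lengths = torch.tensor([3, 5, 2])
features = torch.ones(3, 5, 2)

# Mask copied by hand from the example above (1 = valid, 0 = padding).
mask = torch.tensor([[1, 1, 1, 0, 0],
                     [1, 1, 1, 1, 1],
                     [1, 1, 0, 0, 0]])

# Add a trailing axis so the mask broadcasts over the feature dimension,
# then zero out the padded positions.
masked = features * mask.unsqueeze(-1)

# Average over valid positions only by dividing by the true lengths.
mean = masked.sum(dim=1) / lengths.unsqueeze(-1)
print(mean.shape)
```

Dividing by `lengths` rather than by the padded width keeps short sequences from being diluted by their padding.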

NOTE

The output mask is generated using broadcasting to efficiently create the desired shape.
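
One way the broadcasting described in this note could work is sketched below. This is not the ESPnet source: `length_mask_sketch` is a hypothetical stand-in, and the integer output dtype is an assumption based on the printed example.

```python
import torch


def length_mask_sketch(lengths, maxlen=None):
    """Hypothetical stand-in for length_mask; not the ESPnet implementation."""
    # Mirror the documented contract: lengths must be 1D.
    assert lengths.dim() == 1
    if maxlen is None:
        maxlen = int(lengths.max().item())

    # Position indices 0..maxlen-1 as a row vector of shape (1, maxlen).
    positions = torch.arange(maxlen, device=lengths.device).unsqueeze(0)

    # Comparing against lengths of shape (num_sequences, 1) broadcasts
    # to a boolean tensor of shape (num_sequences, maxlen).
    mask = positions < lengths.unsqueeze(1)

    # Assumption: return integers so the result prints as 1s and 0s.
    return mask.long()


mask = length_mask_sketch(torch.tensor([3, 5, 2]))
wider = length_mask_sketch(torch.tensor([3, 5, 2]), maxlen=6)
print(mask)
print(wider.shape)
```

The comparison never materializes a per-sequence loop; a single `(1, maxlen)` index row is compared against every length at once. Passing a `maxlen` larger than the longest sequence simply adds all-zero columns on the right.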

Raises: AssertionError – If lengths is not a 1D tensor.