espnet2.speechlm.net_utils.causal_mask

espnet2.speechlm.net_utils.causal_mask(qlen: int, device: device) → Tensor

Generates a causal mask tensor for attention mechanisms.

A causal mask is a square matrix that restricts each position in a sequence to attending only to itself and to earlier positions. This is particularly useful in autoregressive models, where the prediction of a token must not depend on future tokens.

Parameters:
- qlen (int) – The length of the sequence for which to create the mask.
- device (torch.device) – The device (CPU or GPU) on which the mask will be allocated.
Returns: A tensor of shape (1, qlen, qlen) representing the causal mask, where the lower triangular part is filled with ones and the upper triangular part is filled with zeros.
Return type: torch.Tensor

Examples

>>> import torch
>>> from espnet2.speechlm.net_utils import causal_mask
>>> mask = causal_mask(5, torch.device('cpu'))
>>> print(mask)
tensor([[[1., 0., 0., 0., 0.],
         [1., 1., 0., 0., 0.],
         [1., 1., 1., 0., 0.],
         [1., 1., 1., 1., 0.],
         [1., 1., 1., 1., 1.]]])
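
The following is a minimal sketch of how a mask of this shape could be built and applied to attention scores. It is not taken from the ESPnet source: `causal_mask_sketch` and `masked_attention_weights` are hypothetical names introduced for illustration, and the use of `torch.tril` and `masked_fill` is one assumed way to get the behaviour described above.

```python
import torch


def causal_mask_sketch(qlen: int, device: torch.device) -> torch.Tensor:
    # Hypothetical stand-in for causal_mask: ones on and below the
    # diagonal, zeros above it, with a leading batch dimension of 1.
    return torch.tril(torch.ones(qlen, qlen, device=device)).unsqueeze(0)


def masked_attention_weights(scores: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    # scores: (batch, qlen, qlen) raw attention logits.
    # Positions where the mask is 0 (future tokens) are set to -inf so
    # that softmax assigns them zero weight.
    scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1)


mask = causal_mask_sketch(4, torch.device("cpu"))
scores = torch.zeros(1, 4, 4)  # uniform logits, for illustration only
weights = masked_attention_weights(scores, mask)
```

With uniform logits, each row of `weights` spreads its probability evenly over the current and earlier positions and assigns zero to every later position. The mask broadcasts over the batch dimension because its leading dimension is 1.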