espnet2.speechlm.net_utils.causal_mask
espnet2.speechlm.net_utils.causal_mask(qlen: int, device: torch.device) → torch.Tensor
Generates a causal mask tensor for attention mechanisms.
A causal mask is a square matrix that restricts each position in a sequence to attend only to itself and to earlier positions. Autoregressive models need this so that the prediction for a token cannot depend on future tokens.
- Parameters:
- qlen (int) – The length of the sequence for which to create the mask.
- device (torch.device) – The device (CPU or GPU) on which the mask will be allocated.
- Returns: A tensor of shape (1, qlen, qlen) representing the causal mask. The lower triangle, including the main diagonal, is filled with ones, and everything above the diagonal is filled with zeros.
- Return type: torch.Tensor
Examples
>>> import torch
>>> from espnet2.speechlm.net_utils import causal_mask
>>> mask = causal_mask(5, torch.device('cpu'))
>>> print(mask)
tensor([[[1., 0., 0., 0., 0.],
         [1., 1., 0., 0., 0.],
         [1., 1., 1., 0., 0.],
         [1., 1., 1., 1., 0.],
         [1., 1., 1., 1., 1.]]])
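The following is a minimal sketch of how a mask with this layout could be built and applied to attention scores. It is not the ESPnet source: `causal_mask_sketch` and `masked_attention_weights` are hypothetical names introduced for illustration, and the use of `masked_fill` followed by `softmax` is an assumption about how a caller might consume the mask.

```python
import torch


def causal_mask_sketch(qlen: int, device: torch.device) -> torch.Tensor:
    """Hypothetical stand-in for causal_mask; not the ESPnet implementation."""
    # Lower-triangular ones (diagonal included) with a leading dim of size 1.
    return torch.tril(torch.ones(qlen, qlen, device=device)).unsqueeze(0)


def masked_attention_weights(scores: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Turn raw scores of shape (batch, qlen, qlen) into causal attention weights."""
    # Where the mask is 0, set the score to -inf so softmax gives zero weight.
    scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1)


device = torch.device("cpu")
mask = causal_mask_sketch(4, device)   # shape (1, 4, 4)
scores = torch.zeros(2, 4, 4)          # dummy scores for a batch of 2
weights = masked_attention_weights(scores, mask)
```

Because the mask has a leading dimension of size 1, it broadcasts across the batch. Every row keeps at least its diagonal entry, so the softmax never sees a row made entirely of `-inf`.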