espnet2.layers.mask_along_axis.mask_along_axis

espnet2.layers.mask_along_axis.mask_along_axis(spec: Tensor, spec_lengths: Tensor, mask_width_range: Sequence[int] = (0, 30), dim: int = 1, num_mask: int = 2, replace_with_zero: bool = True)

Apply a mask along a specified axis of a tensor, randomly masking out portions of the input tensor based on the specified parameters.

This function generates a random mask for the specified dimension of the input tensor and applies it, replacing the masked values with either zeros or the mean of the tensor, depending on the replace_with_zero parameter.
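The masking idea described above can be sketched in plain PyTorch. This is a minimal illustration and not the ESPnet source: the function name `mask_along_axis_sketch` is hypothetical, and treating the upper bound of `mask_width_range` as inclusive is an assumption of this sketch.

```python
import torch


def mask_along_axis_sketch(spec, mask_width_range=(0, 30), dim=1,
                           num_mask=2, replace_with_zero=True):
    """Hypothetical re-implementation, for illustration only."""
    if dim not in (1, 2):
        raise ValueError("dim must be 1 (Length) or 2 (Freq)")
    batch, size = spec.shape[0], spec.shape[dim]
    lo, hi = mask_width_range
    # Draw one width and one start index per (batch item, mask).
    # Assumption of this sketch: the upper bound `hi` is inclusive.
    width = torch.randint(lo, hi + 1, (batch, num_mask, 1))
    start = torch.randint(0, max(1, size - int(width.max())),
                          (batch, num_mask, 1))
    # pos: (1, 1, size); the comparison broadcasts to (batch, num_mask, size)
    pos = torch.arange(size)[None, None, :]
    mask = (start <= pos) & (pos < start + width)
    # Merge the individual masks of each batch item: (batch, size)
    mask = mask.any(dim=1)
    # Add a singleton axis so the mask broadcasts over the other dimension
    mask = mask.unsqueeze(2) if dim == 1 else mask.unsqueeze(1)
    fill = 0.0 if replace_with_zero else spec.mean().item()
    # masked_fill is out-of-place, so the input tensor is left untouched
    return spec.masked_fill(mask, fill)


torch.manual_seed(0)
# Shift values into [1, 2) so that zeros can only come from masking
spec = torch.rand(4, 10, 20) + 1.0
masked = mask_along_axis_sketch(spec, mask_width_range=(1, 3), dim=1)
# With dim=1, a masked time step is zeroed across every frequency bin
masked_steps = (masked == 0).all(dim=2)
```

With `dim=1` each mask hides a contiguous run of time steps for one batch item; with `dim=2` the same logic hides a band of frequency bins instead.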

Parameters:
- spec (torch.Tensor) – Input tensor of shape (Batch, Length, Freq).
- spec_lengths (torch.Tensor) – Lengths of the input sequences, one per batch item. Shape: (Batch,). Not used in this implementation; returned as-is.
- mask_width_range (Sequence[int], optional) – A tuple specifying the minimum and maximum width of the mask. The width is chosen randomly from this range. Defaults to (0, 30).
- dim (int, optional) – The dimension along which to apply the mask. Defaults to 1 (Length).
- num_mask (int, optional) – The number of masks to apply. Defaults to 2.
- replace_with_zero (bool, optional) – If True, masked values will be replaced with zeros. If False, masked values will be replaced with the mean of the input tensor. Defaults to True.
Returns: A tuple containing:
- torch.Tensor: The masked tensor of shape (Batch, Length, Freq).
- torch.Tensor: The original spec_lengths tensor.
Return type: tuple
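The two fill modes selected by replace_with_zero can be illustrated with a small, hand-built mask. This sketch uses only `torch.Tensor.masked_fill` and does not call ESPnet. It assumes the mean is taken over the whole input tensor before masking, which is how the parameter description reads.

```python
import torch

# Toy input of shape (Batch=1, Length=3, Freq=4) holding the values 0..11
spec = torch.arange(12, dtype=torch.float32).reshape(1, 3, 4)
# Hand-built mask: hide the middle time step across all frequency bins
mask = torch.tensor([False, True, False]).reshape(1, 3, 1)

# replace_with_zero=True behaviour: masked cells become 0
zero_filled = spec.masked_fill(mask, 0.0)
# replace_with_zero=False behaviour (assumed): masked cells become the
# mean of the whole unmasked input
mean_filled = spec.masked_fill(mask, spec.mean().item())
```

Zero filling is the usual choice when the features are already normalized around zero; mean filling keeps the masked region closer to the overall level of unnormalized features.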

Examples

>>> import torch
>>> from espnet2.layers.mask_along_axis import mask_along_axis
>>> spec = torch.rand(4, 10, 20)  # Batch of 4, Length 10, Freq 20
>>> spec_lengths = torch.tensor([10, 10, 10, 10])
>>> masked_spec, lengths = mask_along_axis(spec, spec_lengths)
>>> masked_spec.shape
torch.Size([4, 10, 20])

Raises:
- ValueError – If dim is not 1 or 2.
- TypeError – If mask_width_range is not a tuple of two integers.