espnet2.layers.mask_along_axis.MaskAlongAxis
class espnet2.layers.mask_along_axis.MaskAlongAxis(mask_width_range: int | Sequence[int] = (0, 30), num_mask: int = 2, dim: int | str = 'time', replace_with_zero: bool = True)
Bases: Module
Mask input tensor along a specified axis with random masking.
This class provides functionality to apply random masking to input tensors, allowing for more robust training of models by simulating missing data.
mask_width_range
The range of widths for the masks. Can be a single integer or a tuple defining the minimum and maximum width.
- Type: Union[int, Sequence[int]]
num_mask
The number of masks to apply to the input tensor.
- Type: int
dim
The dimension along which to apply the mask. Can be specified as an integer (1 for time, 2 for frequency) or as a string.
- Type: Union[int, str]
replace_with_zero
Whether to replace the masked values with zeros or with the mean of the tensor.
- Type: bool
Parameters:
- mask_width_range (Union[int, Sequence[int]]) – The range of widths for the masks.
- num_mask (int) – The number of masks to apply.
- dim (Union[int, str]) – The dimension along which to mask (‘time’ or ‘freq’).
- replace_with_zero (bool) – Flag to determine the value used to replace masked elements.
####### Examples
>>> mask_layer = MaskAlongAxis(mask_width_range=(0, 20), num_mask=3)
>>> masked_spec, lengths = mask_layer(spec_tensor, spec_lengths)
Raises:
- TypeError – If mask_width_range is not a tuple of two integers.
- ValueError – If dim is not an integer or one of the specified strings.
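The argument checks described above can be sketched in plain Python. This is a minimal illustration, not the ESPnet source: `normalize_args` is a hypothetical helper, and treating a bare integer `n` as the range `(0, n)` is an assumption that the text above does not state.

```python
from typing import Sequence, Tuple, Union


def normalize_args(
    mask_width_range: Union[int, Sequence[int]] = (0, 30),
    dim: Union[int, str] = "time",
) -> Tuple[Tuple[int, int], int]:
    """Hypothetical helper mirroring the documented argument checks."""
    # Assumption: a bare integer n is treated as the range (0, n).
    if isinstance(mask_width_range, int):
        mask_width_range = (0, mask_width_range)
    # The documented TypeError: the range must hold exactly two values.
    if len(mask_width_range) != 2:
        raise TypeError(
            f"mask_width_range must be a pair of ints, got {mask_width_range!r}"
        )

    # Map the documented string names onto tensor dimensions
    # (1 for time, 2 for frequency); any other string is a ValueError.
    if isinstance(dim, str):
        names = {"time": 1, "freq": 2}
        if dim not in names:
            raise ValueError("dim must be an int, 'time' or 'freq'")
        dim = names[dim]

    return (int(mask_width_range[0]), int(mask_width_range[1])), dim
```

With this sketch, `normalize_args(20, "freq")` yields `((0, 20), 2)`, while `normalize_args(dim="channel")` raises `ValueError`.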
NOTE
The masking is performed in-place for tensors that do not require gradients.
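To make the masking behaviour concrete, here is a small pure-Python sketch that masks random spans of frames in one `(Length, Freq)` utterance stored as nested lists. It is not the ESPnet implementation: `mask_along_time` is a hypothetical name, it always copies rather than working in-place, and the sampling bounds (width drawn from a half-open range, start chosen so the span fits) are assumptions.

```python
import random
from typing import List, Optional, Tuple


def mask_along_time(
    spec: List[List[float]],
    mask_width_range: Tuple[int, int] = (0, 30),
    num_mask: int = 2,
    replace_with_zero: bool = True,
    rng: Optional[random.Random] = None,
) -> List[List[float]]:
    """Mask random spans of frames in a non-empty (Length, Freq) utterance."""
    rng = rng or random.Random()
    length = len(spec)

    # Fill value: zero, or the mean over every element of the input.
    if replace_with_zero:
        fill = 0.0
    else:
        values = [v for frame in spec for v in frame]
        fill = sum(values) / len(values)

    out = [list(frame) for frame in spec]  # copy; the input stays untouched
    low, high = mask_width_range
    for _ in range(num_mask):
        # Assumption: width is drawn from [low, high) and the start from
        # [0, length - width), so the span fits inside the utterance.
        width = rng.randrange(low, high)
        start = rng.randrange(0, max(1, length - width))
        for t in range(start, min(length, start + width)):
            out[t] = [fill] * len(out[t])
    return out


# Usage: three masks of width 5..9 over a 50-frame, 2-bin "spectrogram".
spec = [[1.0, 2.0] for _ in range(50)]
masked = mask_along_time(spec, (5, 10), num_mask=3, rng=random.Random(0))
```

Masks may overlap, so the number of masked frames can be smaller than the sum of the sampled widths.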
extra_repr()
Returns a string representation of the MaskAlongAxis module’s attributes.
The string representation includes the mask width range, the number of masks, and the masking axis used in the masking process.
mask_width_range
The range of mask widths.
- Type: Union[int, Sequence[int]]
num_mask
The number of masks to apply.
- Type: int
mask_axis
The axis along which the masking is applied, either “time” or “freq”.
- Type: str
Returns: A formatted string representation of the module’s parameters.
Return type: str
####### Examples
>>> mask_layer = MaskAlongAxis(mask_width_range=(5, 10), num_mask=3)
>>> print(mask_layer.extra_repr())
mask_width_range=(5, 10), num_mask=3, axis=time
forward(spec: Tensor, spec_lengths: Tensor | None = None)
Forward function.
- Parameters: spec – Input tensor of shape (Batch, Length, Freq).