espnet2.layers.utterance_mvn.UtteranceMVN

class espnet2.layers.utterance_mvn.UtteranceMVN(norm_means: bool = True, norm_vars: bool = False, eps: float = 1e-20)

Bases: AbsNormalize

UtteranceMVN is a normalization layer that applies mean and variance normalization to input tensors, typically speech features.

It inherits from AbsNormalize and normalizes the mean and, optionally, the variance of each utterance in a batch.

Attributes:
- norm_means (bool) – If True, normalize the means of the input tensors.
- norm_vars (bool) – If True, normalize the variances of the input tensors.
- eps (float) – A small value to prevent division by zero during normalization.

Parameters:
- norm_means (bool) – Whether to normalize the means. Default is True.
- norm_vars (bool) – Whether to normalize the variances. Default is False.
- eps (float) – A small constant for numerical stability. Default is 1.0e-20.
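
For orientation, mean and variance normalization of one utterance is commonly written as below. The symbols (utterance index b, frame index t, feature index d, number of valid frames T_b) are notation introduced here, and treating eps as a floor on the standard deviation is an assumption about how the constant is applied, not something this page states.

```latex
\mu_{b,d} = \frac{1}{T_b} \sum_{t=1}^{T_b} x_{b,t,d},
\qquad
\sigma_{b,d} = \sqrt{\frac{1}{T_b} \sum_{t=1}^{T_b} \left( x_{b,t,d} - \mu_{b,d} \right)^2}

\hat{x}_{b,t,d} = \frac{x_{b,t,d} - \mu_{b,d}}{\max\left( \sigma_{b,d},\, \varepsilon \right)}
```

With norm_means=True and norm_vars=False (the defaults), only the subtraction in the numerator would apply.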

Examples:

>>> import torch
>>> from espnet2.layers.utterance_mvn import UtteranceMVN
>>> layer = UtteranceMVN(norm_means=True, norm_vars=False)
>>> x = torch.randn(5, 10, 20)  # Batch of 5, 10 time steps, 20 features
>>> ilens = torch.tensor([10, 10, 10, 10, 10])  # Every utterance uses all 10 steps (no padding)
>>> normalized_x, ilens = layer(x, ilens)

Initialize internal Module state, shared by both nn.Module and ScriptModule.

extra_repr()

Returns a string representation of the UtteranceMVN instance, including its attributes.

The string summarizes the normalization settings, specifically whether means and/or variances are normalized.

Attributes:
- norm_means (bool) – Indicates if the means should be normalized.
- norm_vars (bool) – Indicates if the variances should be normalized.

Returns: A formatted string summarizing the normalization configuration.
Return type: str

Examples:

>>> mvn = UtteranceMVN(norm_means=True, norm_vars=False)
>>> print(mvn.extra_repr())
norm_means=True, norm_vars=False

>>> mvn2 = UtteranceMVN(norm_means=False, norm_vars=True)
>>> print(mvn2.extra_repr())
norm_means=False, norm_vars=True
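
The string from extra_repr() is mainly useful because torch.nn.Module places it inside the parentheses when a module is printed. The sketch below imitates that behavior for a module without submodules, using a dependency-free stand-in; ModuleLike and UtteranceMVNLike are hypothetical names made up for illustration and are not part of ESPnet or PyTorch.

```python
class ModuleLike:
    """Hypothetical stand-in for the repr logic of a torch.nn.Module with no children."""

    def extra_repr(self) -> str:
        # Subclasses override this to describe their settings.
        return ""

    def __repr__(self) -> str:
        # A leaf module prints as ClassName(<extra_repr output>).
        return f"{type(self).__name__}({self.extra_repr()})"


class UtteranceMVNLike(ModuleLike):
    """Hypothetical class that only mirrors the extra_repr() output shown above."""

    def __init__(self, norm_means: bool = True, norm_vars: bool = False) -> None:
        self.norm_means = norm_means
        self.norm_vars = norm_vars

    def extra_repr(self) -> str:
        return f"norm_means={self.norm_means}, norm_vars={self.norm_vars}"


print(UtteranceMVNLike())  # UtteranceMVNLike(norm_means=True, norm_vars=False)
```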

forward(x: Tensor, ilens: Tensor | None = None) → Tuple[Tensor, Tensor]

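Normalizes a batch of utterances and returns the result together with the utterance lengths.

Parameters:
- x (Tensor) – Input tensor of shape (B, L, …), where B is the batch size and L is the sequence length.
- ilens (Tensor | None) – Length of each utterance, shape (B,). Optional; defaults to None.
Returns: A tuple of the normalized tensor and the utterance lengths.
Return type: Tuple[Tensor, Tensor]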
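
To make the roles of x and ilens concrete, here is a minimal, dependency-free sketch of per-utterance normalization on nested lists shaped (B, L, D). It is not the ESPnet implementation: the function name utterance_mvn_sketch is hypothetical, and three behaviors are assumptions made for illustration (a missing ilens means every frame is valid, eps is a floor on the standard deviation, and padded frames are left at zero). The library may handle any of these differently.

```python
import math
from typing import List, Optional, Tuple

Batch = List[List[List[float]]]  # nested lists shaped (B, L, D)


def utterance_mvn_sketch(
    x: Batch,
    ilens: Optional[List[int]] = None,
    norm_means: bool = True,
    norm_vars: bool = False,
    eps: float = 1.0e-20,
) -> Tuple[Batch, List[int]]:
    """Illustrative per-utterance mean/variance normalization (not ESPnet code)."""
    if ilens is None:
        # Assumption: with no lengths given, every frame counts as valid.
        ilens = [len(utt) for utt in x]

    out: Batch = []
    for utt, n in zip(x, ilens):
        dim = len(utt[0])
        valid = utt[:n]  # frames at index >= n are treated as padding
        # Statistics use valid frames only, one value per feature dimension.
        mean = [sum(f[d] for f in valid) / n for d in range(dim)]
        var = [sum((f[d] - mean[d]) ** 2 for f in valid) / n for d in range(dim)]
        # Assumption: eps acts as a floor on the standard deviation.
        std = [max(math.sqrt(v), eps) for v in var]

        normed = []
        for t, frame in enumerate(utt):
            if t >= n:
                normed.append([0.0] * dim)  # assumption: padding stays at zero
                continue
            row = []
            for d in range(dim):
                v = frame[d] - mean[d] if norm_means else frame[d]
                row.append(v / std[d] if norm_vars else v)
            normed.append(row)
        out.append(normed)
    return out, ilens


# Two utterances padded to length 3; the first has only 2 valid frames.
x = [
    [[1.0, 10.0], [3.0, 30.0], [0.0, 0.0]],
    [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]],
]
y, lens = utterance_mvn_sketch(x, [2, 3])
print(y[0])  # [[-1.0, -10.0], [1.0, 10.0], [0.0, 0.0]]
```

The point of the example is that the padded third frame of the first utterance does not enter the mean: the statistics come from the first ilens[b] frames only.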