espnet2.gan_codec.shared.encoder.seanet.ConvLayerNorm
class espnet2.gan_codec.shared.encoder.seanet.ConvLayerNorm(normalized_shape: int | List[int] | Size, **kwargs)
Bases: LayerNorm
Convolution-friendly LayerNorm that moves the channel dimension to the last position before running the normalization and moves it back to its original position right after.
This layer is particularly useful for convolutional neural networks where normalization needs to be applied along the channel dimension.
- Parameters:
- normalized_shape (Union[int, List[int], torch.Size]) – Shape of the input tensor that will be normalized. This can be an integer or a list/tuple of integers specifying the size of each dimension to normalize over.
- **kwargs – Additional keyword arguments to pass to the parent LayerNorm class.
- Returns: The normalized output tensor with the same shape as the input tensor.
- Return type: torch.Tensor
####### Examples
>>> import torch
>>> layer_norm = ConvLayerNorm(normalized_shape=16)
>>> input_tensor = torch.randn(8, 16, 50) # (batch_size, channels, time)
>>> output_tensor = layer_norm(input_tensor)
>>> output_tensor.shape
torch.Size([8, 16, 50])
NOTE
This implementation is optimized for use in convolutional layers where the input tensor shape is typically (batch_size, channels, time).
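The permute-normalize-permute behavior described above can be sketched as a small subclass of `nn.LayerNorm`. This is a hypothetical re-implementation for illustration (the class name `ConvLayerNormSketch` is not part of ESPnet), assuming a 3-D `(batch, channels, time)` input:

```python
import torch
import torch.nn as nn


class ConvLayerNormSketch(nn.LayerNorm):
    """Illustrative sketch: normalize over channels of a (B, C, T) tensor."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (B, C, T) -> (B, T, C) so LayerNorm sees channels as the last dim
        x = x.transpose(1, 2)
        x = super().forward(x)
        # (B, T, C) -> (B, C, T): restore the original layout
        return x.transpose(1, 2)


norm = ConvLayerNormSketch(16)
y = norm(torch.randn(8, 16, 50))
print(y.shape)  # torch.Size([8, 16, 50])
```

Subclassing `nn.LayerNorm` keeps the learnable affine parameters and `eps` handling intact; only the tensor layout around the normalization changes.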
forward(x)
Applies the convolutional layer normalization to the input tensor.
This method rearranges the input tensor to ensure that the channels are in the last dimension before applying the LayerNorm. After the normalization, it rearranges the tensor back to its original shape.
- Parameters: x (torch.Tensor) – The input tensor with shape (B, C, T), where B is the batch size, C is the number of channels, and T is the length of the sequence.
- Returns: The normalized output tensor with the same shape as the input tensor.
- Return type: torch.Tensor
####### Examples
>>> layer_norm = ConvLayerNorm(normalized_shape=64)
>>> input_tensor = torch.randn(32, 64, 100) # (B, C, T)
>>> output_tensor = layer_norm(input_tensor)
>>> output_tensor.shape
torch.Size([32, 64, 100])
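The rearrangement performed by `forward` can also be written out explicitly with a plain `nn.LayerNorm`, which makes it easy to verify that each `(batch, time)` position is normalized over the channel axis (a sketch, not ESPnet code):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 32, 10)  # (B, C, T)
ln = nn.LayerNorm(32)

# The same computation as the forward() described above, spelled out:
y = ln(x.transpose(1, 2)).transpose(1, 2)

print(y.shape)  # torch.Size([4, 32, 10])
# With the default affine init (weight=1, bias=0), the per-position
# channel mean of the output is ~0:
print(torch.allclose(y.mean(dim=1), torch.zeros(4, 10), atol=1e-5))  # True
```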