espnet2.diar.layers.tcn_nomask.ChannelwiseLayerNorm
class espnet2.diar.layers.tcn_nomask.ChannelwiseLayerNorm(channel_size)
Bases: Module
Channel-wise Layer Normalization (cLN).
This layer normalizes the input over the channel dimension: for each batch element and time step, the mean and variance are computed across channels. Learnable parameters gamma and beta then scale and shift the normalized output, which stabilizes training in the TCN blocks of this module.
gamma
Learnable scaling parameter of shape (1, N, 1), initialized to 1.
- Type: nn.Parameter
beta
Learnable shifting parameter of shape (1, N, 1), initialized to 0.
- Type: nn.Parameter
- Parameters: channel_size (int) – The number of channels (N) in the input tensor.
reset_parameters()
Resets the parameters gamma and beta to their initial values.
forward()
Applies channel-wise normalization to the input tensor.
######### Examples
>>> layer_norm = ChannelwiseLayerNorm(channel_size=64)
>>> input_tensor = torch.randn(32, 64, 100) # [M, N, K]
>>> output_tensor = layer_norm(input_tensor)
>>> print(output_tensor.shape) # Output: torch.Size([32, 64, 100])
NOTE
The input tensor must have three dimensions: (batch_size, channel_size, sequence_length). The normalization follows cLN_y = gamma * (y - mean) / sqrt(var + EPS) + beta, where mean and var are computed over the channel dimension for each batch element and time step, and EPS is a small constant for numerical stability.
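For reference, a minimal sketch of this computation in plain PyTorch (assuming the standard Conv-TasNet formulation and an illustrative EPS = 1e-8; the constant defined in the source may differ):
>>> import torch
>>> EPS = 1e-8  # assumed value, for illustration only
>>> y = torch.randn(32, 64, 100)  # [M, N, K]
>>> gamma, beta = torch.ones(1, 64, 1), torch.zeros(1, 64, 1)
>>> mean = y.mean(dim=1, keepdim=True)  # [M, 1, K], statistics over channels
>>> var = ((y - mean) ** 2).mean(dim=1, keepdim=True)  # [M, 1, K]
>>> cLN_y = gamma * (y - mean) / torch.sqrt(var + EPS) + beta
>>> cLN_y.shape
torch.Size([32, 64, 100])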
forward(y)
Forward pass for channel-wise layer normalization.
This method normalizes the input tensor over the channel dimension (per batch element and time step) and applies the learnable scale gamma and shift beta.
- Parameters: y – A tensor of shape [M, N, K], where: M is the batch size, N is the channel size, K is the sequence length.
- Returns: The normalized tensor of shape [M, N, K].
- Return type: cLN_y
######### Examples
>>> layer_norm = ChannelwiseLayerNorm(channel_size=64)
>>> y = torch.randn(8, 64, 100)  # [M, N, K]
>>> cLN_y = layer_norm(y)
>>> print(cLN_y.shape)
torch.Size([8, 64, 100])
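As a quick sanity check (a sketch; with the freshly initialized gamma = 1 and beta = 0, the output is standardized over channels for every frame):
>>> out = layer_norm(torch.randn(2, 64, 50))
>>> bool(out.mean(dim=1).abs().max() < 1e-4)  # per-frame channel mean ~ 0
True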
reset_parameters()
Resets the parameters gamma and beta to their initial values (1 and 0, respectively).
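For illustration, a short usage sketch (assumes the documented initial values of 1 and 0):
>>> layer_norm = ChannelwiseLayerNorm(channel_size=64)
>>> _ = layer_norm.gamma.data.fill_(2.0)  # perturb the scale parameter
>>> layer_norm.reset_parameters()  # restores gamma to 1 and beta to 0
>>> bool((layer_norm.gamma == 1).all())
True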