espnet2.enh.layers.ncsnpp_utils.normalization.ConditionalNoneNorm2d
class espnet2.enh.layers.ncsnpp_utils.normalization.ConditionalNoneNorm2d(num_features, num_classes, bias=True)
Bases: Module
Conditional None Normalization Layer.
Despite its name, this layer performs no statistical normalization (no mean or variance is computed). Instead, it applies a class-conditional affine transform: the input tensor is scaled (and, when bias=True, shifted) by parameters looked up from an embedding table indexed by the provided class labels.
num_features
The number of features in the input tensor.
- Type: int
bias
A flag indicating whether to use a bias term.
- Type: bool
embed
An embedding layer that maps class indices to scaling parameters.
Type: nn.Embedding
Parameters:
- num_features (int) – Number of input features (channels).
- num_classes (int) – Number of classes for embedding.
- bias (bool, optional) – If True, includes a class-conditional shift (beta) in addition to the scale (gamma). Defaults to True.
Returns: The output tensor produced by forward(), scaled (and optionally shifted) by parameters derived from the input class index.
Return type: Tensor
####### Examples
>>> import torch
>>> layer = ConditionalNoneNorm2d(num_features=64, num_classes=10)
>>> x = torch.randn(8, 64, 32, 32) # Batch of 8 images
>>> y = torch.randint(0, 10, (8,)) # Random class indices for batch
>>> output = layer(x, y)
>>> print(output.shape) # Should be torch.Size([8, 64, 32, 32])
NOTE
This layer is primarily used in scenarios where no specific normalization is desired but conditional scaling based on class information is still beneficial.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
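For reference, the sketch below illustrates one way such a layer can be implemented. It is an assumption-laden illustration, not the ESPnet source: it presumes that the scale (gamma) and optional shift (beta) are stored side by side in a single nn.Embedding, split inside forward(), and initialized near the identity transform.

import torch
import torch.nn as nn

class NoneNorm2dSketch(nn.Module):
    # Hypothetical re-implementation for illustration only; the layout and
    # initialization choices below are assumptions, not the ESPnet source.
    def __init__(self, num_features, num_classes, bias=True):
        super().__init__()
        self.num_features = num_features
        self.bias = bias
        if bias:
            # One embedding row per class: [gamma (num_features) | beta (num_features)].
            self.embed = nn.Embedding(num_classes, num_features * 2)
            self.embed.weight.data[:, :num_features].normal_(1.0, 0.02)  # gamma ~ 1
            self.embed.weight.data[:, num_features:].zero_()             # beta = 0
        else:
            self.embed = nn.Embedding(num_classes, num_features)
            self.embed.weight.data.normal_(1.0, 0.02)

    def forward(self, x, y):
        if self.bias:
            # Split the class embedding into per-channel scale and shift.
            gamma, beta = self.embed(y).chunk(2, dim=-1)
            return gamma.view(-1, self.num_features, 1, 1) * x + beta.view(-1, self.num_features, 1, 1)
        gamma = self.embed(y)
        return gamma.view(-1, self.num_features, 1, 1) * x

Because no running statistics are kept, such a layer behaves identically in train and eval mode; all learnable state lives in the embedding table.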
forward(x, y)
Applies the Conditional None Normalization to the input tensor.
This method scales the input tensor x using a class embedding looked up from y. If the bias attribute is set to True, the class embedding is split into a scale (gamma) and a shift (beta), which are applied channel-wise. If bias is False, only the scale is applied.
- Parameters:
- x (torch.Tensor) – Input tensor of shape (N, C, H, W), where N is the batch size, C is the number of features, H is the height, and W is the width.
- y (torch.Tensor) – Class indices of shape (N,) used to index the embedding layer.
- Returns: The output tensor after applying Conditional None Normalization, with the same shape as the input tensor x.
- Return type: torch.Tensor
####### Examples
>>> model = ConditionalNoneNorm2d(num_features=64, num_classes=10)
>>> x = torch.randn(8, 64, 32, 32) # Batch of 8 images
>>> y = torch.randint(0, 10, (8,)) # Random class indices
>>> output = model(x, y)
>>> print(output.shape) # Should be torch.Size([8, 64, 32, 32])
NOTE
This normalization technique does not change the input tensor shape, and it is particularly useful for tasks where conditioning on class information is necessary.
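Under the assumption that the embedding stores gamma and beta concatenated along the feature dimension, the forward pass can be reproduced manually. The check below is illustrative only and depends on that assumed internal layout:

>>> import torch
>>> from espnet2.enh.layers.ncsnpp_utils.normalization import ConditionalNoneNorm2d
>>> layer = ConditionalNoneNorm2d(num_features=64, num_classes=10)
>>> x = torch.randn(8, 64, 32, 32)
>>> y = torch.randint(0, 10, (8,))
>>> gamma, beta = layer.embed(y).chunk(2, dim=-1)  # assumed [gamma | beta] layout
>>> manual = gamma.view(-1, 64, 1, 1) * x + beta.view(-1, 64, 1, 1)
>>> print(torch.allclose(layer(x, y), manual))  # expected: True if the assumed layout matches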