espnet2.enh.layers.ncsnpp_utils.layers.ConvMeanPool
class espnet2.enh.layers.ncsnpp_utils.layers.ConvMeanPool(input_dim, output_dim, kernel_size=3, biases=True, adjust_padding=False)
Bases: Module
Convolutional layer followed by mean pooling.
This layer applies a convolution followed by mean pooling on the input tensor. The pooling averages the four stride-2 shifted sub-grids of the convolution output (equivalent to 2x2 average pooling), halving the spatial resolution of the feature map while preserving the learned features.
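For intuition, the pooling step can be sketched as averaging the four stride-2 shifted views of a tensor. This is a minimal illustration, not the layer's own code; for even heights and widths it coincides with 2x2 average pooling:

>>> import torch
>>> def mean_pool_quadrants(x):
...     # Average the four stride-2 shifted sub-grids of a (B, C, H, W) tensor.
...     return (x[:, :, ::2, ::2] + x[:, :, 1::2, ::2]
...             + x[:, :, ::2, 1::2] + x[:, :, 1::2, 1::2]) / 4.0
>>> mean_pool_quadrants(torch.randn(1, 16, 32, 32)).shape
torch.Size([1, 16, 16, 16])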
conv
The convolutional layer used to process the input.
Type: nn.Module
Parameters:
- input_dim (int) – The number of input channels.
- output_dim (int) – The number of output channels.
- kernel_size (int , optional) – The size of the convolutional kernel. Defaults to 3.
- biases (bool , optional) – Whether to include a bias term in the convolution. Defaults to True.
- adjust_padding (bool , optional) – If True, applies an additional one-pixel zero padding on the left and top before the convolution. Defaults to False.
####### Examples
>>> import torch
>>> from espnet2.enh.layers.ncsnpp_utils.layers import ConvMeanPool
>>> layer = ConvMeanPool(input_dim=3, output_dim=16, kernel_size=3)
>>> input_tensor = torch.randn(1, 3, 32, 32) # Batch size of 1
>>> output_tensor = layer(input_tensor)
>>> output_tensor.shape
torch.Size([1, 16, 16, 16]) # Output dimensions after pooling
NOTE
If adjust_padding is set to True, the convolution will be applied with an additional padding of 1 pixel on the left and top sides.
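For illustration only (a sketch; the exact composition inside the layer may differ), that left/top padding can be expressed with torch.nn.ZeroPad2d, which turns an odd spatial size into an even one before the factor-2 pooling:

>>> import torch
>>> import torch.nn as nn
>>> pad = nn.ZeroPad2d((1, 0, 1, 0))  # (left, right, top, bottom)
>>> pad(torch.randn(1, 3, 31, 31)).shape
torch.Size([1, 3, 32, 32])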
Initialize internal Module state, shared by both nn.Module and ScriptModule.
forward(inputs)
Forward pass of the convolution followed by mean pooling.
This method applies the convolutional layer to inputs and then averages the four stride-2 shifted sub-grids of the result, downsampling the feature map by a factor of two in height and width.
- Parameters:
- inputs (torch.Tensor) – Input tensor of shape (B, C, H, W), where B is the batch size, C is the number of channels, H is the height, and W is the width.
- Returns: Output tensor of shape (B, output_dim, H // 2, W // 2).
- Return type: torch.Tensor
NOTE
The number of channels in inputs must match input_dim.
####### Examples
>>> layer = ConvMeanPool(input_dim=64, output_dim=128)
>>> x = torch.randn(8, 64, 32, 32)  # batch of 8 feature maps
>>> output = layer(x)
>>> print(output.shape)
torch.Size([8, 128, 16, 16])
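As a sanity check (a sketch, assuming the pooling is the stride-2 quadrant averaging described above), the forward output for even-sized inputs should match applying the conv attribute followed by 2x2 average pooling:

>>> import torch
>>> import torch.nn.functional as F
>>> layer = ConvMeanPool(input_dim=3, output_dim=16, kernel_size=3)
>>> x = torch.randn(2, 3, 32, 32)
>>> torch.allclose(layer(x), F.avg_pool2d(layer.conv(x), kernel_size=2), atol=1e-6)
True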