espnet2.enh.layers.ncsnpp_utils.up_or_down_sampling.Conv2d


class espnet2.enh.layers.ncsnpp_utils.up_or_down_sampling.Conv2d(in_ch, out_ch, kernel, up=False, down=False, resample_kernel=(1, 3, 3, 1), use_bias=True, kernel_init=None)

Bases: Module

Conv2d layer with optimal upsampling and downsampling (StyleGAN2).

This class implements a 2D convolutional layer that can optionally upsample or downsample its input, following the resampling approach used in StyleGAN2. With up=True or down=True, the convolution is combined with resampling through the FIR filter given by resample_kernel; with both flags False, the layer acts as a standard convolution.

weight

The learnable weight parameter for the convolution operation.

Type: torch.nn.Parameter

bias

The learnable bias parameter for the convolution operation, if use_bias is True.

Type: torch.nn.Parameter, optional

up

Indicates whether the layer performs upsampling.

Type: bool

down

Indicates whether the layer performs downsampling.

Type: bool

resample_kernel

The kernel used for resampling during upsampling/downsampling.

Type: tuple

kernel

The size of the convolution kernel.

Type: int

use_bias

Flag to indicate if bias should be used.

Type: bool
Parameters:
- in_ch (int) – Number of input channels.
- out_ch (int) – Number of output channels.
- kernel (int) – Size of the convolution kernel (must be odd and >= 1).
- up (bool, optional) – If True, the layer will perform upsampling. Default is False.
- down (bool, optional) – If True, the layer will perform downsampling. Default is False.
- resample_kernel (tuple, optional) – Kernel for resampling. Default is (1, 3, 3, 1).
- use_bias (bool, optional) – If True, a bias term will be added to the output. Default is True.
- kernel_init (callable, optional) – Function to initialize the kernel weights.
Returns: The output tensor after applying the convolution, upsampling, or downsampling.
Return type: torch.Tensor
Raises: AssertionError – If both up and down are set to True, or if kernel is not an odd integer >= 1.
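
The constraints listed under Parameters and Raises can be sketched as a small standalone check. This is only an illustration: `check_conv2d_args` is a hypothetical helper, not part of ESPnet, and it assumes exactly the two documented rules (up and down are mutually exclusive, and kernel is odd and at least 1).

```python
def check_conv2d_args(kernel: int, up: bool = False, down: bool = False) -> None:
    """Hypothetical helper mirroring the documented constraints (not ESPnet API)."""
    # Upsampling and downsampling are mutually exclusive.
    assert not (up and down), "up and down cannot both be True"
    # The kernel size must be a positive odd integer.
    assert kernel >= 1 and kernel % 2 == 1, "kernel must be odd and >= 1"


check_conv2d_args(kernel=3, up=True)   # valid: passes silently
check_conv2d_args(kernel=1)            # valid: plain 1x1 convolution
```

Calling the helper with `up=True, down=True` or with an even kernel such as `kernel=4` raises AssertionError, which matches the behaviour described for the layer's constructor.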

####### Examples

>>> conv_layer = Conv2d(in_ch=3, out_ch=64, kernel=3, up=True)
>>> input_tensor = torch.randn(1, 3, 64, 64)
>>> output_tensor = conv_layer(input_tensor)
>>> print(output_tensor.shape)
torch.Size([1, 64, 128, 128])

>>> conv_layer = Conv2d(in_ch=3, out_ch=64, kernel=3, down=True)
>>> output_tensor = conv_layer(input_tensor)
>>> print(output_tensor.shape)
torch.Size([1, 64, 32, 32])
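
The shapes printed in the examples above follow a simple pattern, which can be sketched without PyTorch. The helper `expected_output_shape` is hypothetical (not part of ESPnet) and assumes a fixed resampling factor of 2 and even spatial sizes, as the examples suggest; other input sizes are not covered here.

```python
def expected_output_shape(in_shape, out_ch, up=False, down=False):
    """Hypothetical helper (not ESPnet API): predict the (N, C, H, W) output shape."""
    assert not (up and down), "up and down cannot both be True"
    n, _, h, w = in_shape
    if up:
        h, w = 2 * h, 2 * w      # assumed fixed upsampling factor of 2
    elif down:
        h, w = h // 2, w // 2    # assumed fixed downsampling factor of 2, even H and W
    return (n, out_ch, h, w)


print(expected_output_shape((1, 3, 64, 64), 64, up=True))    # (1, 64, 128, 128)
print(expected_output_shape((1, 3, 64, 64), 64, down=True))  # (1, 64, 32, 32)
```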

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Conv2d layer with optimal upsampling and downsampling (StyleGAN2).

Depending on the up and down flags, the layer performs an upsampling convolution, a downsampling convolution, or a standard convolution. The resampling paths follow the FIR-filter approach used in the StyleGAN2 architecture.

weight

The learnable weight tensor for the convolution.

Type: nn.Parameter

bias

The learnable bias tensor for the convolution.

Type: nn.Parameter, optional

up

Flag to indicate if upsampling should be performed.

Type: bool

down

Flag to indicate if downsampling should be performed.

Type: bool

resample_kernel

The kernel used for resampling.

Type: tuple

kernel

The size of the convolution kernel.

Type: int

use_bias

Flag to indicate if bias should be used in the convolution.

Type: bool
Parameters:
- in_ch (int) – Number of input channels.
- out_ch (int) – Number of output channels.
- kernel (int) – Size of the convolution kernel (must be odd).
- up (bool, optional) – If True, the layer will perform upsampling. Defaults to False.
- down (bool, optional) – If True, the layer will perform downsampling. Defaults to False.
- resample_kernel (tuple, optional) – FIR filter kernel for resampling. Defaults to (1, 3, 3, 1).
- use_bias (bool, optional) – If True, adds a bias term. Defaults to True.
- kernel_init (callable, optional) – A function to initialize the kernel weights. Defaults to None.
Raises: AssertionError – If both up and down are set to True, or if kernel is not an odd integer >= 1.
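
To make the resample_kernel parameter more concrete: in StyleGAN2-style resampling, a 1D list of FIR taps such as (1, 3, 3, 1) is commonly expanded into a separable 2D filter by an outer product and normalized so its entries sum to 1. The sketch below assumes that convention; `make_fir_kernel_2d` is a hypothetical name, not the function ESPnet uses, and any extra gain applied during upsampling is left out.

```python
def make_fir_kernel_2d(taps=(1, 3, 3, 1)):
    """Hypothetical helper: build a normalized 2D FIR filter from 1D taps."""
    # Outer product of the taps with themselves gives a separable 2D filter.
    outer = [[a * b for b in taps] for a in taps]
    # Normalize so the filter preserves the mean signal level.
    total = sum(sum(row) for row in outer)
    return [[value / total for value in row] for row in outer]


kernel_2d = make_fir_kernel_2d()
for row in kernel_2d:
    print([round(value, 4) for value in row])
```

With the default taps, the resulting 4x4 filter weights the centre samples most heavily (9/64 each) and the corners least (1/64 each), which acts as a low-pass filter around the resampling step.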

####### Examples

>>> conv_layer = Conv2d(in_ch=3, out_ch=16, kernel=3, up=True)
>>> input_tensor = torch.randn(1, 3, 64, 64)  # Batch of 3-channel images
>>> output_tensor = conv_layer(input_tensor)
>>> print(output_tensor.shape)  # Output shape after upsampling
torch.Size([1, 16, 128, 128])

NOTE

This implementation is inspired by the StyleGAN2 architecture, and aims to provide efficient upsampling and downsampling operations.