espnet2.enh.layers.ncsnpp_utils.up_or_down_sampling.Conv2d
class espnet2.enh.layers.ncsnpp_utils.up_or_down_sampling.Conv2d(in_ch, out_ch, kernel, up=False, down=False, resample_kernel=(1, 3, 3, 1), use_bias=True, kernel_init=None)
Bases: Module
Conv2d layer with optimal upsampling and downsampling (StyleGAN2).
This class implements a 2D convolutional layer that can optionally upsample or downsample its input, following the approach used in StyleGAN2. Both the convolution kernel size and the FIR kernel used for resampling are configurable.
weight
The learnable weight parameter for the convolution operation.
- Type: torch.nn.Parameter
bias
The learnable bias parameter for the convolution operation, if use_bias is True.
- Type: torch.nn.Parameter, optional
up
Indicates whether the layer performs upsampling.
- Type: bool
down
Indicates whether the layer performs downsampling.
- Type: bool
resample_kernel
The kernel used for resampling during upsampling/downsampling.
- Type: tuple
kernel
The size of the convolution kernel.
- Type: int
use_bias
Flag to indicate if bias should be used.
- Type: bool
Parameters:
- in_ch (int) – Number of input channels.
- out_ch (int) – Number of output channels.
- kernel (int) – Size of the convolution kernel (must be odd and >= 1).
- up (bool, optional) – If True, the layer will perform upsampling. Default is False.
- down (bool, optional) – If True, the layer will perform downsampling. Default is False.
- resample_kernel (tuple, optional) – FIR filter kernel for resampling. Default is (1, 3, 3, 1).
- use_bias (bool, optional) – If True, a bias term will be added to the output. Default is True.
- kernel_init (callable, optional) – Function to initialize the kernel weights. Default is None.
Returns: The output tensor after applying the convolution, upsampling, or downsampling.
Return type: torch.Tensor
Raises: AssertionError – If both up and down are set to True, or if kernel is not an odd integer >= 1.
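The two conditions listed under Raises can be sketched as a standalone check. This is a minimal illustration, not the ESPnet source: the helper names `check_conv2d_args` and `is_valid` are hypothetical, and the code only mirrors the documented constraints (up and down are mutually exclusive; kernel is odd and at least 1).

```python
def check_conv2d_args(kernel: int, up: bool = False, down: bool = False) -> None:
    """Hypothetical helper mirroring the documented Conv2d constraints."""
    # up and down are mutually exclusive.
    if up and down:
        raise AssertionError("up and down cannot both be True")
    # The kernel size must be a positive odd integer.
    if kernel < 1 or kernel % 2 != 1:
        raise AssertionError("kernel must be odd and >= 1")


def is_valid(kernel: int, up: bool = False, down: bool = False) -> bool:
    """Return True if the arguments pass the check, False otherwise."""
    try:
        check_conv2d_args(kernel, up=up, down=down)
    except AssertionError:
        return False
    return True
```

For instance, `is_valid(3, up=True)` passes, while `is_valid(4)` and `is_valid(3, up=True, down=True)` do not.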
Examples:
>>> import torch
>>> from espnet2.enh.layers.ncsnpp_utils.up_or_down_sampling import Conv2d
>>> conv_layer = Conv2d(in_ch=3, out_ch=64, kernel=3, up=True)
>>> input_tensor = torch.randn(1, 3, 64, 64)
>>> output_tensor = conv_layer(input_tensor)
>>> print(output_tensor.shape)
torch.Size([1, 64, 128, 128])
>>> conv_layer = Conv2d(in_ch=3, out_ch=64, kernel=3, down=True)
>>> output_tensor = conv_layer(input_tensor)
>>> print(output_tensor.shape)
torch.Size([1, 64, 32, 32])
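The shapes printed in these examples follow a simple pattern: the channel dimension becomes out_ch, and each spatial dimension is doubled with up=True or halved with down=True. The sketch below reproduces that arithmetic without PyTorch. The helper `expected_output_shape` is hypothetical, the fixed factor of 2 is inferred from the two examples rather than stated on this page, and the assumption that a plain convolution (neither flag set) preserves the spatial size is likewise not confirmed here.

```python
from typing import Tuple

Shape = Tuple[int, int, int, int]  # (batch, channels, height, width)


def expected_output_shape(
    in_shape: Shape,
    out_ch: int,
    up: bool = False,
    down: bool = False,
    factor: int = 2,  # assumed resampling factor, inferred from the examples
) -> Shape:
    """Hypothetical helper: predict the output shape of the layer."""
    if up and down:
        raise ValueError("up and down cannot both be True")
    batch, _in_ch, height, width = in_shape
    if up:
        height, width = height * factor, width * factor
    elif down:
        height, width = height // factor, width // factor
    # Assumption: with neither flag set, the spatial size is unchanged.
    return (batch, out_ch, height, width)


up_shape = expected_output_shape((1, 3, 64, 64), out_ch=64, up=True)
down_shape = expected_output_shape((1, 3, 64, 64), out_ch=64, down=True)
```

Here `up_shape` and `down_shape` match the two sizes printed in the examples.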
Initialize internal Module state, shared by both nn.Module and ScriptModule.
forward(x)
Apply the layer to the input tensor x.
Depending on how the layer was constructed, this performs an upsampling convolution (up=True), a downsampling convolution (down=True), or a standard convolution (both False). The resampling follows the approach used in StyleGAN2.
weight
The learnable weight tensor for the convolution.
- Type: torch.nn.Parameter
bias
The learnable bias tensor for the convolution, if use_bias is True.
- Type: torch.nn.Parameter, optional
up
Flag to indicate if upsampling should be performed.
- Type: bool
down
Flag to indicate if downsampling should be performed.
- Type: bool
resample_kernel
The kernel used for resampling.
- Type: tuple
kernel
The size of the convolution kernel.
- Type: int
use_bias
Flag to indicate if bias should be used in the convolution.
- Type: bool
Parameters:
- in_ch (int) – Number of input channels.
- out_ch (int) – Number of output channels.
- kernel (int) – Size of the convolution kernel (must be odd and >= 1).
- up (bool, optional) – If True, the layer will perform upsampling. Defaults to False.
- down (bool, optional) – If True, the layer will perform downsampling. Defaults to False.
- resample_kernel (tuple, optional) – FIR filter kernel for resampling. Defaults to (1, 3, 3, 1).
- use_bias (bool, optional) – If True, adds a bias term. Defaults to True.
- kernel_init (callable, optional) – A function to initialize the kernel weights. Defaults to None.
Raises: AssertionError – If both up and down are set to True, or if kernel is not an odd integer >= 1.
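The default resample_kernel of (1, 3, 3, 1) is a set of 1D FIR filter taps. In StyleGAN2-style resampling code, such taps are commonly expanded into a separable 2D filter by taking their outer product and normalizing so that the entries sum to 1. The sketch below illustrates that step in plain Python. The helper name `make_fir_kernel_2d` is hypothetical, and whether this layer applies exactly this normalization internally is an assumption, not something this page confirms.

```python
from typing import List, Sequence


def make_fir_kernel_2d(taps: Sequence[float]) -> List[List[float]]:
    """Hypothetical helper: build a normalized 2D FIR kernel from 1D taps."""
    # The outer product of the taps with themselves gives a separable 2D filter.
    # Dividing by the squared tap sum makes all entries sum to 1.
    total = float(sum(taps)) ** 2
    return [[(a * b) / total for b in taps] for a in taps]


kernel_2d = make_fir_kernel_2d((1, 3, 3, 1))

# Print the 4x4 kernel row by row.
for row in kernel_2d:
    print(["%.6f" % value for value in row])
```

With these taps the corner weights are 1/64 and the four central weights are 9/64, so the filter acts as a small low-pass (blur) kernel.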
Examples:
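>>> import torch
>>> from espnet2.enh.layers.ncsnpp_utils.up_or_down_sampling import Conv2d
>>> conv_layer = Conv2d(in_ch=3, out_ch=16, kernel=3, up=True)
>>> input_tensor = torch.randn(1, 3, 64, 64)  # One 3-channel 64x64 input
>>> output_tensor = conv_layer(input_tensor)
>>> print(output_tensor.shape)  # Output shape after upsampling
torch.Size([1, 16, 128, 128])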
NOTE
This implementation is inspired by the StyleGAN2 architecture and aims to provide efficient upsampling and downsampling operations.