espnet2.enh.layers.ncsnpp_utils.upfirdn2d.upfirdn2d_native

espnet2.enh.layers.ncsnpp_utils.upfirdn2d.upfirdn2d_native(input, kernel, up_x, up_y, down_x, down_y, pad_x0, pad_x1, pad_y0, pad_y1)

UpFirDn2d functions for upsampling, padding, FIR filtering, and downsampling.

This module provides the upfirdn2d and upfirdn2d_native functions, which are used for upsampling, padding, and downsampling operations on 2D input data with an optional FIR filter applied. The functions are ported from https://github.com/NVlabs/stylegan2.

The upfirdn2d function serves as a high-level interface that utilizes upfirdn2d_native for the actual processing.
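
The four stages named above (upsample, pad, FIR filter, downsample) can be sketched in a few lines of PyTorch. This is a minimal illustration, not the ESPnet source: the name upfirdn2d_sketch is hypothetical, and it assumes zero-insertion upsampling, zero padding with non-negative pad values, and an unpadded ("valid") convolution, as in the StyleGAN2 reference operation.

```python
import torch
import torch.nn.functional as F


def upfirdn2d_sketch(x, kernel, up_x, up_y, down_x, down_y,
                     pad_x0, pad_x1, pad_y0, pad_y1):
    """Hypothetical re-implementation for illustration (non-negative pads only)."""
    n, c, h, w = x.shape
    k_h, k_w = kernel.shape

    # 1) Upsample by zero insertion: each sample is followed by (up - 1) zeros.
    z = x.new_zeros(n, c, h * up_y, w * up_x)
    z[:, :, ::up_y, ::up_x] = x

    # 2) Zero-pad; F.pad takes (left, right, top, bottom) for the last two dims.
    z = F.pad(z, (pad_x0, pad_x1, pad_y0, pad_y1))

    # 3) FIR filter. conv2d computes cross-correlation, so the kernel is
    #    flipped to obtain a true convolution. groups=c applies the same
    #    2D kernel to every channel independently.
    weight = torch.flip(kernel, [0, 1]).to(x.dtype)
    weight = weight.view(1, 1, k_h, k_w).repeat(c, 1, 1, 1)
    out = F.conv2d(z, weight, groups=c)

    # 4) Downsample by keeping every down-th sample along each axis.
    return out[:, :, ::down_y, ::down_x]


x = torch.randn(1, 3, 64, 64)
k = torch.tensor([[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [1.0, 2.0, 1.0]])
k = k / k.sum()  # normalize so the filter has unit DC gain
y = upfirdn2d_sketch(x, k, 2, 2, 1, 1, 1, 1, 1, 1)
print(y.shape)  # torch.Size([1, 3, 128, 128])
```

Using a grouped convolution here is a design choice of the sketch; reshaping the input so that channels fold into the batch dimension would give the same result.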

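Parameters (as accepted by the upfirdn2d wrapper; upfirdn2d_native takes the per-axis arguments shown in the signature above):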
- input (torch.Tensor) – The input tensor of shape (N, C, H, W) where N is the batch size, C is the number of channels, H is the height, and W is the width.
- kernel (torch.Tensor) – The FIR filter kernel of shape (kH, kW).
- up (int, optional) – Upsampling factor for both dimensions. Defaults to 1.
- down (int, optional) – Downsampling factor for both dimensions. Defaults to 1.
- pad (tuple, optional) – A tuple of two integers (pad_x, pad_y) for padding in the width and height dimensions. Defaults to (0, 0).
Returns: The output tensor after applying upsampling, padding, FIR filtering, and downsampling. The shape of the output is (N, C, out_h, out_w) where out_h and out_w are computed based on the input size and the operations performed.
Return type: torch.Tensor
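
As a rough guide to how out_h and out_w follow from the arguments, the helper below computes the output length along one axis. It is a hypothetical helper, not part of ESPnet, and it assumes the StyleGAN2 convention: zero-insertion upsampling, padding on both sides, an unpadded ("valid") FIR filter, and then keeping every down-th sample.

```python
def upfirdn_out_size(size, up, down, pad0, pad1, kernel_size):
    """Output length along one axis (hypothetical helper for illustration)."""
    # Length after upsampling and padding, minus the shrinkage caused by a
    # "valid" filter, then strided by the downsampling factor.
    return (size * up + pad0 + pad1 - kernel_size) // down + 1


# 64 input samples, up=2, one sample of padding per side, 3-tap kernel
print(upfirdn_out_size(64, up=2, down=1, pad0=1, pad1=1, kernel_size=3))  # 128
```

Apply it once with the height arguments (H, up_y, down_y, pad_y0, pad_y1, kH) and once with the width arguments to get both output dimensions.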

Examples

>>> import torch
>>> from espnet2.enh.layers.ncsnpp_utils.upfirdn2d import upfirdn2d
>>> input_tensor = torch.randn(1, 3, 64, 64)  # Example input: (N, C, H, W)
>>> # Float kernel so that its dtype matches the input
>>> kernel = torch.tensor([[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [1.0, 2.0, 1.0]])
>>> output = upfirdn2d(input_tensor, kernel, up=2, down=1, pad=(1, 1))
>>> print(output.shape)  # Expected output shape: (1, 3, 128, 128)

NOTE

The kernel should be a 2D tensor and it is expected to have odd dimensions for proper centering during convolution.
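
One way to read the note on odd kernel dimensions: when up = down = 1 and the filter is applied without implicit padding (an assumption about the implementation), keeping the output the same size as the input needs kernel_size - 1 samples of padding in total, and that total splits evenly only for odd kernel sizes. The helper centered_padding below is hypothetical and only illustrates this arithmetic.

```python
def centered_padding(kernel_size):
    """Return (pad0, pad1) that preserves size when up = down = 1.

    Hypothetical helper for illustration; assumes a "valid" FIR filter.
    """
    total = kernel_size - 1  # padding needed to undo the filter's shrinkage
    pad0 = total // 2
    pad1 = total - pad0
    return pad0, pad1


for k in (3, 4, 5):
    pad0, pad1 = centered_padding(k)
    print(k, (pad0, pad1), "symmetric" if pad0 == pad1 else "asymmetric")
```

For an even kernel size the two sides differ by one sample, so the filtered output is shifted by half a sample relative to the input grid.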

Raises:
- ValueError – If the input tensor has fewer than 3 dimensions or if the kernel has an invalid shape.