espnet2.gan_codec.shared.decoder.seanet_2d.SConvTranspose2d


class espnet2.gan_codec.shared.decoder.seanet_2d.SConvTranspose2d(in_channels: int, out_channels: int, kernel_size: int | Tuple[int, int], stride: int | Tuple[int, int] = 1, causal: bool = False, norm: str = 'none', trim_right_ratio: float = 1.0, norm_kwargs: Dict[str, Any] = {}, out_padding: int | List[Tuple[int, int]] = 0, groups: int = 1)

Bases: Module

ConvTranspose2d with built-in handling of asymmetric or causal padding and normalization.

NOTE

Causal padding only makes sense on the time (last) axis. The frequency (second-to-last) axis is always padded non-causally.

convtr

The convolutional transpose layer with normalization.

Type: NormConvTranspose2d

out_padding

Padding to be added to the output.

Type: List[Tuple[int, int]]

causal

Whether to use causal padding.

Type: bool

trim_right_ratio

Ratio for trimming the output on the right side.

Type: float

Parameters:
- in_channels (int) – Number of input channels.
- out_channels (int) – Number of output channels.
- kernel_size (Union[int, Tuple[int, int]]) – Size of the convolving kernel.
- stride (Union[int, Tuple[int, int]], optional) – Stride of the convolution. Defaults to 1.
- causal (bool , optional) – If True, use causal padding. Defaults to False.
- norm (str , optional) – Type of normalization to apply. Defaults to “none”.
- trim_right_ratio (float , optional) – Ratio for trimming at the right of the transposed convolution under the causal setup. Defaults to 1.0.
- norm_kwargs (Dict[str, Any], optional) – Additional arguments for normalization. Defaults to {}.
- out_padding (Union[int, List[Tuple[int, int]]], optional) – Padding added to the output. Defaults to 0.
- groups (int , optional) – Number of blocked connections from input channels to output channels. Defaults to 1.
Raises:
- AssertionError – If trim_right_ratio is not 1.0 and causal is False.
- AssertionError – If trim_right_ratio is not between 0.0 and 1.0.
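The two constraints above can be sketched in plain Python. This is a minimal illustration, not the library's code: `check_trim_right_ratio` and `is_valid` are hypothetical helpers, and treating the range as inclusive at both ends is an assumption (the default of 1.0 implies at least the upper bound is allowed).

```python
def check_trim_right_ratio(causal: bool, trim_right_ratio: float) -> None:
    """Hypothetical helper mirroring the two documented assertions."""
    # A ratio other than 1.0 is only meaningful together with causal=True.
    assert causal or trim_right_ratio == 1.0, (
        "trim_right_ratio != 1.0 requires causal=True"
    )
    # Assumption: the valid range is the closed interval [0.0, 1.0].
    assert 0.0 <= trim_right_ratio <= 1.0, (
        "trim_right_ratio must be between 0.0 and 1.0"
    )


def is_valid(causal: bool, trim_right_ratio: float) -> bool:
    """Return True if the combination passes both checks."""
    try:
        check_trim_right_ratio(causal, trim_right_ratio)
    except AssertionError:
        return False
    return True


print(is_valid(True, 0.5))   # causal with a partial right trim
print(is_valid(False, 0.5))  # non-causal with a ratio other than 1.0
print(is_valid(True, 1.5))   # ratio outside the allowed range
```

Only the first combination passes. The second violates the causal requirement and the third falls outside the range.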

Examples

>>> import torch
>>> from espnet2.gan_codec.shared.decoder.seanet_2d import SConvTranspose2d
>>> layer = SConvTranspose2d(in_channels=16, out_channels=33,
...                           kernel_size=(3, 3), stride=(2, 2),
...                           causal=True, trim_right_ratio=0.5)
>>> input_tensor = torch.randn(1, 16, 10, 10)
>>> output_tensor = layer(input_tensor)
>>> output_tensor.shape
torch.Size([1, 33, 20, 20])

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Applies the transposed convolution and normalization to the input tensor.

The input is passed through convtr (a NormConvTranspose2d), which applies the transposed convolution followed by the normalization selected through norm at initialization.

Parameters: x (torch.Tensor) – Input tensor of shape (N, C, H, W), where N is the batch size, C is the number of input channels, H is the height, and W is the width.
Returns: The output tensor after applying the transposed convolution and normalization. It has out_channels channels, and its spatial size depends on kernel_size, stride, and out_padding; it equals the input's (H, W) only for suitable settings.
Return type: torch.Tensor

Examples

>>> import torch
>>> from espnet2.gan_codec.shared.decoder.seanet_2d import SConvTranspose2d
>>> model = SConvTranspose2d(in_channels=1, out_channels=1, kernel_size=3)
>>> input_tensor = torch.randn(1, 1, 64, 64)  # Example input
>>> output_tensor = model(input_tensor)
>>> print(output_tensor.shape)  # Output shape will be (1, 1, 64, 64)
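The shapes in the examples on this page are consistent with a layer that runs an unpadded transposed convolution and then trims `kernel_size - stride` samples per axis. The sketch below works through that arithmetic in plain Python. It is an assumption about the behavior, not the library's code: `Trim` and `trimmed_length` are hypothetical names, the left/right split follows the convention of EnCodec-style streaming layers, out_padding is taken as 0, and `kernel_size >= stride` is assumed.

```python
import math
from typing import NamedTuple


class Trim(NamedTuple):
    left: int        # samples removed at the start of the axis
    right: int       # samples removed at the end of the axis
    out_length: int  # length remaining after trimming


def trimmed_length(length: int, kernel_size: int, stride: int,
                   causal: bool = False,
                   trim_right_ratio: float = 1.0) -> Trim:
    """Assumed output length along one axis (out_padding taken as 0)."""
    # Length of an unpadded transposed convolution with dilation 1.
    full = (length - 1) * stride + kernel_size
    # Assumption: the layer trims kernel_size - stride samples in total.
    padding_total = kernel_size - stride
    if causal:
        # Assumption: trim_right_ratio sets the share removed on the right.
        right = math.ceil(padding_total * trim_right_ratio)
    else:
        # Non-causal: split the trim as evenly as possible.
        right = padding_total // 2
    left = padding_total - right
    return Trim(left, right, full - padding_total)


# Class example: kernel 3, stride 2, input 10 x 10.
# The frequency axis is always non-causal; the time axis follows `causal`.
freq_axis = trimmed_length(10, kernel_size=3, stride=2)
time_axis = trimmed_length(10, kernel_size=3, stride=2,
                           causal=True, trim_right_ratio=0.5)
print(freq_axis.out_length, time_axis.out_length)

# forward example: kernel 3, default stride 1, input 64 x 64.
print(trimmed_length(64, kernel_size=3, stride=1).out_length)
```

Under these assumptions the output length is always `length * stride`. The causal flag and trim_right_ratio only change which side the trimmed samples come from, not how many are removed.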

NOTE

Ensure that the input tensor x has shape (N, C, H, W), with C equal to in_channels. The normalization method and its parameters are set at initialization through norm and norm_kwargs.