espnet2.gan_codec.shared.decoder.seanet.SConvTranspose1d


class espnet2.gan_codec.shared.decoder.seanet.SConvTranspose1d(in_channels: int, out_channels: int, kernel_size: int, stride: int = 1, causal: bool = False, norm: str = 'none', trim_right_ratio: float = 1.0, norm_kwargs: Dict[str, Any] = {})

Bases: Module

SConvTranspose1d is a 1D transposed convolution layer with built-in handling of asymmetric or causal padding and normalization. It provides a consistent interface across normalization methods while hiding the padding bookkeeping that transposed convolutions require.

convtr

The wrapped transposed convolution layer with normalization applied.

Type: NormConvTranspose1d

causal

Indicates if the convolution is causal.

Type: bool

trim_right_ratio

Ratio for trimming the output on the right side.

Type: float

Parameters:
- in_channels (int) – Number of input channels.
- out_channels (int) – Number of output channels.
- kernel_size (int) – Size of the convolution kernel.
- stride (int, optional) – Stride of the convolution. Defaults to 1.
- causal (bool, optional) – If True, use causal convolution. Defaults to False.
- norm (str, optional) – Normalization method. Defaults to 'none'.
- trim_right_ratio (float, optional) – Ratio for trimming at the right of the transposed convolution. Defaults to 1.0.
- norm_kwargs (Dict[str, Any], optional) – Additional parameters for the normalization module. Defaults to an empty dictionary.
Raises:
- AssertionError – If trim_right_ratio is not in the range [0.0, 1.0], or if trim_right_ratio is not 1.0 while causal is False.
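
The checks named under Raises can be pictured as plain assertions. This is a minimal sketch, not the library's code: `check_trim_right_ratio` is a hypothetical helper, and the second condition (a ratio other than 1.0 is accepted only when causal is True) is an assumption that this page does not spell out.

```python
def check_trim_right_ratio(causal: bool, trim_right_ratio: float) -> None:
    """Hypothetical stand-in for the constructor's argument checks."""
    # Assumption: a ratio other than 1.0 is only meaningful for causal layers.
    assert causal or trim_right_ratio == 1.0, (
        "trim_right_ratio != 1.0 is only accepted when causal=True"
    )
    # Stated on this page: the ratio must lie in [0.0, 1.0].
    assert 0.0 <= trim_right_ratio <= 1.0, "trim_right_ratio must be in [0.0, 1.0]"


# Valid combinations pass silently.
check_trim_right_ratio(causal=True, trim_right_ratio=0.5)
check_trim_right_ratio(causal=False, trim_right_ratio=1.0)
```

Passing `causal=True, trim_right_ratio=1.5` to this sketch fails the range check, and `causal=False, trim_right_ratio=0.5` fails the assumed causal check.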

####### Examples

>>> import torch
>>> from espnet2.gan_codec.shared.decoder.seanet import SConvTranspose1d
>>> layer = SConvTranspose1d(in_channels=16, out_channels=32,
...                          kernel_size=3, stride=2, causal=True)
>>> input_tensor = torch.randn(10, 16, 50)  # Batch size 10, 16 channels, length 50
>>> output_tensor = layer(input_tensor)
>>> print(output_tensor.shape)  # Output shape will depend on kernel size and stride

NOTE

trim_right_ratio must be a value between 0.0 and 1.0. When causal is True, the output tensor is trimmed on the right side according to this ratio.
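
To make the ratio concrete, the sketch below splits the padding to be removed into a left and a right share. It rests on assumptions not stated on this page: that the fixed padding to remove is kernel_size - stride, that the causal branch rounds the right-hand share up, and that the non-causal branch splits the padding roughly in half. `trim_amounts` is a hypothetical helper, not an ESPnet function.

```python
import math
from typing import Tuple


def trim_amounts(
    kernel_size: int,
    stride: int,
    causal: bool,
    trim_right_ratio: float = 1.0,
) -> Tuple[int, int]:
    """Return (left, right) sample counts to trim from the raw output."""
    # Assumption: the surplus produced by a transposed convolution
    # is kernel_size - stride samples.
    padding_total = kernel_size - stride
    if causal:
        # The given ratio of the surplus is removed on the right.
        right = math.ceil(padding_total * trim_right_ratio)
    else:
        # Assumption: non-causal layers split the surplus across both sides.
        right = padding_total // 2
    left = padding_total - right
    return left, right


print(trim_amounts(3, 2, causal=True))                        # (0, 1)
print(trim_amounts(7, 2, causal=True, trim_right_ratio=0.5))  # (2, 3)
print(trim_amounts(7, 2, causal=False))                       # (3, 2)
```

With trim_right_ratio=1.0 the whole surplus comes off the right side. Smaller ratios shift part of the trimming to the left.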

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Forward pass for SConvTranspose1d.

This method applies the wrapped transposed convolution (convtr) to the input tensor x, then trims padding from the output according to causal and trim_right_ratio.

Parameters: x (torch.Tensor) – The input tensor with shape (batch_size, in_channels, length).
Returns: The output tensor with shape (batch_size, out_channels, output_length).
Return type: torch.Tensor

####### Examples

>>> import torch
>>> from espnet2.gan_codec.shared.decoder.seanet import SConvTranspose1d
>>> layer = SConvTranspose1d(in_channels=32, out_channels=1,
...                          kernel_size=4, stride=2)
>>> input_tensor = torch.randn(16, 32, 100)  # Example input
>>> output_tensor = layer(input_tensor)
>>> print(output_tensor.shape)
torch.Size([16, 1, <length>])  # Output length will depend on the
                               # kernel size and stride of the layer.

NOTE

The output length depends on the kernel_size and stride chosen at initialization and on the padding trimmed from the output.

Raises: ValueError – If the input tensor x does not have the expected shape or type.
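
To see how trimming affects the output length, here is a dependency-free sketch of a transposed convolution followed by trimming, for a single channel. It is an illustration under assumptions, not the ESPnet implementation: `conv_transpose1d` and `forward_sketch` are hypothetical names, the trimmed amount is assumed to be kernel_size - stride, and normalization is left out.

```python
import math
from typing import List


def conv_transpose1d(x: List[float], kernel: List[float], stride: int) -> List[float]:
    """Naive single-channel 1D transposed convolution without padding."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, x_i in enumerate(x):
        for k, w_k in enumerate(kernel):
            out[i * stride + k] += x_i * w_k
    return out


def forward_sketch(
    x: List[float],
    kernel: List[float],
    stride: int,
    causal: bool = False,
    trim_right_ratio: float = 1.0,
) -> List[float]:
    y = conv_transpose1d(x, kernel, stride)
    # Assumption: kernel_size - stride surplus samples are removed.
    padding_total = len(kernel) - stride
    if causal:
        right = math.ceil(padding_total * trim_right_ratio)
    else:
        right = padding_total // 2
    left = padding_total - right
    return y[left:len(y) - right]


print(forward_sketch([1.0, 2.0], [1.0, 1.0, 1.0], stride=2, causal=True))
# [1.0, 1.0, 3.0, 2.0]
print(forward_sketch([1.0, 2.0], [1.0, 1.0, 1.0], stride=2, causal=False))
# [1.0, 3.0, 2.0, 2.0]
```

Under these assumptions the trimmed output has length len(x) * stride in both modes. The causal flag only changes which side the surplus samples are removed from.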