espnet2.gan_codec.shared.decoder.seanet.SConvTranspose1d
class espnet2.gan_codec.shared.decoder.seanet.SConvTranspose1d(in_channels: int, out_channels: int, kernel_size: int, stride: int = 1, causal: bool = False, norm: str = 'none', trim_right_ratio: float = 1.0, norm_kwargs: Dict[str, Any] = {})
Bases: Module
SConvTranspose1d is a 1D transposed convolution layer with built-in handling of asymmetric or causal padding and of normalization. It provides a consistent interface across normalization methods while managing the padding details of transposed convolutions.
convtr
The wrapped transposed convolution layer with normalization applied.
- Type: NormConvTranspose1d
causal
Indicates if the convolution is causal.
- Type: bool
trim_right_ratio
Ratio for trimming the output on the right side.
- Type: float
Parameters:
- in_channels (int) – Number of input channels.
- out_channels (int) – Number of output channels.
- kernel_size (int) – Size of the convolution kernel.
- stride (int, optional) – Stride of the convolution. Defaults to 1.
- causal (bool, optional) – If True, use causal convolution. Defaults to False.
- norm (str, optional) – Normalization method. Defaults to “none”.
- trim_right_ratio (float, optional) – Ratio for trimming at the right of the transposed convolution. Defaults to 1.0.
- norm_kwargs (Dict[str, Any], optional) – Additional parameters for the normalization module. Defaults to an empty dictionary.
Raises:
- AssertionError – If trim_right_ratio is not in the range [0.0, 1.0] or if trim_right_ratio differs from 1.0 while causal is False.
####### Examples
>>> import torch
>>> from espnet2.gan_codec.shared.decoder.seanet import SConvTranspose1d
>>> layer = SConvTranspose1d(in_channels=16, out_channels=32,
...                          kernel_size=3, stride=2, causal=True)
>>> input_tensor = torch.randn(10, 16, 50)  # Batch size 10, 16 channels, length 50
>>> output_tensor = layer(input_tensor)
>>> print(output_tensor.shape)  # Output length depends on kernel size and stride
NOTE
The trim_right_ratio should be set to a value between 0.0 and 1.0. When causal is set to True, the trimming will be applied to the right side of the output tensor according to the specified ratio.
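To make the effect of trim_right_ratio concrete, here is a minimal sketch of the length arithmetic. It assumes the EnCodec-style convention, which this page does not confirm, that the layer trims kernel_size - stride samples in total and assigns ceil(total * trim_right_ratio) of them to the right side. The helpers split_trim and output_length are hypothetical names used for illustration only; they are not part of ESPnet.

```python
import math


def split_trim(kernel_size: int, stride: int, trim_right_ratio: float = 1.0):
    """Return (trim_left, trim_right) under the assumed trimming convention."""
    assert 0.0 <= trim_right_ratio <= 1.0
    # Assumption: the total number of trimmed samples is kernel_size - stride.
    padding_total = kernel_size - stride
    trim_right = math.ceil(padding_total * trim_right_ratio)
    trim_left = padding_total - trim_right
    return trim_left, trim_right


def output_length(length: int, kernel_size: int, stride: int,
                  trim_right_ratio: float = 1.0) -> int:
    """Length after an unpadded transposed convolution followed by trimming."""
    # Untrimmed ConvTranspose1d length with padding=0, dilation=1, output_padding=0.
    raw = (length - 1) * stride + kernel_size
    trim_left, trim_right = split_trim(kernel_size, stride, trim_right_ratio)
    return raw - trim_left - trim_right


if __name__ == "__main__":
    for ratio in (0.0, 0.5, 1.0):
        left, right = split_trim(kernel_size=8, stride=4, trim_right_ratio=ratio)
        print(f"ratio={ratio}: trim_left={left}, trim_right={right}")
    print(output_length(50, kernel_size=3, stride=2))
```

Under this assumption the trimmed output length is always length * stride; trim_right_ratio only changes how the trimmed samples are divided between the two ends.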
forward(x)
Forward pass for SConvTranspose1d.
This method passes the input tensor x through the wrapped transposed convolution (convtr) and trims the padding from the result. When causal is True, the share of the trimming applied on the right side is controlled by trim_right_ratio.
- Parameters: x (torch.Tensor) – The input tensor with shape (batch_size, in_channels, length).
- Returns: The output tensor after the transposed convolution and padding trimming, with shape (batch_size, out_channels, output_length).
- Return type: torch.Tensor
####### Examples
>>> layer = SConvTranspose1d(in_channels=32, out_channels=1,
...                          kernel_size=4, stride=2)
>>> input_tensor = torch.randn(16, 32, 100)  # Example input
>>> output_tensor = layer(input_tensor)
>>> print(output_tensor.shape)
torch.Size([16, 1, <length>])  # Output length depends on the
                               # kernel size and stride of the layer.
NOTE
The output length depends on the kernel_size, stride, and padding trimming configured when the layer was initialized.
- Raises: ValueError – If the input tensor x does not have the expected shape or type.
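A rough sketch of what a causal forward pass of this kind might look like, using a plain torch.nn.ConvTranspose1d in place of the wrapped NormConvTranspose1d. The trimming rule (a total trim of kernel_size - stride samples, split by trim_right_ratio) is an assumption rather than something this page states, and causal_transposed_conv is a hypothetical name used only for illustration.

```python
import math

import torch
from torch import nn


def causal_transposed_conv(x: torch.Tensor, convtr: nn.ConvTranspose1d,
                           trim_right_ratio: float = 1.0) -> torch.Tensor:
    """Apply a transposed convolution, then trim the extra samples it produces."""
    kernel_size = convtr.kernel_size[0]
    stride = convtr.stride[0]
    # Assumption: the total number of samples to trim is kernel_size - stride.
    padding_total = kernel_size - stride

    y = convtr(x)

    # Split the trimming between the right and left ends of the time axis.
    padding_right = math.ceil(padding_total * trim_right_ratio)
    padding_left = padding_total - padding_right
    end = y.shape[-1] - padding_right  # explicit end index also handles padding_right == 0
    return y[..., padding_left:end]


if __name__ == "__main__":
    convtr = nn.ConvTranspose1d(16, 32, kernel_size=3, stride=2)
    x = torch.randn(10, 16, 50)  # (batch_size, in_channels, length)
    print(causal_transposed_conv(x, convtr).shape)
```

Slicing with an explicit end index rather than a negative offset keeps the function correct when nothing is trimmed on the right, for example with trim_right_ratio=0.0.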