espnet2.gan_tts.melgan.residual_stack.ResidualStack
class espnet2.gan_tts.melgan.residual_stack.ResidualStack(kernel_size: int = 3, channels: int = 32, dilation: int = 1, bias: bool = True, nonlinear_activation: str = 'LeakyReLU', nonlinear_activation_params: Dict[str, Any] = {'negative_slope': 0.2}, pad: str = 'ReflectionPad1d', pad_params: Dict[str, Any] = {})
Bases: Module
Residual stack module in MelGAN.
This code is modified from https://github.com/kan-bayashi/ParallelWaveGAN.
stack
A sequential container for the residual stack.
- Type: torch.nn.Sequential
skip_layer
A convolutional layer for skip connections.
- Type: torch.nn.Conv1d
Parameters:
- kernel_size (int) – Kernel size of the dilated convolution layer.
- channels (int) – Number of channels of the convolution layers.
- dilation (int) – Dilation factor.
- bias (bool) – Whether to add a bias parameter in the convolution layers.
- nonlinear_activation (str) – Activation function module name.
- nonlinear_activation_params (Dict[str, Any]) – Hyperparameters for the activation function.
- pad (str) – Padding function module name, applied before the dilated convolution layer.
- pad_params (Dict[str, Any]) – Hyperparameters for the padding function.
Returns: None
Examples
>>> import torch
>>> from espnet2.gan_tts.melgan.residual_stack import ResidualStack
>>> residual_stack = ResidualStack(kernel_size=3, channels=32)
>>> input_tensor = torch.randn(1, 32, 100)  # (B, channels, T)
>>> output_tensor = residual_stack(input_tensor)
>>> output_tensor.shape
torch.Size([1, 32, 100])
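The activation and padding modules are resolved by name, so they can be swapped out via the constructor arguments (a hedged sketch assuming any torch.nn activation and 1D padding module name is accepted, e.g. ReLU and ReplicationPad1d):
>>> residual_stack = ResidualStack(
...     kernel_size=5,
...     channels=64,
...     nonlinear_activation="ReLU",
...     nonlinear_activation_params={},
...     pad="ReplicationPad1d",
... )
>>> residual_stack(torch.randn(1, 64, 100)).shape
torch.Size([1, 64, 100])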
- Raises: AssertionError – If the kernel size is an even number.
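For instance, an even kernel size is rejected at construction time (a hedged illustration; the exact assertion message may differ):
>>> try:
...     ResidualStack(kernel_size=4, channels=32)  # even kernel size
... except AssertionError:
...     print("even kernel size rejected")
even kernel size rejected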
NOTE
The residual stack combines convolutional layers with skip connections to improve the learning capability of the model.
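A minimal sketch of the underlying computation, assuming the ParallelWaveGAN-style implementation referenced above, in which the forward pass returns the sum of the dilated stack and the 1x1 skip convolution applied to the same input:
>>> import torch
>>> from espnet2.gan_tts.melgan.residual_stack import ResidualStack
>>> module = ResidualStack(kernel_size=3, channels=32, dilation=2)
>>> c = torch.randn(4, 32, 256)  # (B, channels, T)
>>> out = module.stack(c) + module.skip_layer(c)  # residual sum
>>> torch.equal(out, module(c))
True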
Initialize ResidualStack module.
- Parameters:
- kernel_size (int) – Kernel size of the dilated convolution layer.
- channels (int) – Number of channels of the convolution layers.
- dilation (int) – Dilation factor.
- bias (bool) – Whether to add a bias parameter in the convolution layers.
- nonlinear_activation (str) – Activation function module name.
- nonlinear_activation_params (Dict[str, Any]) – Hyperparameters for the activation function.
- pad (str) – Padding function module name, applied before the dilated convolution layer.
- pad_params (Dict[str, Any]) – Hyperparameters for the padding function.
forward(c: Tensor) → Tensor
Calculate forward propagation.
- Parameters: c (Tensor) – Input tensor (B, channels, T).
- Returns: Output tensor (B, channels, T).
- Return type: Tensor
Examples
>>> import torch
>>> residual_stack = ResidualStack(kernel_size=3, channels=32)
>>> input_tensor = torch.randn(1, 32, 100)  # (B=1, channels=32, T=100)
>>> output_tensor = residual_stack(input_tensor)
>>> output_tensor.shape
torch.Size([1, 32, 100])
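In a typical MelGAN generator, several residual stacks are chained with exponentially increasing dilation to enlarge the receptive field. A hedged sketch of that pattern (the chaining itself happens in the generator, not in this module):
>>> import torch
>>> from espnet2.gan_tts.melgan.residual_stack import ResidualStack
>>> chained = torch.nn.Sequential(
...     *[ResidualStack(kernel_size=3, channels=32, dilation=3 ** j) for j in range(3)]
... )
>>> chained(torch.randn(1, 32, 100)).shape
torch.Size([1, 32, 100])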