espnet2.gan_tts.melgan.residual_stack.ResidualStack
class espnet2.gan_tts.melgan.residual_stack.ResidualStack(kernel_size: int = 3, channels: int = 32, dilation: int = 1, bias: bool = True, nonlinear_activation: str = 'LeakyReLU', nonlinear_activation_params: Dict[str, Any] = {'negative_slope': 0.2}, pad: str = 'ReflectionPad1d', pad_params: Dict[str, Any] = {})
Bases: Module
Residual stack module in MelGAN.
This code is modified from https://github.com/kan-bayashi/ParallelWaveGAN.
stack
A sequential container for the residual stack.
- Type: torch.nn.Sequential
skip_layer
A convolutional layer for skip connections.
- Type: torch.nn.Conv1d
Parameters:
- kernel_size (int) – Kernel size of the dilated convolution layer.
- channels (int) – Number of channels of the convolution layers.
- dilation (int) – Dilation factor.
- bias (bool) – Whether to add a bias parameter in the convolution layers.
- nonlinear_activation (str) – Activation function module name.
- nonlinear_activation_params (Dict[str, Any]) – Hyperparameters for the activation function.
- pad (str) – Padding function module name, applied before the dilated convolution layer.
- pad_params (Dict[str, Any]) – Hyperparameters for the padding function.
Returns: None
Examples
>>> import torch
>>> from espnet2.gan_tts.melgan.residual_stack import ResidualStack
>>> residual_stack = ResidualStack(kernel_size=3, channels=32)
>>> input_tensor = torch.randn(1, 32, 100)  # (B, channels, T)
>>> output_tensor = residual_stack(input_tensor)
>>> output_tensor.shape
torch.Size([1, 32, 100])
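The activation and padding modules are resolved by name, so they can be swapped out via the constructor arguments (a hedged sketch assuming any torch.nn activation and 1D padding module name is accepted, e.g. ReLU and ReplicationPad1d):
>>> residual_stack = ResidualStack(
...     kernel_size=5,
...     channels=64,
...     nonlinear_activation="ReLU",
...     nonlinear_activation_params={},
...     pad="ReplicationPad1d",
... )
>>> residual_stack(torch.randn(1, 64, 100)).shape
torch.Size([1, 64, 100])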
- Raises: AssertionError – If the kernel size is an even number.
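For instance, an even kernel size is rejected at construction time (a hedged illustration; the exact assertion message may differ):
>>> try:
...     ResidualStack(kernel_size=4, channels=32)  # even kernel size
... except AssertionError:
...     print("even kernel size rejected")
even kernel size rejected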
NOTE
The residual stack combines convolutional layers with skip connections to improve the learning capability of the model.
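A minimal sketch of the underlying computation, assuming the ParallelWaveGAN-style implementation referenced above, in which the forward pass returns the sum of the dilated stack and the 1x1 skip convolution applied to the same input:
>>> import torch
>>> from espnet2.gan_tts.melgan.residual_stack import ResidualStack
>>> module = ResidualStack(kernel_size=3, channels=32, dilation=2)
>>> c = torch.randn(4, 32, 256)  # (B, channels, T)
>>> out = module.stack(c) + module.skip_layer(c)  # residual sum
>>> torch.equal(out, module(c))
True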
Initialize ResidualStack module.
- Parameters:
- kernel_size (int) – Kernel size of the dilated convolution layer.
- channels (int) – Number of channels of the convolution layers.
- dilation (int) – Dilation factor.
- bias (bool) – Whether to add a bias parameter in the convolution layers.
- nonlinear_activation (str) – Activation function module name.
- nonlinear_activation_params (Dict[str, Any]) – Hyperparameters for the activation function.
- pad (str) – Padding function module name, applied before the dilated convolution layer.
- pad_params (Dict[str, Any]) – Hyperparameters for the padding function.
forward(c: Tensor) → Tensor
Calculate forward propagation.
- Parameters: c (Tensor) – Input tensor (B, channels, T).
- Returns: Output tensor (B, channels, T).
- Return type: Tensor
Examples
>>> import torch
>>> residual_stack = ResidualStack(kernel_size=3, channels=32)
>>> input_tensor = torch.randn(1, 32, 100)  # (B=1, channels=32, T=100)
>>> output_tensor = residual_stack(input_tensor)
>>> output_tensor.shape
torch.Size([1, 32, 100])
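In a typical MelGAN generator, several residual stacks are chained with exponentially increasing dilation to enlarge the receptive field. A hedged sketch of that pattern (the chaining itself happens in the generator, not in this module):
>>> import torch
>>> from espnet2.gan_tts.melgan.residual_stack import ResidualStack
>>> chained = torch.nn.Sequential(
...     *[ResidualStack(kernel_size=3, channels=32, dilation=3 ** j) for j in range(3)]
... )
>>> chained(torch.randn(1, 32, 100)).shape
torch.Size([1, 32, 100])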