espnet2.gan_tts.style_melgan.tade_res_block.TADEResBlock
class espnet2.gan_tts.style_melgan.tade_res_block.TADEResBlock(in_channels: int = 64, aux_channels: int = 80, kernel_size: int = 9, dilation: int = 2, bias: bool = True, upsample_factor: int = 2, upsample_mode: str = 'nearest', gated_function: str = 'softmax')
Bases: Module
TADEResBlock module for the StyleMelGAN architecture.
This module implements a residual block built from two TADE (temporal adaptive denormalization) layers, each followed by a gated convolution. It takes an input tensor and an auxiliary (conditioning) tensor, and returns the upsampled output together with the upsampled auxiliary tensor. The design is adapted from the original ParallelWaveGAN code.
tade1
The first TADE layer in the residual block.
- Type: TADELayer
gated_conv1
The first gated convolution layer.
- Type: Conv1d
tade2
The second TADE layer in the residual block.
- Type: TADELayer
gated_conv2
The second gated convolution layer.
- Type: Conv1d
upsample
Upsampling layer for the output tensor.
- Type: Upsample
gated_function
The gating function applied in the block.
- Type: Callable
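The gating applied through `gated_function` can be sketched as follows. This is a minimal illustration rather than the ESPnet implementation: the helper names `resolve_gate` and `gated_activation` are hypothetical, and the sketch assumes that each gated convolution outputs twice the block's channel count, which is then split into a gate half and a tanh half.

```python
from functools import partial

import torch


def resolve_gate(name: str):
    """Map a gate name to a callable (hypothetical helper)."""
    if name == "softmax":
        # Assumption: softmax is taken over the channel dimension (dim=1).
        return partial(torch.softmax, dim=1)
    if name == "sigmoid":
        return torch.sigmoid
    raise ValueError(f"{name} is not supported.")


def gated_activation(x: torch.Tensor, gate) -> torch.Tensor:
    """Split the channels in half and combine them as gate(xa) * tanh(xb)."""
    xa, xb = x.split(x.size(1) // 2, dim=1)
    return gate(xa) * torch.tanh(xb)


x = torch.randn(1, 128, 100)  # (B, 2 * channels, T)
y = gated_activation(x, resolve_gate("softmax"))
print(y.shape)  # torch.Size([1, 64, 100])
```

Passing any other name to `resolve_gate` raises `ValueError`, which mirrors the error documented for unsupported `gated_function` values.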
Parameters:
- in_channels (int) – Number of input channels. Default is 64.
- aux_channels (int) – Number of auxiliary channels. Default is 80.
- kernel_size (int) – Size of the convolutional kernel. Default is 9.
- dilation (int) – Dilation rate for the second gated convolution. Default is 2.
- bias (bool) – Whether to use a bias parameter in convolutions. Default is True.
- upsample_factor (int) – Factor by which to upsample the output. Default is 2.
- upsample_mode (str) – Mode of upsampling (e.g., ‘nearest’). Default is ‘nearest’.
- gated_function (str) – Type of gated function (‘softmax’ or ‘sigmoid’). Default is ‘softmax’.
Raises: ValueError – If an unsupported gated_function type is provided.
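To see how `kernel_size` and `dilation` interact, the sketch below applies the standard 1-D convolution output-length formula. The padding rule `(kernel_size - 1) // 2 * dilation` is an assumption made for illustration (a common choice that keeps the time length unchanged for odd kernel sizes), and `conv1d_out_length` is a hypothetical helper, not part of ESPnet.

```python
def conv1d_out_length(length, kernel_size, dilation=1, padding=0, stride=1):
    """Output length of a 1-D convolution (standard formula)."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


kernel_size, dilation = 9, 2  # defaults from the signature above
# Assumption: padding is chosen as (kernel_size - 1) // 2 * dilation.
padding = (kernel_size - 1) // 2 * dilation

print(conv1d_out_length(100, kernel_size, dilation, padding))  # 100
```

Without padding, the same dilated kernel would shorten a 100-frame input to 84 frames, which is why the padding has to scale with the dilation.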
Examples
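>>> import torch
>>> from espnet2.gan_tts.style_melgan.tade_res_block import TADEResBlock
>>> res_block = TADEResBlock(in_channels=64, aux_channels=80)
>>> x = torch.randn(1, 64, 100)  # Input tensor (B, in_channels, T)
>>> c = torch.randn(1, 80, 100)  # Auxiliary tensor, same time length as x
>>> output, aux = res_block(x, c)
>>> print(output.shape)  # Output shape will be (1, 64, 200)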
Initialize TADEResBlock module.
- Parameters:
- in_channels (int) – Number of input channels.
- aux_channels (int) – Number of auxiliary channels.
- kernel_size (int) – Kernel size.
- dilation (int) – Dilation rate.
- bias (bool) – Whether to use bias parameter in conv.
- upsample_factor (int) – Upsample factor.
- upsample_mode (str) – Upsample mode.
- gated_function (str) – Gated function type (softmax or sigmoid).
forward(x: Tensor, c: Tensor) → Tensor
Calculate forward propagation.
- Parameters:
- x (Tensor) – Input tensor (B, in_channels, T).
- c (Tensor) – Auxiliary input tensor (B, aux_channels, T’).
- Returns: Output tensor (B, in_channels, T * upsample_factor). Tensor: Upsampled auxiliary tensor (B, in_channels, T * upsample_factor).
- Return type: Tensor
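The `T * upsample_factor` output length can be checked in isolation with `torch.nn.Upsample`. This is a sketch of the upsampling step only, assuming the block's `upsample` attribute is configured with the default `upsample_factor=2` and `upsample_mode='nearest'`. It does not reproduce the TADE layers or the gated convolutions.

```python
import torch

upsample_factor = 2
upsample = torch.nn.Upsample(scale_factor=upsample_factor, mode="nearest")

residual = torch.randn(1, 64, 100)  # (B, in_channels, T)
up = upsample(residual)
print(up.shape)  # torch.Size([1, 64, 200])
```

With nearest-neighbor mode and a factor of 2, every input frame is simply repeated twice along the time axis.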