espnet2.gan_tts.vits.flow.DilatedDepthSeparableConv

class espnet2.gan_tts.vits.flow.DilatedDepthSeparableConv(channels: int, kernel_size: int, layers: int, dropout_rate: float = 0.0, eps: float = 1e-05)

Bases: Module

Dilated depth-separable convolution module.

This module stacks `layers` dilated depth-separable convolution blocks. Each block applies a dilated depthwise convolution and a pointwise (1x1) convolution, with layer normalization and an activation after each convolution, followed by dropout. It operates on 1D sequence tensors of shape (B, channels, T) and preserves that shape.
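One layer of a dilated depth-separable convolution can be sketched as below. This is an illustrative sketch, not ESPnet's implementation: the class name `SketchDSCLayer`, the choice of GELU, and the exact ordering of normalization and activation are assumptions made for the example.

```python
import torch
import torch.nn as nn


class SketchDSCLayer(nn.Module):
    """Hypothetical single layer: dilated depthwise conv + 1x1 pointwise conv."""

    def __init__(self, channels: int, kernel_size: int, dilation: int, eps: float = 1e-5):
        super().__init__()
        # "Same" padding for odd kernel sizes, so the time length T is preserved.
        padding = (kernel_size - 1) * dilation // 2
        # groups=channels makes the convolution depthwise: one filter per channel.
        self.depthwise = nn.Conv1d(
            channels, channels, kernel_size,
            groups=channels, dilation=dilation, padding=padding,
        )
        self.norm1 = nn.LayerNorm(channels, eps=eps)
        # The 1x1 (pointwise) convolution mixes information across channels.
        self.pointwise = nn.Conv1d(channels, channels, 1)
        self.norm2 = nn.LayerNorm(channels, eps=eps)
        self.act = nn.GELU()  # assumed activation

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # LayerNorm normalizes the last dimension, so move channels last and back.
        y = self.depthwise(x)
        y = self.act(self.norm1(y.transpose(1, 2)).transpose(1, 2))
        y = self.pointwise(y)
        y = self.act(self.norm2(y.transpose(1, 2)).transpose(1, 2))
        return y


layer = SketchDSCLayer(channels=64, kernel_size=3, dilation=3)
out = layer(torch.randn(2, 64, 100))  # (B, channels, T) in, same shape out
```

Splitting the convolution this way keeps the parameter count low: the depthwise kernel here has shape (64, 1, 3) instead of (64, 64, 3), and channel mixing is left to the 1x1 convolution.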

convs

A list of sequential convolutional layers.

Type: ModuleList
Parameters:
- channels (int) – Number of channels.
- kernel_size (int) – Size of the convolution kernel.
- layers (int) – Number of convolutional layers to stack.
- dropout_rate (float) – Rate of dropout to apply after each layer.
- eps (float) – Small constant for numerical stability in layer normalization.
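The interaction between `kernel_size` and `layers` can be explored with a few lines of arithmetic. The parameter list above does not state how dilation varies across layers; the sketch below assumes it grows as `kernel_size ** i` for layer index `i`, a common schedule for stacked dilated convolutions. The helper name `dilation_schedule` is hypothetical.

```python
def dilation_schedule(kernel_size: int, layers: int):
    """Return (layer, dilation, padding, receptive_field) per layer.

    Assumes dilation = kernel_size ** layer_index, which is an assumption
    of this sketch and is not stated on this page.
    """
    rows = []
    receptive_field = 1
    for i in range(layers):
        dilation = kernel_size ** i
        # "Same" padding for an odd kernel size keeps T unchanged.
        padding = (kernel_size - 1) * dilation // 2
        # Each layer widens the receptive field by (kernel_size - 1) * dilation.
        receptive_field += (kernel_size - 1) * dilation
        rows.append((i, dilation, padding, receptive_field))
    return rows


schedule = dilation_schedule(kernel_size=3, layers=3)
for layer, dilation, padding, rf in schedule:
    print(f"layer={layer} dilation={dilation} padding={padding} receptive_field={rf}")
```

Under this assumption, three layers with `kernel_size=3` already cover 27 time steps, which is why a small `layers` value can still capture long-range context.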

Examples

>>> import torch
>>> from espnet2.gan_tts.vits.flow import DilatedDepthSeparableConv
>>> dsc = DilatedDepthSeparableConv(channels=64, kernel_size=3, layers=2)
>>> x = torch.randn(10, 64, 100)  # (B, channels, T)
>>> x_mask = torch.ones(10, 1, 100)  # (B, 1, T); all ones means no padding
>>> output = dsc(x, x_mask)
>>> print(output.shape)  # torch.Size([10, 64, 100])

Returns: Output tensor with the same shape as input (B, channels, T).
Return type: Tensor

Initialize DilatedDepthSeparableConv module.

Parameters:
- channels (int) – Number of channels.
- kernel_size (int) – Kernel size.
- layers (int) – Number of layers.
- dropout_rate (float) – Dropout rate.
- eps (float) – Epsilon for layer norm.

forward(x: Tensor, x_mask: Tensor, g: Tensor | None = None) → Tensor

Calculate forward propagation.

Parameters:
- x (Tensor) – Input tensor (B, channels, T).
- x_mask (Tensor) – Mask tensor (B, 1, T).
- g (Optional[Tensor]) – Global conditioning tensor (B, global_channels, 1).
Returns: Output tensor (B, channels, T).
Return type: Tensor
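The roles of `x_mask` and `g` can be illustrated with a small stand-alone sketch. It assumes that `g` is added to `x` by broadcasting over time (which requires `global_channels` to equal `channels`), that each layer is applied residually to the masked input, and that the output is masked again. These are assumptions of the example, and `sketch_forward` and the plain depthwise layers used here are hypothetical stand-ins, not the module's own layers.

```python
import torch
import torch.nn as nn


def sketch_forward(convs, x, x_mask, g=None):
    """Hypothetical masked, residual forward pass over a stack of layers."""
    if g is not None:
        # (B, channels, 1) broadcasts over T; assumes g has `channels` channels.
        x = x + g
    for conv in convs:
        # Zero padded frames before each layer so padding does not leak in.
        y = conv(x * x_mask)
        x = x + y  # residual connection
    return x * x_mask  # padded frames are zero in the output


torch.manual_seed(0)
# Stand-in layers: plain depthwise convolutions that preserve T.
convs = nn.ModuleList(
    [nn.Conv1d(8, 8, kernel_size=3, padding=1, groups=8) for _ in range(2)]
)

x = torch.randn(2, 8, 10)  # (B, channels, T)
lengths = torch.tensor([10, 6])  # valid frames per batch item
# Build a (B, 1, T) mask: 1.0 for valid frames, 0.0 for padding.
x_mask = (torch.arange(10)[None, :] < lengths[:, None]).unsqueeze(1).float()
g = torch.randn(2, 8, 1)  # global conditioning, broadcast over T

out = sketch_forward(convs, x, x_mask, g)
```

Because the mask has a singleton channel dimension, it broadcasts across all channels, so a single (B, 1, T) tensor is enough to zero every padded frame.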