espnet2.gan_tts.parallel_wavegan.parallel_wavegan.ParallelWaveGANDiscriminator
class espnet2.gan_tts.parallel_wavegan.parallel_wavegan.ParallelWaveGANDiscriminator(in_channels: int = 1, out_channels: int = 1, kernel_size: int = 3, layers: int = 10, conv_channels: int = 64, dilation_factor: int = 1, nonlinear_activation: str = 'LeakyReLU', nonlinear_activation_params: Dict[str, Any] = {'negative_slope': 0.2}, bias: bool = True, use_weight_norm: bool = True)
Bases: Module
Parallel WaveGAN Discriminator module.
This class implements the Discriminator for the Parallel WaveGAN model, which is used for distinguishing real audio samples from generated ones. It utilizes a series of convolutional layers with optional weight normalization and nonlinear activation functions to process the input audio signals.
- Parameters:
- in_channels (int) – Number of input channels. Default is 1.
- out_channels (int) – Number of output channels. Default is 1.
- kernel_size (int) – Kernel size for convolutional layers. Default is 3.
- layers (int) – Number of convolutional layers. Default is 10.
- conv_channels (int) – Number of channels in each convolutional layer. Default is 64.
- dilation_factor (int) – Dilation factor for convolutions. If set to 2, the dilation will be 2, 4, 8, and so on. Default is 1.
- nonlinear_activation (str) – Nonlinear activation function after each convolution. Default is “LeakyReLU”.
- nonlinear_activation_params (Dict[str, Any]) – Parameters for the nonlinear activation function. Default is {"negative_slope": 0.2}.
- bias (bool) – Whether to use bias parameter in convolution layers. Default is True.
- use_weight_norm (bool) – Whether to apply weight normalization to convolutional layers. Default is True.
- Raises: AssertionError – If kernel_size is even or dilation_factor is not greater than 0.
Example
>>> import torch
>>> from espnet2.gan_tts.parallel_wavegan.parallel_wavegan import (
...     ParallelWaveGANDiscriminator,
... )
>>> discriminator = ParallelWaveGANDiscriminator()
>>> input_tensor = torch.randn(1, 1, 100) # Batch size 1, 1 channel, 100 time steps
>>> output_tensor = discriminator(input_tensor)
>>> print(output_tensor.shape) # Should be (1, 1, 100)
NOTE: The discriminator is designed to work in conjunction with the ParallelWaveGANGenerator to train a GAN for audio synthesis.
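The layer and dilation arithmetic described above can be made concrete with a short sketch. The following is a minimal illustration of how such a stack of dilated Conv1d layers could be assembled; the helper name build_conv_stack is hypothetical, and the dilation schedule is an assumption inferred from the parameter descriptions, not necessarily the exact ESPnet implementation.

import torch
from typing import Any, Dict

def build_conv_stack(
    layers: int = 10,
    in_channels: int = 1,
    out_channels: int = 1,
    conv_channels: int = 64,
    kernel_size: int = 3,
    dilation_factor: int = 1,
    bias: bool = True,
    nonlinear_activation: str = "LeakyReLU",
    nonlinear_activation_params: Dict[str, Any] = {"negative_slope": 0.2},
) -> torch.nn.Sequential:
    """Hypothetical helper mirroring the discriminator's conv stack."""
    assert (kernel_size - 1) % 2 == 0, "kernel_size must be odd"
    assert dilation_factor > 0, "dilation_factor must be > 0"
    modules = []
    conv_in = in_channels
    for i in range(layers - 1):
        # Assumed schedule: dilation 1 for the first layer, then
        # dilation_factor ** i (2, 4, 8, ... when dilation_factor == 2).
        dilation = 1 if i == 0 else (i if dilation_factor == 1 else dilation_factor ** i)
        # This padding keeps the time dimension fixed: (B, 1, T) -> (B, 1, T).
        padding = (kernel_size - 1) // 2 * dilation
        modules += [
            torch.nn.Conv1d(conv_in, conv_channels, kernel_size,
                            padding=padding, dilation=dilation, bias=bias),
            getattr(torch.nn, nonlinear_activation)(**nonlinear_activation_params),
        ]
        conv_in = conv_channels
    # Final conv projects down to out_channels without an activation.
    modules += [torch.nn.Conv1d(conv_in, out_channels, kernel_size,
                                padding=(kernel_size - 1) // 2, bias=bias)]
    return torch.nn.Sequential(*modules)

# Usage: the time dimension is preserved end to end.
stack = build_conv_stack()
print(stack(torch.randn(1, 1, 100)).shape)  # torch.Size([1, 1, 100])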
Initialize ParallelWaveGANDiscriminator module.
- Parameters:
- in_channels (int) – Number of input channels.
- out_channels (int) – Number of output channels.
- kernel_size (int) – Kernel size of conv layers.
- layers (int) – Number of conv layers.
- conv_channels (int) – Number of channels in conv layers.
- dilation_factor (int) – Dilation factor. For example, if dilation_factor = 2, the dilation will be 2, 4, 8, and so on.
- nonlinear_activation (str) – Nonlinear function after each conv.
- nonlinear_activation_params (Dict[str, Any]) – Parameters of the nonlinear activation function.
- bias (bool) – Whether to use bias parameter in conv.
- use_weight_norm (bool) – If set to true, it will be applied to all of the conv layers.
apply_weight_norm()
Apply weight normalization to all of the conv layers.
This method applies weight normalization to all convolutional layers in the module. Weight normalization can help in stabilizing the training of deep neural networks and can lead to faster convergence.
The method uses PyTorch’s built-in weight_norm utility to apply normalization to Conv1d and Conv2d layers.
Example
>>> model = ParallelWaveGANDiscriminator()
>>> model.apply_weight_norm()
# Weight normalization is now applied to all convolutional layers.
NOTE: Weight normalization is applied during the initialization of the ParallelWaveGANDiscriminator class if the use_weight_norm argument is set to True.
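The recursive application mentioned above is conventionally written with Module.apply and torch.nn.utils.weight_norm. The following is a minimal sketch of that pattern (an assumption, not necessarily the exact ESPnet code):

import torch

def apply_weight_norm_to_convs(module: torch.nn.Module) -> None:
    """Wrap every Conv1d/Conv2d submodule of `module` with weight norm."""
    def _apply(m: torch.nn.Module) -> None:
        if isinstance(m, (torch.nn.Conv1d, torch.nn.Conv2d)):
            torch.nn.utils.weight_norm(m)
    module.apply(_apply)  # Module.apply visits every submodule recursively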
forward(x: Tensor) → Tensor
Calculate forward propagation.
- Parameters: x (Tensor) – Input audio signal (B, 1, T).
- Returns: Output tensor (B, 1, T).
- Return type: Tensor
Example
>>> discriminator = ParallelWaveGANDiscriminator()
>>> input_tensor = torch.randn(8, 1, 16000) # Batch of 8, 1 channel, 16000 samples
>>> output = discriminator(input_tensor)
>>> print(output.shape) # Should be (8, 1, 16000)
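During GAN training, the forward output is typically consumed by an adversarial objective. The sketch below assumes the least-squares GAN loss used in the Parallel WaveGAN paper; fake_audio is a hypothetical stand-in for the generator output.

import torch
import torch.nn.functional as F

discriminator = ParallelWaveGANDiscriminator()
real_audio = torch.randn(8, 1, 16000)  # placeholder real waveforms
fake_audio = torch.randn(8, 1, 16000)  # placeholder generator output

# Discriminator update: push real outputs toward 1 and fake outputs toward 0.
d_real = discriminator(real_audio)
d_fake = discriminator(fake_audio.detach())
d_loss = (F.mse_loss(d_real, torch.ones_like(d_real))
          + F.mse_loss(d_fake, torch.zeros_like(d_fake)))

# Generator update: fool the discriminator into outputting 1 on fake audio.
g_adv = discriminator(fake_audio)
adv_loss = F.mse_loss(g_adv, torch.ones_like(g_adv))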
remove_weight_norm()
Remove weight normalization from all of the layers.
This method iterates through all the layers of the ParallelWaveGANDiscriminator and removes weight normalization if it has been applied. Weight normalization can improve the training dynamics of deep learning models, but there may be scenarios where you want to disable it, such as for inference or model evaluation.
Example
>>> discriminator = ParallelWaveGANDiscriminator()  # create an instance
>>> discriminator.apply_weight_norm()               # apply weight normalization
>>> discriminator.remove_weight_norm()              # remove weight normalization
NOTE: This method does not raise an exception if a layer does not have weight normalization applied.
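The non-raising behavior described in the note is conventionally achieved by catching the ValueError that torch.nn.utils.remove_weight_norm raises for layers without weight normalization. A minimal sketch of that pattern (an assumption, not necessarily the exact ESPnet code):

import torch

def remove_weight_norm_from_convs(module: torch.nn.Module) -> None:
    """Strip weight norm recursively, skipping layers that never had it."""
    def _remove(m: torch.nn.Module) -> None:
        try:
            torch.nn.utils.remove_weight_norm(m)
        except ValueError:  # raised when m has no weight norm hook
            return
    module.apply(_remove)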