espnet2.gan_tts.hifigan.hifigan.HiFiGANMultiPeriodDiscriminator
class espnet2.gan_tts.hifigan.hifigan.HiFiGANMultiPeriodDiscriminator(periods: List[int] = [2, 3, 5, 7, 11], discriminator_params: Dict[str, Any] = {'bias': True, 'channels': 32, 'downsample_scales': [3, 3, 3, 3, 1], 'in_channels': 1, 'kernel_sizes': [5, 3], 'max_downsample_channels': 1024, 'nonlinear_activation': 'LeakyReLU', 'nonlinear_activation_params': {'negative_slope': 0.1}, 'out_channels': 1, 'use_spectral_norm': False, 'use_weight_norm': True})
Bases: Module
HiFiGAN multi-period discriminator module.
This module implements the multi-period discriminator of the HiFi-GAN model, which distinguishes real from generated audio by combining several period discriminators, each of which reshapes the input waveform according to a different period. Examining the signal at multiple periods lets the discriminator capture periodic patterns at different time scales, improving adversarial training.
- Parameters:
- periods (List[int]) – List of periods used in the discriminator.
- discriminator_params (Dict[str, Any]) – Parameters for the HiFi-GAN period discriminator module. The ‘period’ parameter will be overwritten by the values in ‘periods’.
discriminators
A list of HiFiGANPeriodDiscriminator instances, one for each specified period.
Type: torch.nn.ModuleList
Returns: A list of lists of output tensors, one inner list per period discriminator.
Return type: List[List[Tensor]]
Examples
>>> import torch
>>> from espnet2.gan_tts.hifigan.hifigan import HiFiGANMultiPeriodDiscriminator
>>> discriminator = HiFiGANMultiPeriodDiscriminator()
>>> input_tensor = torch.randn(8, 1, 1024)  # Batch of 8 single-channel waveforms
>>> outputs = discriminator(input_tensor)
>>> len(outputs)  # Equals the number of specified periods
5
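The discriminators attribute listed above can also be inspected directly. The sketch below assumes each sub-discriminator exposes its configured period as a period attribute, which is an implementation detail rather than part of the documented API:
>>> from espnet2.gan_tts.hifigan.hifigan import HiFiGANMultiPeriodDiscriminator
>>> d = HiFiGANMultiPeriodDiscriminator(periods=[2, 3, 5])
>>> len(d.discriminators)  # one HiFiGANPeriodDiscriminator per period
3
>>> [sub.period for sub in d.discriminators]  # assumes a `period` attribute
[2, 3, 5]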
Initialize HiFiGANMultiPeriodDiscriminator module.
- Parameters:
- periods (List[int]) – List of periods.
- discriminator_params (Dict[str, Any]) – Parameters for hifi-gan period discriminator module. The period parameter will be overwritten.
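For illustration only (not part of the original docstring), the constructor can be called with non-default settings; the reduced channels and max_downsample_channels values below are arbitrary, and any ‘period’ entry in discriminator_params would be overwritten per instance by the entries of periods:
>>> from espnet2.gan_tts.hifigan.hifigan import HiFiGANMultiPeriodDiscriminator
>>> custom_params = {
...     "in_channels": 1,
...     "out_channels": 1,
...     "channels": 16,  # arbitrary illustration value (default: 32)
...     "kernel_sizes": [5, 3],
...     "downsample_scales": [3, 3, 3, 3, 1],
...     "max_downsample_channels": 512,  # arbitrary illustration value (default: 1024)
...     "bias": True,
...     "nonlinear_activation": "LeakyReLU",
...     "nonlinear_activation_params": {"negative_slope": 0.1},
...     "use_weight_norm": True,
...     "use_spectral_norm": False,
... }
>>> discriminator = HiFiGANMultiPeriodDiscriminator(
...     periods=[2, 3, 5, 7], discriminator_params=custom_params
... )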
forward(x: Tensor) → List[List[Tensor]]
Calculate forward propagation.
This method computes the forward pass of the multi-period discriminator by feeding the input waveform to each HiFiGANPeriodDiscriminator in turn and collecting their layer-wise outputs.
- Parameters:
- x (torch.Tensor) – Input waveform tensor of shape (B, 1, T), where B is the batch size and T is the number of time steps.
- Returns: A list of lists of output tensors, one inner list per period discriminator, containing that discriminator’s intermediate and final layer outputs.
- Return type: List[List[torch.Tensor]]
Examples
>>> import torch
>>> discriminator = HiFiGANMultiPeriodDiscriminator()
>>> x = torch.randn(2, 1, 4096)  # Example waveform batch
>>> outs = discriminator(x)
>>> len(outs)  # One entry per period (default periods: [2, 3, 5, 7, 11])
5
NOTE
The input tensor is expected to be a single-channel waveform of shape (B, 1, T), matching the in_channels setting of each period discriminator. Inside each period discriminator, the waveform is padded so that its length becomes a multiple of the corresponding period before it is reshaped to a 2D representation.
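As a hedged usage sketch (this is not ESPnet’s loss API; ESPnet ships dedicated adversarial-loss modules), the nested outputs can feed a least-squares discriminator loss where only the last tensor of each inner list is scored and the earlier tensors are intermediate feature maps usable for feature matching:
>>> import torch
>>> from espnet2.gan_tts.hifigan.hifigan import HiFiGANMultiPeriodDiscriminator
>>> discriminator = HiFiGANMultiPeriodDiscriminator()
>>> real = torch.randn(4, 1, 8192)  # real waveforms
>>> fake = torch.randn(4, 1, 8192)  # generated waveforms (stand-in)
>>> real_outs = discriminator(real)
>>> fake_outs = discriminator(fake)
>>> # Least-squares GAN discriminator loss on the final output of each
>>> # period discriminator; r[-1] / f[-1] are the last-layer tensors.
>>> loss = sum(
...     ((r[-1] - 1.0) ** 2).mean() + (f[-1] ** 2).mean()
...     for r, f in zip(real_outs, fake_outs)
... )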