espnet2.gan_tts.hifigan.hifigan.HiFiGANPeriodDiscriminator
class espnet2.gan_tts.hifigan.hifigan.HiFiGANPeriodDiscriminator(in_channels: int = 1, out_channels: int = 1, period: int = 3, kernel_sizes: List[int] = [5, 3], channels: int = 32, downsample_scales: List[int] = [3, 3, 3, 3, 1], max_downsample_channels: int = 1024, bias: bool = True, nonlinear_activation: str = 'LeakyReLU', nonlinear_activation_params: Dict[str, Any] = {'negative_slope': 0.1}, use_weight_norm: bool = True, use_spectral_norm: bool = False)
Bases: Module
HiFiGAN period discriminator module.
This module implements the period discriminator used in the HiFi-GAN architecture. It reshapes the 1D input signal into a 2D representation according to the period and applies stacked convolutional layers to discriminate real from generated audio based on its periodic structure.
period
The period length for the discriminator.
- Type: int
convs
A list of convolutional layers used in the model.
- Type: ModuleList
Parameters:
- in_channels (int) – Number of input channels.
- out_channels (int) – Number of output channels.
- period (int) – Period length.
- kernel_sizes (list) – Kernel sizes for initial and final convolution layers.
- channels (int) – Number of initial channels.
- downsample_scales (List[int]) – List of downsampling scales.
- max_downsample_channels (int) – Maximum number of downsampling channels.
- bias (bool) – Whether to add bias parameter in convolution layers.
- nonlinear_activation (str) – Activation function module name.
- nonlinear_activation_params (Dict[str, Any]) – Hyperparameters for the activation function.
- use_weight_norm (bool) – Whether to apply weight normalization to all conv layers.
- use_spectral_norm (bool) – Whether to apply spectral normalization to all conv layers.
Raises: ValueError – If both use_weight_norm and use_spectral_norm are set to True.
Examples
>>> import torch
>>> discriminator = HiFiGANPeriodDiscriminator()
>>> input_tensor = torch.randn(8, 1, 256)  # Batch size 8, 1 channel, length 256
>>> outputs = discriminator(input_tensor)
>>> len(outputs)  # One tensor per conv layer plus the final output layer
6
NOTE: The input tensor is padded to a multiple of the period and reshaped to a 2D representation before processing.
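For concreteness, here is a minimal sketch of the pad-and-reshape step described in the note above; it is illustrative rather than the module's exact code, and it assumes reflect padding as used in typical HiFi-GAN implementations:

>>> import torch
>>> import torch.nn.functional as F
>>> x = torch.randn(8, 1, 256)  # (B, C, T)
>>> period = 3
>>> b, c, t = x.shape
>>> if t % period != 0:  # pad so that T becomes a multiple of the period
...     n_pad = period - (t % period)
...     x = F.pad(x, (0, n_pad), "reflect")
...     t += n_pad
>>> x = x.view(b, c, t // period, period)  # (B, C, T // period, period)
>>> x.shape
torch.Size([8, 1, 86, 3])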
Initialize HiFiGANPeriodDiscriminator module.
- Parameters:
- in_channels (int) – Number of input channels.
- out_channels (int) – Number of output channels.
- period (int) – Period.
- kernel_sizes (list) – Kernel sizes of initial conv layers and the final conv layer.
- channels (int) – Number of initial channels.
- downsample_scales (List[int]) – List of downsampling scales.
- max_downsample_channels (int) – Maximum number of downsampling channels.
- bias (bool) – Whether to add bias parameter in convolution layers.
- nonlinear_activation (str) – Activation function module name.
- nonlinear_activation_params (Dict[str, Any]) – Hyperparameters for activation function.
- use_weight_norm (bool) – Whether to use weight norm. If set to True, it will be applied to all of the conv layers.
- use_spectral_norm (bool) – Whether to use spectral norm. If set to True, it will be applied to all of the conv layers.
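In HiFi-GAN, several period discriminators with distinct prime periods are combined into a multi-period discriminator. The following sketch shows one way to build such an ensemble by hand; the periods [2, 3, 5, 7, 11] follow the original HiFi-GAN paper, and the wrapping loop here is illustrative (ESPnet also provides a HiFiGANMultiPeriodDiscriminator for this purpose):

>>> import torch
>>> from espnet2.gan_tts.hifigan.hifigan import HiFiGANPeriodDiscriminator
>>> discriminators = torch.nn.ModuleList(
...     [HiFiGANPeriodDiscriminator(period=p) for p in [2, 3, 5, 7, 11]]
... )
>>> x = torch.randn(4, 1, 8192)  # (B, 1, T) waveform batch
>>> all_outs = [d(x) for d in discriminators]  # one list of feature maps per period
>>> len(all_outs)
5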
apply_spectral_norm()
Apply spectral normalization to all of the Conv2d layers.
This method applies spectral normalization to all Conv2d layers within the HiFiGANPeriodDiscriminator module. Spectral normalization is a technique used to stabilize the training of generative adversarial networks (GANs) by controlling the Lipschitz constant of the network.
NOTE: This method modifies the layers in place; it is invoked automatically during initialization when use_spectral_norm is set to True.
Examples
>>> discriminator = HiFiGANPeriodDiscriminator(
...     use_weight_norm=False, use_spectral_norm=False)
>>> discriminator.apply_spectral_norm()
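This kind of in-place application typically relies on torch.nn.Module.apply. The sketch below illustrates the pattern with torch.nn.utils.spectral_norm; the helper name _apply_spectral_norm is hypothetical and not part of the public API:

>>> import torch
>>> from espnet2.gan_tts.hifigan.hifigan import HiFiGANPeriodDiscriminator
>>> def _apply_spectral_norm(m):
...     if isinstance(m, torch.nn.Conv2d):
...         torch.nn.utils.spectral_norm(m)  # wrap this Conv2d in place
>>> discriminator = HiFiGANPeriodDiscriminator(
...     use_weight_norm=False, use_spectral_norm=False)
>>> _ = discriminator.apply(_apply_spectral_norm)  # visits every submodule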
apply_weight_norm()
Apply weight normalization to all of the Conv2d layers.
This method applies weight normalization to all convolutional layers in the HiFiGANPeriodDiscriminator. Weight normalization reparameterizes the layer weights, which can improve training speed and stability. When the use_weight_norm parameter is set to True, this method is invoked automatically during initialization.
Examples
>>> discriminator = HiFiGANPeriodDiscriminator(use_weight_norm=False)
>>> discriminator.apply_weight_norm()
NOTE: This method logs a debug message for each layer that weight normalization is applied to, aiding in tracking the model's structure during development and debugging.
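As a sketch of what the reparameterization itself does, torch.nn.utils.weight_norm splits a layer's weight into a magnitude parameter weight_g and a direction parameter weight_v; this standalone example is independent of the class:

>>> import torch
>>> conv = torch.nn.Conv2d(1, 32, (5, 1))
>>> _ = torch.nn.utils.weight_norm(conv)  # weight = weight_g * weight_v / ||weight_v||
>>> sorted(name for name, _ in conv.named_parameters())
['bias', 'weight_g', 'weight_v']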
forward(x: Tensor) → List[Tensor]
Calculate forward propagation.
This method computes the forward pass of the period discriminator. The input signal is padded so that its length is a multiple of the period, reshaped from (B, in_channels, T) to a 2D representation of shape (B, in_channels, T // period, period), and then passed through the stacked convolutional layers followed by the final output layer. The output of every layer is collected and returned.
- Parameters: x (torch.Tensor) – Input tensor of shape (B, in_channels, T), where B is the batch size, in_channels is the number of input channels, and T is the length of the input signal.
- Returns: List of output tensors, one from each convolutional layer and one from the final output layer.
- Return type: List[torch.Tensor]
Examples
>>> import torch
>>> discriminator = HiFiGANPeriodDiscriminator(period=3)
>>> x = torch.randn(1, 1, 256)
>>> outs = discriminator(x)
>>> all(isinstance(o, torch.Tensor) for o in outs)
True
NOTE: If the input length is not a multiple of the period, the input is padded before reshaping, so signals of arbitrary length are accepted.
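Because every layer's output is returned, the module can feed a feature-matching loss during GAN training. The snippet below is only a sketch of that idea, assuming the last list entry is the final score map; ESPnet provides dedicated loss modules for real training:

>>> import torch
>>> import torch.nn.functional as F
>>> from espnet2.gan_tts.hifigan.hifigan import HiFiGANPeriodDiscriminator
>>> discriminator = HiFiGANPeriodDiscriminator()
>>> real_outs = discriminator(torch.randn(2, 1, 4096))
>>> fake_outs = discriminator(torch.randn(2, 1, 4096))
>>> feat_match = sum(  # L1 distance over the intermediate feature maps
...     F.l1_loss(f, r.detach())
...     for f, r in zip(fake_outs[:-1], real_outs[:-1]))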