espnet2.tts.utils.parallel_wavegan_pretrained_vocoder.ParallelWaveGANPretrainedVocoder
class espnet2.tts.utils.parallel_wavegan_pretrained_vocoder.ParallelWaveGANPretrainedVocoder(model_file: Path | str, config_file: Path | str | None = None)
Bases: Module
Wrapper class to load a vocoder trained with the parallel_wavegan repo.
This class is designed to facilitate the loading and utilization of a vocoder model that has been trained using the Parallel WaveGAN framework. It integrates with the PyTorch framework and allows for generating waveforms from input features.
fs
Sampling rate of the vocoder.
- Type: int
vocoder
The loaded vocoder model.
Type: torch.nn.Module
Parameters:
- model_file (Union[Path, str]) – Path to the model file.
- config_file (Optional[Union[Path, str]]) – Path to the configuration file. If None, the configuration is loaded from the same directory as the model file, using the name "config.yml".
Raises:
- ImportError – If the parallel_wavegan package is not installed.
- FileNotFoundError – If the configuration file does not exist.
Examples
>>> vocoder = ParallelWaveGANPretrainedVocoder("path/to/model/file")
>>> waveform = vocoder(torch.randn(100, 80)) # Assuming 80 mel features.
NOTE
Ensure that the parallel_wavegan library is installed for this class to function correctly.
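The config-file fallback described above can be sketched with plain `pathlib`. This is a minimal illustration of the documented behavior, not the class's actual implementation; the helper name `resolve_config_path` is hypothetical.

```python
from pathlib import Path


def resolve_config_path(model_file, config_file=None):
    """Mimic the documented config lookup: if no config path is given,
    assume "config.yml" sits in the same directory as the model file."""
    model_file = Path(model_file)
    if config_file is None:
        config_file = model_file.parent / "config.yml"
    return Path(config_file)


# With an explicit config, the given path is used as-is;
# without one, the sibling "config.yml" is assumed.
print(resolve_config_path("exp/vocoder/model.pkl"))
# exp/vocoder/config.yml
```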
Initialize ParallelWaveGANPretrainedVocoder module.
forward(feats: Tensor) → Tensor
Generate waveform with pretrained vocoder.
This method takes a feature tensor as input and generates a corresponding waveform tensor using the pretrained vocoder model.
- Parameters:feats (torch.Tensor) – Feature tensor of shape (T_feats, #mels), where T_feats is the number of frames and #mels is the number of mel frequency bins.
- Returns: Generated waveform tensor of shape (T_wav,), where T_wav is the length of the output waveform.
- Return type: torch.Tensor
Examples
>>> import torch
>>> vocoder = ParallelWaveGANPretrainedVocoder("path/to/model/file")
>>> features = torch.randn(100, 80) # Example mel features
>>> waveform = vocoder(features)
>>> print(waveform.shape) # Output shape should be (T_wav,)
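The relationship between T_feats and T_wav follows the usual vocoder convention: the output length is the number of input frames times the hop size used during feature extraction. The hop size of 256 below is an illustrative assumption, not a value fixed by this class; the actual value comes from the vocoder's configuration.

```python
# Illustrative shape arithmetic (hop_size is an assumed example value):
hop_size = 256   # frame shift in samples, taken from the vocoder config
t_feats = 100    # number of mel-spectrogram frames fed to forward()
t_wav = t_feats * hop_size  # expected length of the generated waveform
print(t_wav)  # 25600
```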