espnet2.gan_svs.uhifigan.sine_generator.SineGen
class espnet2.gan_svs.uhifigan.sine_generator.SineGen(sample_rate, harmonic_num=0, sine_amp=0.1, noise_std=0.003, voiced_threshold=0, flag_for_pulse=False)
Bases: Module
Definition of a sine generator for audio synthesis.
This class implements a sine wave generator that can produce harmonic sine waves and corresponding unvoiced/voiced (U/V) signals based on the provided fundamental frequency (F0) values. It can also introduce Gaussian noise to the generated waveforms.
sine_amp
Amplitude of the sine waveform (default: 0.1).
- Type: float
noise_std
Standard deviation of Gaussian noise (default: 0.003).
- Type: float
harmonic_num
Number of harmonic overtones (default: 0).
- Type: int
dim
Dimension of the output, equal to harmonic_num + 1.
- Type: int
sampling_rate
Sampling rate in Hz.
- Type: int
voiced_threshold
F0 threshold for U/V classification (default: 0).
- Type: float
flag_for_pulse
Indicates if the generator is used in pulse mode (default: False).
- Type: bool
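The relationship between `harmonic_num`, `dim`, and `voiced_threshold` can be illustrated with a minimal plain-Python sketch. This is not the ESPnet implementation: `output_dim` and `uv_mask` are hypothetical helper names, and the sketch assumes a step counts as voiced when its F0 is strictly greater than `voiced_threshold`.

```python
def output_dim(harmonic_num):
    """Number of output channels: the fundamental plus its overtones."""
    return harmonic_num + 1


def uv_mask(f0, voiced_threshold=0.0):
    """Return 1.0 for voiced steps and 0.0 for unvoiced steps.

    Assumption: a step is voiced when F0 > voiced_threshold.
    """
    return [1.0 if f > voiced_threshold else 0.0 for f in f0]


f0_contour = [0.0, 0.0, 220.0, 221.5, 0.0]  # Hz; 0 marks unvoiced steps
print(output_dim(harmonic_num=3))
print(uv_mask(f0_contour))
```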
Parameters:
- sample_rate (int) – Sampling rate in Hz.
- harmonic_num (int , optional) – Number of harmonic overtones (default: 0).
- sine_amp (float , optional) – Amplitude of sine waveform (default: 0.1).
- noise_std (float , optional) – Standard deviation of Gaussian noise (default: 0.003).
- voiced_threshold (float , optional) – F0 threshold for U/V classification (default: 0).
- flag_for_pulse (bool , optional) – Flag indicating if the SineGen is used inside PulseGen (default: False).
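To show how these parameters might interact, here is a simplified plain-Python sketch of harmonic sine generation by phase accumulation. It is an illustration under stated assumptions, not the ESPnet code: `sine_gen_sketch` and its `seed` argument are hypothetical, F0 is given per sample for a single sequence, every harmonic starts at zero phase, and unvoiced samples carry Gaussian noise only. The real module may differ, for example in initial phase and noise scaling.

```python
import math
import random


def sine_gen_sketch(f0, sample_rate, harmonic_num=0, sine_amp=0.1,
                    noise_std=0.003, voiced_threshold=0.0, seed=0):
    """Generate harmonic sine waves for one per-sample F0 sequence.

    Returns (sines, uv), where sines[t][k] is harmonic k at sample t
    and uv[t] is 1.0 for voiced samples and 0.0 otherwise.
    """
    rng = random.Random(seed)
    dim = harmonic_num + 1
    phase = [0.0] * dim  # accumulated phase in cycles, one per harmonic
    sines, uv = [], []
    for f in f0:
        voiced = 1.0 if f > voiced_threshold else 0.0
        row = []
        for k in range(dim):
            # Harmonic k has frequency (k + 1) * F0; keep phase in [0, 1).
            phase[k] = (phase[k] + (k + 1) * f / sample_rate) % 1.0
            value = sine_amp * math.sin(2.0 * math.pi * phase[k])
            # Assumption: unvoiced samples contain noise only.
            row.append(voiced * value + rng.gauss(0.0, noise_std))
        sines.append(row)
        uv.append(voiced)
    return sines, uv


sines, uv = sine_gen_sketch([0.0] * 4 + [440.0] * 12,
                            sample_rate=16000, harmonic_num=2)
print(len(sines), len(sines[0]))  # number of samples, harmonic_num + 1
```

Accumulating phase, rather than evaluating `sin(2πft)` directly, keeps the waveform continuous when F0 changes from one sample to the next.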
NOTE
When flag_for_pulse is True, the first time step of a voiced segment is always sin(np.pi) or cos(0).
####### Examples
>>> sine_gen = SineGen(sample_rate=16000, harmonic_num=3)
>>> f0 = torch.tensor([[[440.0]]]) # Example F0 input
>>> sine_waves, uv, noise = sine_gen(f0)
>>> print(sine_waves.shape) # Output shape: (1, length, dim)
>>> print(uv.shape) # Output shape: (1, length, 1)
- Raises:ValueError – If the provided sample_rate is not a positive integer.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
forward(f0)
Forward SineGen.
Computes the sine waveforms and the voiced/unvoiced (uv) signal based on the input fundamental frequency (F0).
- Parameters:f0 (torch.Tensor) – A tensor of shape (batchsize, length, dim=1) representing the input F0 values. The F0 for unvoiced steps should be set to 0.
- Returns: A tuple containing:
  - sine_tensor (torch.Tensor): A tensor of shape (batchsize, length, dim) representing the generated sine waveforms.
  - uv (torch.Tensor): A tensor of shape (batchsize, length, 1) indicating voiced/unvoiced segments.
  - noise (torch.Tensor): The noise term, unpacked as the third return value in the examples.
- Return type: tuple
####### Examples
>>> sine_gen = SineGen(sample_rate=44100, harmonic_num=2)
>>> f0_input = torch.tensor([[[440.0]]]) # Example F0 input
>>> sine_wave, uv_signal, noise = sine_gen.forward(f0_input)
>>> print(sine_wave.shape) # Output: torch.Size([1, length, 3]), since dim = harmonic_num + 1
>>> print(uv_signal.shape) # Output: torch.Size([1, length, 1])
NOTE
The F0 input tensor should have a last dimension of size 1, where unvoiced F0 values must be set to 0.
- Raises:ValueError – If the input tensor shape does not match (batchsize, length, dim).
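The input convention for `forward` (last dimension of size 1, unvoiced F0 set to 0) can be sketched in plain Python as follows. `prepare_f0` and `check_f0_shape` are hypothetical helpers, not part of ESPnet, and the sketch assumes the upstream pitch tracker marks unvoiced steps with NaN, `None`, or non-positive values.

```python
import math


def prepare_f0(f0_batch):
    """Convert [batch][length] F0 values to [batch][length][1].

    Assumption: unvoiced steps arrive as NaN, None, or non-positive
    values and are mapped to 0.0, as forward() expects.
    """
    prepared = []
    for seq in f0_batch:
        out = []
        for f in seq:
            unvoiced = f is None or math.isnan(f) or f <= 0.0
            out.append([0.0 if unvoiced else float(f)])
        prepared.append(out)
    return prepared


def check_f0_shape(f0):
    """Raise ValueError unless f0 is shaped (batchsize, length, 1)."""
    for seq in f0:
        for step in seq:
            if len(step) != 1:
                raise ValueError("expected shape (batchsize, length, 1)")


f0 = prepare_f0([[float("nan"), 220.0, 0.0, 230.0]])
check_f0_shape(f0)
print(f0)
```

With PyTorch available, `torch.tensor(f0)` would turn this nested list into a tensor of shape (1, 4, 1).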