espnet2.layers.augmentation.corrupt_phase
espnet2.layers.augmentation.corrupt_phase(waveform, sample_rate, scale: float = 0.5, n_fft: float = 0.032, win_length: float | None = None, hop_length: float = 0.008, window: str | None = 'hann')
Add random noise to the phase of the input waveform.
This function perturbs the phase of the input audio signal with random noise whose strength is controlled by scale. It is intended for data augmentation when a model should be robust to phase variations.
- Parameters:
- waveform (torch.Tensor) – Audio signal (…, time).
- sample_rate (int) – Sampling rate in Hz.
- scale (float) – Scale factor for the phase noise. Default is 0.5.
- n_fft (float) – Length of FFT in seconds. Default is 0.032.
- win_length (float or None) – The window length (in seconds) used for STFT. If None, it is treated as equal to n_fft.
- hop_length (float) – The hop size (in seconds) used for STFT. Default is 0.008.
- window (str or None) – The windowing function applied to the signal after padding with zeros. Default is 'hann'.
- Returns: Phase-corrupted signal (…, time).
- Return type: torch.Tensor
Examples
>>> import torch
>>> from espnet2.layers.augmentation import corrupt_phase
>>> waveform = torch.randn(1, 16000)  # Simulated 1-second waveform at 16 kHz
>>> sample_rate = 16000
>>> noisy_waveform = corrupt_phase(waveform, sample_rate)
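Because n_fft, win_length, and hop_length are given in seconds, it can help to see what the defaults correspond to in samples. The sketch below is illustrative only: seconds_to_samples is a hypothetical helper, not part of ESPnet, and the use of round() is an assumption (the library may truncate instead).

```python
def seconds_to_samples(duration_s: float, sample_rate: int) -> int:
    """Convert a duration in seconds to a whole number of samples."""
    # round() is an assumption of this sketch; the library may truncate instead.
    return round(duration_s * sample_rate)


sample_rate = 16000
n_fft = seconds_to_samples(0.032, sample_rate)       # default n_fft
hop_length = seconds_to_samples(0.008, sample_rate)  # default hop_length
win_length = n_fft  # win_length=None is treated as equal to n_fft

print(n_fft, hop_length, win_length)  # 512 128 512
```

At 16 kHz the defaults therefore describe a 512-sample FFT with a 128-sample hop; at other sampling rates the same durations map to different sample counts.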
NOTE
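The random noise added to the phase may alter the perceived audio quality, and larger scale values add more phase noise. Because the noise is random, repeated calls on the same input generally return different outputs.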
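To make the idea of phase noise concrete, here is a minimal standard-library sketch that perturbs the phase of a few complex spectrum bins while leaving their magnitudes unchanged. It is not the ESPnet implementation: perturb_phase is a hypothetical helper, the Gaussian noise distribution is an assumption, and the STFT and inverse STFT steps are omitted.

```python
import cmath
import random


def perturb_phase(bins, scale=0.5, rng=None):
    """Add scaled random noise to the phase of each complex bin."""
    if rng is None:
        rng = random.Random()
    out = []
    for z in bins:
        magnitude, phase = cmath.polar(z)  # split into |z| and angle
        # Gaussian noise is an assumption of this sketch.
        noisy_phase = phase + scale * rng.gauss(0.0, 1.0)
        out.append(cmath.rect(magnitude, noisy_phase))  # rebuild the bin
    return out


bins = [1 + 0j, 0 + 2j, -3 - 4j]
noisy = perturb_phase(bins, scale=0.5, rng=random.Random(0))

for original, corrupted in zip(bins, noisy):
    # Magnitudes match up to rounding; only the angle changes.
    print(abs(original), round(abs(corrupted), 6))
```

With scale=0.0 the bins come back unchanged up to floating-point rounding, which is a quick way to check that only the phase is being modified.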