espnet2.layers.augmentation.corrupt_phase

espnet2.layers.augmentation.corrupt_phase(waveform, sample_rate, scale: float = 0.5, n_fft: float = 0.032, win_length: float | None = None, hop_length: float = 0.008, window: str | None = 'hann')

Add random noise to the phase of the input waveform.

This function perturbs the STFT phase of the input audio signal with random noise, controlled by scale, and returns a waveform with the same shape as the input. It is intended as a data augmentation for training models that should be robust to phase variations.

Parameters:
- waveform (torch.Tensor) – Audio signal (…, time).
- sample_rate (int) – Sampling rate in Hz.
- scale (float) – Scale factor for the phase noise. Default is 0.5.
- n_fft (float) – FFT length in seconds. Default is 0.032.
- win_length (float or None) – Window length in seconds used for the STFT. If None, it is set equal to n_fft. Default is None.
- hop_length (float) – Hop size in seconds used for the STFT. Default is 0.008.
- window (str or None) – Name of the window function applied to the signal after padding with zeros. Default is “hann”.
Returns: Phase-corrupted signal (…, time).
Return type: torch.Tensor
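
Because n_fft, win_length, and hop_length are given in seconds, they have to be turned into sample counts before an STFT can be computed. The sketch below shows that conversion for the default values at a 16 kHz sampling rate. The helper name seconds_to_samples and the use of round() are assumptions made for illustration; the rounding rule the library applies is not stated here.

```python
def seconds_to_samples(seconds: float, sample_rate: int) -> int:
    """Hypothetical helper: convert a duration in seconds to whole samples."""
    # round() is an assumption; a library may truncate instead.
    return round(seconds * sample_rate)


sample_rate = 16000
n_fft = seconds_to_samples(0.032, sample_rate)       # 512 samples
hop_length = seconds_to_samples(0.008, sample_rate)  # 128 samples
win_length = n_fft  # mirrors win_length=None falling back to n_fft
```

With these defaults the hop is a quarter of the window, so consecutive frames overlap by 75%.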

Examples

>>> import torch
>>> from espnet2.layers.augmentation import corrupt_phase
>>> waveform = torch.randn(1, 16000)  # Simulated 1-second waveform at 16 kHz
>>> sample_rate = 16000
>>> noisy_waveform = corrupt_phase(waveform, sample_rate)
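
To make the idea of phase corruption concrete, here is a minimal sketch of one way it could be done with plain PyTorch: take the STFT, keep the magnitude, add noise to the phase, and invert. This is not the ESPnet implementation. The function name corrupt_phase_sketch, the Gaussian noise distribution, the fixed Hann window, and the rounding of second-based sizes are all assumptions made for illustration.

```python
import torch


def corrupt_phase_sketch(waveform, sample_rate, scale=0.5,
                         n_fft=0.032, hop_length=0.008):
    """Hypothetical illustration of STFT-domain phase corruption."""
    # Convert second-based sizes to sample counts (rounding is an assumption).
    n_fft_samples = round(n_fft * sample_rate)
    hop_samples = round(hop_length * sample_rate)
    window = torch.hann_window(
        n_fft_samples, dtype=waveform.dtype, device=waveform.device
    )

    # torch.stft expects (time,) or (batch, time), so flatten leading dims.
    lead_shape = waveform.shape[:-1]
    num_samples = waveform.shape[-1]
    flat = waveform.reshape(-1, num_samples)

    spec = torch.stft(flat, n_fft_samples, hop_length=hop_samples,
                      window=window, return_complex=True)

    # Keep the magnitude and perturb only the phase.
    # Gaussian noise is an assumption, not taken from the library.
    phase = torch.angle(spec)
    noisy_phase = phase + scale * torch.randn_like(phase)
    noisy_spec = torch.polar(spec.abs(), noisy_phase)

    # Invert back to the time domain at the original length.
    out = torch.istft(noisy_spec, n_fft_samples, hop_length=hop_samples,
                      window=window, length=num_samples)
    return out.reshape(*lead_shape, num_samples)


waveform = torch.randn(2, 16000)
noisy = corrupt_phase_sketch(waveform, 16000)
```

In this sketch, setting scale to 0 leaves the phase untouched, so the output should match the input up to numerical error; larger values move the output further from the original signal.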

NOTE

The random noise added to the phase may alter the perceived audio quality; larger values of scale add more phase noise.