espnet2.gan_svs.visinger2.ddsp.extract_loudness
espnet2.gan_svs.visinger2.ddsp.extract_loudness(signal, sampling_rate, block_size, n_fft=2048)
Extracts the loudness of an audio signal using the short-time Fourier transform (STFT).
This function computes the loudness of an input audio signal by taking its STFT and then applying A-weighting to the frequency bins. The resulting loudness is returned as a NumPy array.
- Parameters:
- signal (np.ndarray) – Input audio signal as a 1D numpy array.
- sampling_rate (int) – The sampling rate of the audio signal in Hz.
- block_size (int) – The number of samples per block for the STFT.
- n_fft (int, optional) – The number of FFT points. Defaults to 2048.
- Returns: The computed loudness of the audio signal as a 1D numpy array.
- Return type: np.ndarray
Examples
>>> import numpy as np
>>> from espnet2.gan_svs.visinger2.ddsp import extract_loudness
>>> signal = np.random.randn(44100)  # 1 second of random noise at 44.1 kHz
>>> loudness = extract_loudness(signal, 44100, 1024)
>>> print(loudness.shape)  # Shape of the 1D loudness array
NOTE
The loudness values are computed in decibels (dB) relative to the reference level.
- Raises:
- ValueError – If the signal is not a 1D numpy array or if the sampling rate is not a positive integer.
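The description above (STFT followed by A-weighting of the frequency bins) can be illustrated with a minimal NumPy-only sketch. This is not the ESPnet implementation: the names `a_weighting_db` and `loudness_sketch` are hypothetical, and the choices of a Hann window, zero-padded centered frames, `block_size` used as the hop between frames, a base-10 dB scale, a -80 dB floor on the weights, and averaging over frequency bins are all assumptions made for illustration. The A-weighting curve itself uses the standard analytic formula.

```python
import numpy as np


def a_weighting_db(freqs_hz):
    """A-weighting gain in dB for each frequency (standard analytic curve)."""
    f2 = np.asarray(freqs_hz, dtype=float) ** 2
    num = (12194.0 ** 2) * f2 ** 2
    den = (
        (f2 + 20.6 ** 2)
        * np.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # Floor the ratio to avoid log10(0) at the DC bin.
    weights = 20.0 * np.log10(np.maximum(num / den, 1e-10)) + 2.0
    # Assumption of this sketch: clip very low weights at -80 dB.
    return np.maximum(weights, -80.0)


def loudness_sketch(signal, sampling_rate, block_size, n_fft=2048):
    """Hypothetical per-frame A-weighted loudness in dB (illustration only)."""
    signal = np.asarray(signal, dtype=float)
    if signal.ndim != 1:
        raise ValueError("signal must be a 1D array")
    if not isinstance(sampling_rate, (int, np.integer)) or sampling_rate <= 0:
        raise ValueError("sampling_rate must be a positive integer")

    # Zero-pad both ends so frames are centered on multiples of block_size.
    padded = np.pad(signal, n_fft // 2)
    n_frames = 1 + (len(padded) - n_fft) // block_size

    window = np.hanning(n_fft)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sampling_rate)
    weights = a_weighting_db(freqs)

    loudness = np.empty(n_frames)
    for i in range(n_frames):
        start = i * block_size
        frame = padded[start:start + n_fft] * window
        magnitude = np.abs(np.fft.rfft(frame))
        # Magnitude in dB plus the A-weighting gain for each bin.
        level_db = 20.0 * np.log10(magnitude + 1e-7) + weights
        # Average over frequency to get one loudness value per frame.
        loudness[i] = level_db.mean()
    return loudness
```

Calling `loudness_sketch(np.random.randn(44100), 44100, 1024)` returns a 1D array with one value per frame. The values will not necessarily match those of `extract_loudness`, since the reference level and framing details of the real function are not specified here.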