espnet2.gan_svs.visinger2.ddsp.extract_loudness

espnet2.gan_svs.visinger2.ddsp.extract_loudness(signal, sampling_rate, block_size, n_fft=2048)

Extracts the loudness of an audio signal using Short-Time Fourier Transform (STFT).

This function computes the STFT of the input audio signal, applies A-weighting to the frequency bins, and returns the resulting loudness as a numpy array.
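The description above can be illustrated with a minimal NumPy-only sketch. It is not the ESPnet implementation. The names `a_weighting_db` and `loudness_sketch` are hypothetical, and the sketch makes several assumptions: `block_size` is treated as the hop between frames, frames are centred by reflect padding, a Hann window is used, and the A-weighted levels are averaged across frequency bins to give one value per frame.

```python
import numpy as np


def a_weighting_db(freqs_hz, min_db=-80.0):
    """A-weighting gain in dB per frequency (standard analytic formula)."""
    f2 = np.asarray(freqs_hz, dtype=float) ** 2
    num = (12194.0 ** 2) * f2 ** 2
    den = (
        (f2 + 20.6 ** 2)
        * np.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The epsilon avoids log10(0) at the DC bin; the floor keeps the gain finite.
    gain = 20.0 * np.log10(num / den + 1e-20) + 2.0
    return np.maximum(gain, min_db)


def loudness_sketch(signal, sampling_rate, block_size, n_fft=2048):
    """Per-frame A-weighted level in dB (illustrative only)."""
    signal = np.asarray(signal, dtype=float)
    if signal.ndim != 1:
        raise ValueError("signal must be a 1D array")
    # Assumption: centre each frame by reflect-padding half a window per side.
    padded = np.pad(signal, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(padded) - n_fft) // block_size
    # Build a (n_frames, n_fft) index matrix; block_size acts as the hop length.
    idx = np.arange(n_fft)[None, :] + block_size * np.arange(n_frames)[:, None]
    frames = padded[idx] * np.hanning(n_fft)
    # Magnitude spectrum per frame, shape (n_frames, n_fft // 2 + 1).
    mag = np.abs(np.fft.rfft(frames, axis=1))
    level_db = 20.0 * np.log10(mag + 1e-7)
    # Add the A-weighting curve (in dB) to every frame, then average over bins.
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sampling_rate)
    weighted = level_db + a_weighting_db(freqs)[None, :]
    return weighted.mean(axis=1)
```

Under these assumptions, scaling the input by a factor of 10 raises every output value by roughly 20 dB, which is a quick sanity check that the result behaves like a level in decibels.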

Parameters:
- signal (np.ndarray) – Input audio signal as a 1D numpy array.
- sampling_rate (int) – The sampling rate of the audio signal in Hz.
- block_size (int) – The number of samples per block for the STFT.
- n_fft (int, optional) – The number of FFT points. Defaults to 2048.
Returns: The computed loudness of the audio signal as a 1D numpy array.
Return type: np.ndarray

Examples

>>> import numpy as np
>>> from espnet2.gan_svs.visinger2.ddsp import extract_loudness
>>> signal = np.random.randn(44100)  # 1 second of random noise at 44.1 kHz
>>> loudness = extract_loudness(signal, 44100, 1024)
>>> print(loudness.shape)  # Shape of the 1D loudness array

NOTE

The loudness values are computed in decibels (dB) relative to the reference level.

Raises:
- ValueError – If the signal is not a 1D numpy array or if the sampling rate is not a positive integer.
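The documented error conditions can be checked up front by a caller. The helper below is a hypothetical sketch (the name `check_loudness_inputs` is not part of ESPnet) that mirrors the two conditions listed above. It assumes NumPy integer types should be accepted as sampling rates.

```python
import numpy as np


def check_loudness_inputs(signal, sampling_rate):
    """Hypothetical pre-check mirroring the documented ValueError conditions."""
    if not isinstance(signal, np.ndarray) or signal.ndim != 1:
        raise ValueError("signal must be a 1D numpy array")
    # bool is a subclass of int in Python, so reject it explicitly.
    is_int = isinstance(sampling_rate, (int, np.integer))
    if isinstance(sampling_rate, bool) or not is_int or sampling_rate <= 0:
        raise ValueError("sampling_rate must be a positive integer")
```

Calling `check_loudness_inputs(np.zeros((2, 100)), 44100)` raises `ValueError` because the array is 2D, while a 1D array with a sampling rate of 44100 passes silently.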