espnet2.gan_svs.visinger2.ddsp.extract_pitch

espnet2.gan_svs.visinger2.ddsp.extract_pitch(signal, sampling_rate, block_size)

Extract the fundamental frequency (pitch) from an audio signal using the CREPE algorithm.

The input signal is processed in blocks of block_size samples, and the fundamental frequency is estimated for each block. The resulting pitch values are returned as a 1D NumPy array.
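To make the block-wise processing concrete, the sketch below computes how many blocks a signal yields and how long each block lasts. It assumes one pitch value per full, non-overlapping block; `block_layout` is a hypothetical helper written for illustration, not part of ESPnet, and the actual framing used by `extract_pitch` may differ.

```python
def block_layout(num_samples: int, sampling_rate: int, block_size: int):
    """Return (num_blocks, hop_ms) for a block-wise analysis.

    Assumption: one value per full, non-overlapping block; trailing
    samples that do not fill a block are ignored.
    """
    if num_samples < 0 or sampling_rate <= 0 or block_size <= 0:
        raise ValueError("num_samples must be >= 0; rate and block size must be > 0")
    num_blocks = num_samples // block_size          # full blocks only
    hop_ms = 1000.0 * block_size / sampling_rate    # duration of one block in ms
    return num_blocks, hop_ms


# 1 second of audio at 16 kHz, split into 512-sample blocks
num_blocks, hop_ms = block_layout(16000, 16000, 512)
print(num_blocks, hop_ms)  # 31 32.0
```

Under this assumption, a smaller block_size gives finer time resolution at the cost of more pitch estimates to compute.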


Parameters:
- signal (torch.Tensor) – The input audio signal, a 1D tensor, from which to extract the pitch.
- sampling_rate (int) – The sampling rate of the audio signal, in Hz.
- block_size (int) – The size of each block that the signal is divided into for pitch extraction.
Returns: A 1D array containing the extracted pitch values for each block of the audio signal.
Return type: numpy.ndarray
Raises: ValueError – If the input signal is not a 1D tensor, or if the sampling rate or block size is non-positive.

Examples

>>> import torch
>>> from espnet2.gan_svs.visinger2.ddsp import extract_pitch
>>> signal = torch.randn(16000)  # Simulated 1-second audio signal
>>> sampling_rate = 16000  # 16 kHz
>>> block_size = 512  # Block size for processing
>>> pitch = extract_pitch(signal, sampling_rate, block_size)
>>> print(pitch.shape)  # Should print the shape of the pitch array

NOTE

The function uses the CREPE algorithm for pitch extraction. Ensure that the necessary dependencies for CREPE are installed and available in your environment.
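One way to check the dependency before calling the function is sketched below. It assumes CREPE is importable under the module name `crepe` (the name of the PyPI package); `crepe_available` is a hypothetical helper, not an ESPnet function.

```python
import importlib.util


def crepe_available(module_name: str = "crepe") -> bool:
    """Return True if the given top-level module can be found.

    Assumption: CREPE is installed under the module name "crepe".
    The module is only located, not imported.
    """
    return importlib.util.find_spec(module_name) is not None


if not crepe_available():
    print("CREPE does not appear to be installed; pitch extraction will fail.")
```

Because `find_spec` only locates the module, this check stays cheap and does not load CREPE or its backend.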