espnet2.gan_svs.visinger2.ddsp.extract_pitch
espnet2.gan_svs.visinger2.ddsp.extract_pitch(signal, sampling_rate, block_size)
Extract the fundamental frequency (pitch) from an audio signal using the CREPE
algorithm.
The input signal is processed in blocks, and the fundamental frequency is estimated for each block. The resulting pitch track is returned as a 1D NumPy array.
espnet2.gan_svs.visinger2.ddsp.signal
The input audio signal as a 1D tensor.
- Type: torch.Tensor
espnet2.gan_svs.visinger2.ddsp.sampling_rate
The sampling rate of the audio signal.
- Type: int
espnet2.gan_svs.visinger2.ddsp.block_size
The size of each block for processing the audio signal.
- Type: int
Parameters:
- signal (torch.Tensor) – The input audio signal from which to extract the pitch.
- sampling_rate (int) – The sampling rate of the audio signal.
- block_size (int) – The size of the block to divide the signal into for pitch extraction.
Returns: A 1D array containing the extracted pitch values for each block of the audio signal.
Return type: numpy.ndarray
Raises: ValueError – If the input signal is not a 1D tensor, or if the sampling rate or block size is non-positive.
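Callers that want to fail early can check the same conditions before calling the function. The sketch below is an illustration only: `check_pitch_inputs` is a hypothetical helper, not part of ESPnet, and it assumes the signal object exposes an `ndim` attribute, as `torch.Tensor` does.

```python
def check_pitch_inputs(signal, sampling_rate, block_size):
    """Hypothetical pre-check mirroring the documented ValueError conditions."""
    # torch.Tensor exposes ndim; any object with that attribute works here.
    if getattr(signal, "ndim", None) != 1:
        raise ValueError("signal must be a 1D tensor")
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be positive")
    if block_size <= 0:
        raise ValueError("block_size must be positive")
```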
Examples
>>> import torch
>>> from espnet2.gan_svs.visinger2.ddsp import extract_pitch
>>> signal = torch.randn(16000) # Simulated 1-second audio signal
>>> sampling_rate = 16000 # 16 kHz
>>> block_size = 512 # Block size for processing
>>> pitch = extract_pitch(signal, sampling_rate, block_size)
>>> print(pitch.shape) # Should print the shape of the pitch array
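For intuition about the settings in this example, the sketch below works out how many full blocks fit in the signal and how long each block lasts. `block_layout` is a hypothetical helper, and the idea that the result holds one pitch value per full block is an assumption made here, not something this page confirms; the shape actually returned may differ.

```python
def block_layout(num_samples, sampling_rate, block_size):
    """Return (number of full blocks, block duration in milliseconds)."""
    if sampling_rate <= 0 or block_size <= 0:
        raise ValueError("sampling_rate and block_size must be positive")
    # Assumption: a trailing partial block is ignored.
    num_blocks = num_samples // block_size
    block_ms = 1000.0 * block_size / sampling_rate
    return num_blocks, block_ms


num_blocks, block_ms = block_layout(16000, 16000, 512)
print(num_blocks, block_ms)  # 31 32.0
```

With a 512-sample block at 16 kHz, each block spans 32 ms, so a smaller `block_size` gives a finer pitch track at the cost of more blocks to process.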
NOTE
The function uses the CREPE algorithm for pitch extraction. Ensure that the necessary dependencies for CREPE are installed and available in your environment.
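One way to check for the dependency up front is to look for the module without importing it. This sketch assumes CREPE is provided by a module named `crepe`, which is how the PyPI package of that name is imported; `has_module` is a hypothetical helper.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


if not has_module("crepe"):
    print("CREPE not found; install it before calling extract_pitch.")
```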