espnet2.layers.sinc_conv.BarkScale
class espnet2.layers.sinc_conv.BarkScale
Bases: object
Bark frequency scale.
Has wider bandwidths at lower frequencies. See "Critical bandwidth: BARK", Zwicker and Terhardt, 1980.
convert(f)
Convert Hz to Bark.
invert(x)
Convert Bark to Hz.
bank(cls, channels: int, fs: float) -> torch.Tensor
Obtain initialization values for the Bark scale.
- Parameters:
- channels – Number of channels.
- fs – Sample rate.
- Returns: Filter start frequencies and filter stop frequencies, each as a torch.Tensor.
- Return type: torch.Tensor
######### Examples
>>> fs = 16000
>>> channels = 10
>>> bark_scale = BarkScale.bank(channels, fs)
>>> print(bark_scale)
NOTE
The Bark scale is often used in psychoacoustics and audio signal processing to model human perception of sound.
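The initialization described above can be pictured with a minimal pure-Python sketch. It assumes the Zwicker and Terhardt (1980) critical-bandwidth approximation as the warping function, center frequencies spaced evenly on that warped axis, and band edges placed half a bandwidth on either side of each center. The names `hz_to_warped`, `warped_to_hz`, and `sketch_bank`, and the defaults `f_min=70.0` and `f_max_ratio=0.45`, are hypothetical choices for illustration and are not confirmed by this page.

```python
import math


def hz_to_warped(f_hz):
    # Assumed warping: Zwicker & Terhardt (1980) critical-bandwidth approximation.
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69


def warped_to_hz(x):
    # Algebraic inverse of hz_to_warped (meaningful for x >= 100).
    inner = ((x - 25.0) / 75.0) ** (1.0 / 0.69)
    # Clamp guards against tiny negative values caused by rounding.
    return 1000.0 * math.sqrt(max(0.0, (inner - 1.0) / 1.4))


def sketch_bank(channels, fs, f_min=70.0, f_max_ratio=0.45):
    """Return (starts, stops) lists of band edges in Hz (illustrative only)."""
    if channels < 1:
        raise ValueError("channels must be at least 1")
    lo = hz_to_warped(f_min)
    hi = hz_to_warped(fs * f_max_ratio)
    step = (hi - lo) / (channels - 1) if channels > 1 else 0.0
    # Evenly spaced points on the warped axis, mapped back to Hz.
    centers = [warped_to_hz(lo + i * step) for i in range(channels)]
    # Each band spans half a (assumed) critical bandwidth on either side.
    starts = [c - hz_to_warped(c) / 2.0 for c in centers]
    stops = [c + hz_to_warped(c) / 2.0 for c in centers]
    return starts, stops


starts, stops = sketch_bank(channels=10, fs=16000.0)
```

Under these assumptions every start frequency lies below its matching stop frequency, and the band centers increase monotonically with the channel index.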
classmethod bank(channels: int, fs: float) → Tensor
Sinc convolutions.
This module contains implementations of log compression activation, Sinc convolution, and frequency scale conversions (Mel and Bark). Sinc filters function as band passes in the spectral domain, enabling efficient filtering without transforming to the spectral domain.
The Sinc convolution implementation is inspired by Ravanelli et al. and adapted for the ESPnet toolkit. It is recommended to combine Sinc convolutions with a log compression activation function for optimal performance.
Notes: Currently, the same filters are applied to all input channels. The windowing function is applied on the kernel to obtain a smoother filter, differing from traditional ASR approaches.
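To make the band-pass and windowing remarks concrete, here is a generic windowed-sinc band-pass kernel in pure Python. It is a textbook construction (difference of two ideal low-pass sinc filters, multiplied by a Hamming window), not a reproduction of the `SincConv` implementation. The function names `bandpass_kernel` and `gain` and the chosen band edges are hypothetical.

```python
import math


def sinc(x):
    # Normalized sinc: sin(pi x) / (pi x), with the limit value 1 at x = 0.
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def bandpass_kernel(f1_hz, f2_hz, fs, kernel_size):
    """Time-domain band-pass kernel passing roughly f1_hz..f2_hz."""
    if kernel_size < 3 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd and at least 3")
    half = kernel_size // 2
    kernel = []
    for n in range(-half, half + 1):
        # Ideal band-pass = low-pass at f2 minus low-pass at f1.
        ideal = (2.0 * f2_hz / fs) * sinc(2.0 * f2_hz * n / fs) \
            - (2.0 * f1_hz / fs) * sinc(2.0 * f1_hz * n / fs)
        # Hamming window centered on n = 0 smooths the truncated kernel.
        window = 0.54 + 0.46 * math.cos(math.pi * n / half)
        kernel.append(ideal * window)
    return kernel


def gain(kernel, f_hz, fs):
    # Frequency response of a symmetric kernel at f_hz (real-valued).
    half = len(kernel) // 2
    return sum(h * math.cos(2.0 * math.pi * f_hz * (i - half) / fs)
               for i, h in enumerate(kernel))


kernel = bandpass_kernel(2000.0, 4000.0, fs=16000.0, kernel_size=31)
in_band = gain(kernel, 3000.0, 16000.0)   # close to 1 inside the band
at_dc = gain(kernel, 0.0, 16000.0)        # close to 0 outside the band
```

Convolving a signal with `kernel` in the time domain applies the band-pass directly, which is the sense in which no transform to the spectral domain is needed.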
Classes:
- LogCompression: Implements the log compression activation function.
- SincConv: Performs convolution using Sinc filters.
- MelScale: Provides methods for converting between Hz and Mel scale.
- BarkScale: Provides methods for converting between Hz and Bark scale.
Example usage:
>>> # Initialize Sinc convolution layer
>>> sinc_layer = SincConv(in_channels=1, out_channels=10, kernel_size=31)
>>> # Apply Sinc convolution to a tensor
>>> input_tensor = torch.randn(5, 1, 100)  # (B, C_in, D_in)
>>> output_tensor = sinc_layer(input_tensor)  # (B, C_out, D_out)
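The module notes recommend pairing Sinc convolutions with log compression. As a rough illustration, the sketch below assumes the common form log(|x| + 1). This page does not state the exact formula used by `LogCompression`, so treat both the formula and the helper name `log_compression` as assumptions.

```python
import math


def log_compression(values):
    # Assumed form: log(|x| + 1), applied element-wise.
    # log1p keeps precision for inputs close to zero.
    return [math.log1p(abs(v)) for v in values]


compressed = log_compression([-3.0, 0.0, 3.0])
```

With this form, zero maps to zero, the sign of the input is discarded, and large magnitudes are compressed logarithmically.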
static convert(f)
Convert frequencies between Hz and Bark scale.
This class provides methods to convert frequencies from Hertz (Hz) to Bark scale and vice versa. The Bark scale is a psychoacoustic scale that reflects human perception of sound frequencies, with wider bandwidths at lower frequencies.
- Parameters:f (torch.Tensor) – A tensor containing frequencies in Hz.
- Returns: A tensor containing the corresponding values in the Bark scale.
- Return type: torch.Tensor
######### Examples
>>> bark_scale = BarkScale()
>>> bark_freq = bark_scale.convert(torch.tensor([1000.0, 2000.0]))
>>> hz_freq = bark_scale.invert(bark_freq)
>>> assert torch.allclose(hz_freq, torch.tensor([1000.0, 2000.0]))
static invert(x)
Invert the Bark frequency scale.
This method converts values from the Bark scale back to Hertz (Hz).
- Parameters:x (torch.Tensor) – A tensor containing values in the Bark scale.
- Returns: A tensor containing the corresponding frequencies in Hz.
- Return type: torch.Tensor
######### Examples
>>> bark_values = torch.tensor([25.0, 100.0, 200.0])
>>> hz_values = BarkScale.invert(bark_values)
>>> print(hz_values)  # a tensor of the corresponding frequencies in Hz