espnet2.enh.layers.beamformer.signal_framing
espnet2.enh.layers.beamformer.signal_framing(signal: Tensor | ComplexTensor, frame_length: int, frame_step: int, bdelay: int, do_padding: bool = False, pad_value: int = 0, indices: List | None = None) → Tensor | ComplexTensor
Expand signal into several frames, with each frame of length frame_length.
This function divides the input signal into overlapping frames of a specified length and step size, as needed in speech processing and signal analysis when a signal must be segmented for further processing. If padding is enabled, the signal is padded at the beginning of the time dimension to accommodate the specified delay.
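As a point of reference, plain overlapping framing (without the WPD-style delay handled by bdelay) can be produced with torch.Tensor.unfold; this is only an illustration of the framing concept, not the implementation of signal_framing itself.

>>> import torch
>>> x = torch.arange(10.)
>>> x.unfold(0, 4, 2)  # frames of length 4, moving 2 samples at a time
tensor([[0., 1., 2., 3.],
        [2., 3., 4., 5.],
        [4., 5., 6., 7.],
        [6., 7., 8., 9.]])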
- Parameters:
- signal (Union[torch.Tensor, ComplexTensor]) – The input signal to be framed, with shape (…, T), where T is the length of the signal.
- frame_length (int) – The length of each segment (frame) to be extracted from the signal.
- frame_step (int) – The step size for moving the frame across the signal.
- bdelay (int) – Delay used for the WPD (Weighted Power minimization Distortionless response) beamformer.
- do_padding (bool, optional) – Whether or not to pad the input signal at the beginning of the time dimension. Default is False.
- pad_value (int, optional) – The value to fill in the padding if do_padding is True. Default is 0.
- indices (List, optional) – Pre-computed indices for extracting frames. If None, indices are computed from frame_length and frame_step.
- Returns: If do_padding is True, a tensor of shape (…, T, frame_length), where T is the length of the input signal (the left padding ensures one frame per time step when frame_step is 1). If do_padding is False, a tensor of shape (…, T - bdelay - frame_length + 2, frame_length) containing the framed signal segments. Both shapes assume frame_step=1; a larger frame_step reduces the number of frames accordingly.
- Return type: Union[torch.Tensor, ComplexTensor]
Examples
>>> signal = torch.randn(10) # Example signal of length 10
>>> frames = signal_framing(signal, frame_length=4, frame_step=2,
... bdelay=1)
>>> print(frames.shape) # Output: (4, 4) for non-padding case
>>> padded_frames = signal_framing(signal, frame_length=4,
... frame_step=2, bdelay=1,
... do_padding=True)
>>> print(padded_frames.shape) # Output: (5, 4) for padding case
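The non-padded output shape can also be checked against the formula above for frame_step=1; the snippet below is a minimal sketch that assumes signal_framing is imported from espnet2.enh.layers.beamformer and applied to a batched real-valued signal.

>>> import torch
>>> from espnet2.enh.layers.beamformer import signal_framing
>>> batch = torch.randn(2, 16)  # two signals of length T = 16
>>> frames = signal_framing(batch, frame_length=5, frame_step=1, bdelay=3)
>>> frames.shape  # T - bdelay - frame_length + 2 = 16 - 3 - 5 + 2 = 10
torch.Size([2, 10, 5])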