espnet2.layers.time_warp.TimeWarp


class espnet2.layers.time_warp.TimeWarp(window: int = 80, mode: str = 'bicubic')

Bases: Module

Time warp module for temporal interpolation of audio features using PyTorch.

The module offers two entry points. The standalone time_warp function stretches or compresses a feature tensor along its time axis by interpolation, with a configurable interpolation mode. The TimeWarp class wraps that function as a PyTorch module so it can be placed inside a neural network pipeline.

Parameters:
- x (torch.Tensor) – Input tensor of shape (Batch, Time, Freq). This is the argument of time_warp and of forward, not of the constructor.
- window (int, optional) – Time warp parameter that controls the extent of warping. Default is 80.
- mode (str, optional) – Interpolation mode. Default is 'bicubic'.
Returns: The time-warped tensor, with the same shape as the input.
Return type: torch.Tensor

Examples

Using the standalone function

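import torch
from espnet2.layers.time_warp import time_warp

x = torch.randn(2, 100, 64)  # Example input tensor (Batch, Time, Freq)
warped_x = time_warp(x, window=80, mode='bilinear')  # 'bilinear' is the linear mode for 4D input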

Using the TimeWarp module in a model

from espnet2.layers.time_warp import TimeWarp

time_warp_layer = TimeWarp(window=80, mode='bilinear')
output, lengths = time_warp_layer(x)  # x is the tensor from the example above

NOTE

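Bicubic interpolation in PyTorch operates on 4D tensors. The time_warp function therefore accepts input of shape (Batch, Time, Freq) or (Batch, 1, Time, Freq): a 3D input is reshaped to (Batch, 1, Time, Freq) for interpolation, and the result is returned in the original shape.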
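The reshaping described in the note can be illustrated with a small sketch built directly on torch.nn.functional.interpolate. This is not the ESPnet implementation: the function name sketch_time_warp and its center and warped arguments are hypothetical, and the split points are passed in by the caller here, whereas an augmentation layer would normally draw them at random.

```python
import torch
import torch.nn.functional as F


def sketch_time_warp(x, center, warped, mode="bicubic"):
    """Warp the time axis of x so that frame `center` moves to frame `warped`.

    x has shape (Batch, Time, Freq). `center` and `warped` are hypothetical
    arguments chosen by the caller for this illustration.
    """
    batch, time, freq = x.shape
    if not (0 < center < time and 0 < warped < time):
        raise ValueError("center and warped must lie strictly inside the time axis")

    # Bicubic interpolation needs a 4D tensor:
    # (Batch, Time, Freq) -> (Batch, 1, Time, Freq)
    x4 = x[:, None]

    # Resize the frames before `center` to `warped` frames, and the
    # frames from `center` onward to the remaining `time - warped` frames.
    left = F.interpolate(
        x4[:, :, :center], size=(warped, freq), mode=mode, align_corners=False
    )
    right = F.interpolate(
        x4[:, :, center:], size=(time - warped, freq), mode=mode, align_corners=False
    )

    # Join the two parts along time and drop the channel axis again.
    return torch.cat([left, right], dim=2)[:, 0]


x = torch.randn(2, 100, 64)
y = sketch_time_warp(x, center=50, warped=60)
print(y.shape)  # torch.Size([2, 100, 64])
```

Because the two resized segments always add up to the original number of frames, the output keeps the input shape, matching the return value documented above.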

If the time dimension is too short for the specified window (Time - window <= window), no warping is applied and the input is returned unchanged.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

extra_repr()

Returns a string representation of the TimeWarp module’s parameters.

This method is used to provide additional information about the instance of the TimeWarp class when printed. It includes the window size and interpolation mode that are set during initialization.

window

The time warp parameter.

Type: int

mode

The interpolation mode used for warping.

Type: str
Returns: A formatted string containing the window and mode values.
Return type: str

Examples

>>> tw = TimeWarp(window=100, mode='linear')
>>> print(tw.extra_repr())
window=100, mode=linear
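PyTorch also uses extra_repr when a module itself is printed, placing the returned string inside the parentheses after the class name. The sketch below shows this with ToyWarp, a hypothetical stand-in class that only mirrors the two attributes documented above; it is not the ESPnet class.

```python
import torch


class ToyWarp(torch.nn.Module):
    """Hypothetical stand-in used only to illustrate extra_repr."""

    def __init__(self, window=80, mode="bicubic"):
        super().__init__()
        self.window = window
        self.mode = mode

    def extra_repr(self):
        # The string returned here appears inside "ToyWarp(...)" when printed.
        return f"window={self.window}, mode={self.mode}"


layer = ToyWarp(window=100, mode="linear")
print(layer.extra_repr())  # window=100, mode=linear
print(layer)               # ToyWarp(window=100, mode=linear)
```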

forward(x: Tensor, x_lengths: Tensor | None = None)

Forward function for the TimeWarp module.

This method applies time warping to the input tensor using the window size and interpolation mode set at construction. When x_lengths is None or all lengths are equal, the same warping is applied to every sample in the batch. Batches whose sequences differ in length are also supported.

Parameters:
- x (torch.Tensor) – Input tensor of shape (Batch, Time, Freq).
- x_lengths (torch.Tensor | None) – Tensor of shape (Batch,) containing the length of each sequence in the batch. If None, the same warping is applied to each sample.
Returns: A tuple containing
- the warped output tensor of shape (Batch, Time, Freq), and
- the lengths of each sequence in the batch.
Return type: tuple

Examples

Example usage with uniform lengths

import torch
from espnet2.layers.time_warp import TimeWarp

model = TimeWarp(window=80, mode='bicubic')
input_tensor = torch.randn(10, 100, 40)  # (Batch, Time, Freq)
output_tensor, lengths = model(input_tensor)

Example usage with variable lengths

input_tensor_var_len = torch.randn(5, 120, 40)  # (Batch, Time, Freq)
lengths = torch.tensor([100, 90, 80, 70, 60])  # Variable lengths
output_tensor_var_len, lengths_var_len = model(input_tensor_var_len, lengths)
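The distinction between uniform and variable lengths in forward can be pictured with the following sketch. It rests on an assumption this page does not spell out: that sequences of differing lengths are processed one sample at a time over their valid frames and then padded back with zeros. The helper apply_per_length and its fn argument are hypothetical, and fn stands in for the warp so the example stays deterministic.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def apply_per_length(x, x_lengths, fn):
    """Apply fn to a padded batch x of shape (Batch, Time, Freq).

    fn is any callable mapping (B, T, F) -> (B, T, F); it stands in for the warp.
    """
    if x_lengths is None or bool((x_lengths == x_lengths[0]).all()):
        # Uniform lengths: one call, so every sample gets the same treatment.
        return fn(x), x_lengths

    # Differing lengths (assumed strategy): process the valid frames of each
    # sample separately, then pad back to a common length with zeros.
    ys = [fn(x[i : i + 1, : int(x_lengths[i])])[0] for i in range(x.size(0))]
    return pad_sequence(ys, batch_first=True, padding_value=0.0), x_lengths


x = torch.randn(3, 12, 4)
lengths = torch.tensor([12, 9, 5])
y, out_lengths = apply_per_length(x, lengths, fn=lambda t: t * 2.0)
print(y.shape)  # torch.Size([3, 12, 4])
```

Under this strategy, frames beyond each sequence's length never reach fn and come back as zero padding.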