espnet2.enh.layers.complexnn.ComplexConv2d

class espnet2.enh.layers.complexnn.ComplexConv2d(in_channels, out_channels, kernel_size=(1, 1), stride=(1, 1), padding=(0, 0), dilation=1, groups=1, causal=True, complex_axis=1)

Bases: Module

Complex 2D convolution layer for processing complex-valued inputs.

The input packs the real and imaginary parts into a single tensor, concatenated along complex_axis (the channel axis by default). The layer holds two real-valued 2D convolutions, one acting as the real part of the kernel and one as the imaginary part, and combines their outputs into the real and imaginary parts of the result.
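The combination step can be illustrated without PyTorch. The sketch below is a framework-free 1D analogue, not ESPnet's code: it assumes the layer follows the usual complex multiplication rule, (a + ib)(c + id) = (ac - bd) + i(ad + bc), so that each output part mixes both kernels. The helper names `conv1d_valid` and `complex_conv1d` are hypothetical.

```python
def conv1d_valid(signal, kernel):
    """Real-valued 1D cross-correlation with no padding (the 'valid' region)."""
    n_out = len(signal) - len(kernel) + 1
    return [
        sum(signal[t + k] * kernel[k] for k in range(len(kernel)))
        for t in range(n_out)
    ]


def complex_conv1d(x_real, x_imag, w_real, w_imag):
    """Complex convolution built from four real convolutions (assumed rule)."""
    real2real = conv1d_valid(x_real, w_real)
    imag2imag = conv1d_valid(x_imag, w_imag)
    real2imag = conv1d_valid(x_real, w_imag)
    imag2real = conv1d_valid(x_imag, w_real)
    out_real = [a - b for a, b in zip(real2real, imag2imag)]  # ac - bd
    out_imag = [a + b for a, b in zip(real2imag, imag2real)]  # ad + bc
    return out_real, out_imag


# Toy data: a 4-sample complex signal and a 2-tap complex kernel.
x_real, x_imag = [1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 0.0, 2.0]
w_real, w_imag = [1.0, -1.0], [0.5, 2.0]

out_real, out_imag = complex_conv1d(x_real, x_imag, w_real, w_imag)

# Reference: the same operation on Python's built-in complex numbers.
x = [complex(r, i) for r, i in zip(x_real, x_imag)]
w = [complex(r, i) for r, i in zip(w_real, w_imag)]
reference = [
    sum(x[t + k] * w[k] for k in range(len(w)))
    for t in range(len(x) - len(w) + 1)
]
```

The four-convolution result matches the built-in complex arithmetic, which is why two real-valued kernels are enough to represent one complex kernel.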

Attributes:
- in_channels (int) – Number of input channels (real + imag).
- out_channels (int) – Number of output channels (real + imag).
- kernel_size (tuple) – Size of the convolution kernel.
- stride (tuple) – Stride of the convolution.
- padding (tuple) – Padding applied to the input.
- causal (bool) – If True, applies causal padding on the time dimension.
- groups (int) – Number of blocked connections from input channels to output channels.
- dilation (int) – Spacing between kernel elements.
- complex_axis (int) – Axis along which the real and imaginary parts are concatenated.
Parameters:
- in_channels (int) – Number of input channels (real + imag).
- out_channels (int) – Number of output channels (real + imag).
- kernel_size (tuple , optional) – Size of the convolution kernel (default: (1, 1)).
- stride (tuple , optional) – Stride of the convolution (default: (1, 1)).
- padding (tuple , optional) – Padding applied to the input (default: (0, 0)).
- dilation (int , optional) – Spacing between kernel elements (default: 1).
- groups (int , optional) – Number of blocked connections from input channels to output channels (default: 1).
- causal (bool , optional) – If True, applies causal padding on the time dimension (default: True).
- complex_axis (int , optional) – Axis along which the real and imaginary parts are concatenated (default: 1).
Returns: Output tensor containing concatenated real and imaginary parts after convolution.
Return type: torch.Tensor
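Because the channel counts include both parts, in_channels=4 describes two complex channels. The sketch below shows one way to picture that packing with plain Python lists. It assumes the real half comes first and the imaginary half second along the complex axis; the helper names `pack_complex` and `unpack_complex` are hypothetical and not part of ESPnet.

```python
def pack_complex(values):
    """Store complex channels as [real..., imag...] (assumed real-first order)."""
    return [v.real for v in values] + [v.imag for v in values]


def unpack_complex(channels):
    """Split a packed channel list back into complex values."""
    if len(channels) % 2 != 0:
        raise ValueError("packed channel count must be even (real + imag)")
    half = len(channels) // 2
    return [complex(r, i) for r, i in zip(channels[:half], channels[half:])]


# Two complex channels occupy four packed channels.
packed = pack_complex([1 + 2j, 3 - 4j])
restored = unpack_complex(packed)
```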

####### Examples

>>> import torch
>>> from espnet2.enh.layers.complexnn import ComplexConv2d
>>> conv = ComplexConv2d(in_channels=4, out_channels=8, kernel_size=(3, 3))
>>> input_tensor = torch.randn(1, 4, 10, 10)  # Shape: [B, C, D, T]
>>> output = conv(input_tensor)
>>> output.shape  # default padding=(0, 0), so the 3x3 kernel trims D and T from 10 to 8
torch.Size([1, 8, 8, 8])

NOTE

The input tensor is expected to have a shape of [B, C, D, T], where B is the batch size, C is the number of channels (real + imag), D is the height, and T is the time dimension (the width of the input).
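The output shape for a given input can be estimated with the standard convolution size formula. The sketch below is an approximation under stated assumptions, not ESPnet's code: it assumes the D axis is padded by padding[0] on both sides, and that the T axis is padded by padding[1] on the left only when causal is True and on both sides otherwise. The function names are hypothetical.

```python
def conv_out_size(size, kernel, stride=1, pad_total=0, dilation=1):
    """Standard convolution output length; pad_total counts both sides together."""
    return (size + pad_total - dilation * (kernel - 1) - 1) // stride + 1


def complex_conv2d_out_shape(in_shape, out_channels, kernel_size=(1, 1),
                             stride=(1, 1), padding=(0, 0), dilation=1,
                             causal=True):
    """Predict (B, C_out, D_out, T_out) under the assumptions stated above."""
    batch, _, d, t = in_shape
    # Assumption: the D axis is padded symmetrically by padding[0].
    d_out = conv_out_size(d, kernel_size[0], stride[0], 2 * padding[0], dilation)
    # Assumption: causal mode pads the time axis on the left only.
    t_pad = padding[1] if causal else 2 * padding[1]
    t_out = conv_out_size(t, kernel_size[1], stride[1], t_pad, dilation)
    return (batch, out_channels, d_out, t_out)


# Default padding: a 3x3 kernel shrinks a 10x10 input to 8x8.
shape = complex_conv2d_out_shape((1, 4, 10, 10), out_channels=8, kernel_size=(3, 3))

# Padding chosen to preserve the size in causal mode (2 frames on the left of T).
same_shape = complex_conv2d_out_shape(
    (1, 4, 10, 10), out_channels=8, kernel_size=(3, 3), padding=(1, 2)
)
```

Under these assumptions, a kernel of size k along T needs padding[1] = k - 1 in causal mode to keep the number of frames unchanged.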

ComplexConv2d.

- in_channels: real + imag
- out_channels: real + imag
- kernel_size: for input [B, C, D, T], the kernel size in [D, T]
- padding: for input [B, C, D, T], the padding in [D, T]
- causal: if True, pads only the left side of the time dimension; otherwise pads both sides

forward(inputs)

Forward pass for the ComplexConv2d layer.

This method pads the time dimension, applies the layer's real and imaginary convolutions to the input, and combines the results into one output tensor with the real and imaginary parts concatenated along complex_axis.

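Parameters: inputs (torch.Tensor) – Input tensor of shape [B, C, D, T], where B is the batch size, C is the number of channels (real + imag), D is the height, and T is the time dimension.
Returns: Output tensor with the real and imaginary parts concatenated; for the default complex_axis=1 its shape is [B, out_channels, D_out, T_out]. D_out and T_out equal D and T only when the padding compensates for the kernel size. With the default padding of (0, 0), a (3, 3) kernel reduces each of them by 2.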
Return type: torch.Tensor

NOTE

If self.padding[1] is not zero and self.causal is True, the input is padded on the left side of the time dimension. Otherwise, it is padded symmetrically.
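The effect of the two padding modes can be shown on a single time axis. The sketch below is a plain-Python illustration rather than the layer's implementation; `pad_time` and `conv1d_valid` are hypothetical helpers. It checks that with left-only padding, changing the last frame leaves all earlier outputs untouched, while symmetric padding lets a future frame leak into an earlier output.

```python
def pad_time(frames, pad, causal, fill=0.0):
    """Zero-pad a 1D time sequence: left only if causal, else both sides."""
    left = [fill] * pad
    right = [] if causal else [fill] * pad
    return left + list(frames) + right


def conv1d_valid(signal, kernel):
    """Real-valued 1D cross-correlation with no further padding."""
    n_out = len(signal) - len(kernel) + 1
    return [
        sum(signal[t + k] * kernel[k] for k in range(len(kernel)))
        for t in range(n_out)
    ]


kernel = [1.0, 1.0, 1.0]
frames = [1.0, 2.0, 3.0, 4.0]
changed = [1.0, 2.0, 3.0, 100.0]  # only the last (future) frame differs

# Causal: pad 2 frames on the left so each output sees only current and past frames.
causal_out = conv1d_valid(pad_time(frames, 2, causal=True), kernel)
causal_changed = conv1d_valid(pad_time(changed, 2, causal=True), kernel)

# Symmetric: pad 1 frame on each side, so each output also sees one future frame.
symmetric_out = conv1d_valid(pad_time(frames, 1, causal=False), kernel)
symmetric_changed = conv1d_valid(pad_time(changed, 1, causal=False), kernel)
```

Both settings keep four output frames here because the pad widths were chosen to match the 3-tap kernel. In PyTorch, left-only padding of the last axis can be written as torch.nn.functional.pad(x, [pad, 0, 0, 0]).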

####### Examples

>>> import torch
>>> from espnet2.enh.layers.complexnn import ComplexConv2d
>>> conv = ComplexConv2d(in_channels=4, out_channels=8, kernel_size=(3, 3))
>>> input_tensor = torch.randn(1, 4, 10, 10)  # Example input
>>> output = conv(input_tensor)
>>> output.shape  # default padding=(0, 0), so D and T each shrink by kernel_size - 1
torch.Size([1, 8, 8, 8])