espnet2.diar.layers.tcn_nomask.DepthwiseSeparableConv
class espnet2.diar.layers.tcn_nomask.DepthwiseSeparableConv(in_channels, out_channels, kernel_size, stride, padding, dilation, norm_type='gLN', causal=False)
Bases: Module
Depthwise Separable Convolution Layer.
This class implements a depthwise separable convolution, which consists of a depthwise convolution followed by a pointwise (1x1) convolution. The depthwise convolution applies a single filter per input channel, while the pointwise convolution mixes the outputs of the depthwise layer across channels.
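The decomposition can be sketched with two plain nn.Conv1d layers (a minimal illustration of the idea, not the exact internal layout of this module):
>>> import torch
>>> import torch.nn as nn
>>> # Depthwise: one filter per input channel (groups == in_channels)
>>> depthwise = nn.Conv1d(64, 64, kernel_size=3, padding=1, groups=64)
>>> # Pointwise: 1x1 convolution mixes information across channels
>>> pointwise = nn.Conv1d(64, 128, kernel_size=1)
>>> x = torch.randn(8, 64, 100)
>>> pointwise(depthwise(x)).shape
torch.Size([8, 128, 100])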
- Parameters:
- in_channels (int) – Number of input channels.
- out_channels (int) – Number of output channels.
- kernel_size (int) – Size of the convolutional kernel.
- stride (int) – Stride of the convolution.
- padding (int) – Padding added to both sides of the input.
- dilation (int) – Dilation factor for the convolution.
- norm_type (str) – Type of normalization to use (‘gLN’, ‘cLN’, ‘BN’).
- causal (bool) – If True, applies causal convolution (output at time t does not depend on future time steps).
net
Sequential container of depthwise and pointwise convolutions along with normalization and activation.
Type: nn.Sequential
Returns: Output tensor after applying the depthwise separable convolution.
Return type: result (Tensor)
Examples
>>> model = DepthwiseSeparableConv(in_channels=64, out_channels=128,
... kernel_size=3, stride=1, padding=1,
... dilation=1, norm_type='gLN', causal=False)
>>> input_tensor = torch.randn(32, 64, 100) # Batch of 32, 64 channels, 100 length
>>> output_tensor = model(input_tensor)
>>> output_tensor.shape
torch.Size([32, 128, 100]) # Output will have 128 channels
NOTE
The depthwise convolution is performed with groups set to in_channels to achieve depthwise separability. If causal is set to True, a Chomp layer is used to ensure the output length matches the input length.
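For the causal case, a standard recipe is to pad the depthwise convolution by (kernel_size - 1) * dilation and then trim the same number of trailing frames, so that output frame t depends only on inputs up to t. The Chomp1d below is a simplified stand-in for the module's chomp layer, shown for illustration only:
>>> import torch
>>> import torch.nn as nn
>>> class Chomp1d(nn.Module):
...     def __init__(self, chomp_size):
...         super().__init__()
...         self.chomp_size = chomp_size
...     def forward(self, x):
...         # Drop the trailing frames introduced by the padding
...         return x[:, :, :-self.chomp_size].contiguous()
>>> kernel_size, dilation = 3, 2
>>> pad = (kernel_size - 1) * dilation  # left context needed for causality
>>> conv = nn.Conv1d(64, 64, kernel_size, padding=pad,
...                  dilation=dilation, groups=64)
>>> x = torch.randn(8, 64, 100)
>>> Chomp1d(pad)(conv(x)).shape  # input length is preserved
torch.Size([8, 64, 100])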
- Raises: ValueError – If an unsupported normalization type is specified.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
forward(x)
Forward pass for the DepthwiseSeparableConv layer.
This method passes the input tensor x through the internal sequential net of depthwise and pointwise convolutions (with normalization and activation). The expected input shape is [M, N, K], where M is the batch size, N is the number of input channels (in_channels), and K is the sequence length. The output will be of shape [M, B, K], where B is the number of output channels (out_channels).
- Parameters: x – A tensor of shape [M, N, K], representing the input features, where M is the batch size, N is the number of input channels, and K is the length of the sequence.
- Returns: A tensor of shape [M, B, K], representing the output of the depthwise separable convolution.
- Return type: result (Tensor)
Examples
>>> model = DepthwiseSeparableConv(in_channels=64, out_channels=32,
...                                kernel_size=3, stride=1, padding=1,
...                                dilation=1, norm_type='gLN', causal=False)
>>> x = torch.randn(10, 64, 100) # Batch of 10, 64 channels, length 100
>>> output = model(x)
>>> print(output.shape)
torch.Size([10, 32, 100]) # Output shape matches [M, B, K]
NOTE
Ensure that the input tensor x is correctly shaped according to the specifications above to avoid dimension mismatch errors.