espnet2.spk.encoder.ska_tdnn_encoder.Bottle2neck
class espnet2.spk.encoder.ska_tdnn_encoder.Bottle2neck(inplanes, planes, kernel_size=None, kernel_sizes=[5, 7], dilation=None, scale=8, group=1)
Bases: Module
Bottle2neck module for SKA-TDNN architecture.
This module implements a bottleneck layer with selective kernel attention, which lets the network adaptively weight features extracted by convolutions with different kernel sizes. It also uses a squeeze-and-excitation mechanism to enhance the representational power of the network.
- Parameters:
- inplanes (int) – Number of input channels.
- planes (int) – Number of output channels.
- kernel_size (int, optional) – Size of the convolution kernel. Defaults to None.
- kernel_sizes (list of int, optional) – List of kernel sizes for the selective kernel convolution. Defaults to [5, 7].
- dilation (int, optional) – Dilation rate for the convolution. Defaults to None.
- scale (int, optional) – Scaling factor for the width of the bottleneck. Defaults to 8.
- group (int, optional) – Number of groups for grouped convolution. Defaults to 1.
conv1
First convolutional layer.
- Type: nn.Conv1d
relu
ReLU activation function.
- Type: nn.ReLU
bn1
Batch normalization layer.
- Type: nn.BatchNorm1d
nums
Number of selective kernel convolutions.
- Type: int
skconvs
List of selective kernel convolution modules.
- Type: nn.ModuleList
skse
Selective kernel attention module.
- Type: SKAttentionModule
conv3
Second convolutional layer.
- Type: nn.Conv1d
bn3
Batch normalization layer.
- Type: nn.BatchNorm1d
se
Squeeze-and-excitation module.
- Type: SEModule
width
Width of the bottleneck.
- Type: int
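The relationship between `scale`, `width`, and `nums` can be sketched as follows. This is a minimal illustration that assumes the common Res2Net-style convention (the output channels are divided into `scale` groups of `width` channels each, and all but one group get their own convolution branch); the page above does not spell this out, so treat the helper `bottleneck_split` as hypothetical rather than as the module's actual code.

```python
import math
from typing import Tuple


def bottleneck_split(planes: int, scale: int = 8) -> Tuple[int, int]:
    """Return (width, nums) under the Res2Net-style convention assumed here."""
    # Assumption: each of the `scale` channel groups holds floor(planes / scale) channels.
    width = math.floor(planes / scale)
    # Assumption: one group is passed through unchanged, the rest get a conv branch.
    nums = scale - 1
    return width, nums


width, nums = bottleneck_split(planes=128, scale=8)
print(width, nums)  # prints: 16 7
```

Under this assumption, `planes` should be divisible by `scale` so that no channels are lost when the groups are concatenated again.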
- Returns: Output tensor after applying the bottleneck operation.
- Return type: torch.Tensor
Example
>>> import torch
>>> from espnet2.spk.encoder.ska_tdnn_encoder import Bottle2neck
>>> model = Bottle2neck(inplanes=64, planes=128)
>>> x = torch.randn(32, 64, 100) # (batch_size, channels, sequence_length)
>>> output = model(x)
>>> output.shape
torch.Size([32, 128, 100])
NOTE
This module is designed to work within the SKA-TDNN architecture and expects inputs of the shape (batch_size, inplanes, sequence_length).
- Raises: ValueError – If kernel_size is provided but not valid.
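The idea behind selective kernel attention can be illustrated with a small, dependency-free sketch: each kernel size produces its own branch output, a softmax over per-branch scores yields weights that sum to one, and the result is the weighted sum of the branches. The functions `softmax` and `select_kernels` and the numeric values below are hypothetical and only illustrate the general principle; the actual `SKAttentionModule` may compute its scores and apply its weights differently.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of branch scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_kernels(branch_outputs: List[float], branch_scores: List[float]) -> float:
    """Blend the outputs of several kernel branches with softmax weights."""
    weights = softmax(branch_scores)
    return sum(w * y for w, y in zip(weights, branch_outputs))


# Hypothetical values for one channel at one time step:
# two branches (e.g. kernel sizes 5 and 7) with equal scores are averaged.
equal = select_kernels(branch_outputs=[0.2, 0.8], branch_scores=[0.0, 0.0])

# A much higher score for the second branch makes its output dominate.
skewed = select_kernels(branch_outputs=[0.2, 0.8], branch_scores=[0.0, 10.0])
```

Because the weights are learned from the input, the effective receptive field can vary from one input to the next instead of being fixed by a single kernel size.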
forward(x)
Computes the forward pass of the Bottle2neck module.
This method processes the input tensor x through a series of convolutional layers, applies a skip connection, and returns the final output tensor. The forward operation consists of several stages including initial convolution, ReLU activation, batch normalization, and a series of attention mechanisms.
- Parameters: x (torch.Tensor) – The input tensor of shape (B, C, T), where B is the batch size, C is the number of input channels (inplanes), and T is the sequence length.
- Returns: The output tensor of shape (B, planes, T) after applying the series of transformations.
- Return type: torch.Tensor
Example
>>> model = Bottle2neck(inplanes=64, planes=128)
>>> input_tensor = torch.randn(32, 64, 100) # Batch size of 32
>>> output_tensor = model(input_tensor)
>>> print(output_tensor.shape)
torch.Size([32, 128, 100])
NOTE
This method relies on the internal layers defined in the Bottle2neck class and the proper initialization of those layers in the constructor.
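The documented output shape keeps the sequence length T unchanged. A small sketch shows one way this can hold for odd kernel sizes such as 5 and 7, using the output-length formula documented for `torch.nn.Conv1d`. The padding rule in `same_padding` is an assumption made for illustration and is not taken from this page.

```python
def conv1d_out_len(t_in: int, kernel_size: int, dilation: int = 1,
                   padding: int = 0, stride: int = 1) -> int:
    """Output length of a 1-D convolution (formula from the nn.Conv1d docs)."""
    return (t_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def same_padding(kernel_size: int, dilation: int = 1) -> int:
    """Assumed padding rule that preserves length for odd kernel sizes."""
    return (kernel_size // 2) * dilation


# With this padding, every (kernel_size, dilation) pair keeps T = 100.
lengths = [
    conv1d_out_len(100, k, dilation=d, padding=same_padding(k, d))
    for k in (5, 7)
    for d in (1, 2, 3)
]
```

Without padding, the same convolutions would shorten the sequence, for example from 100 to 96 frames for a kernel size of 5 with dilation 1.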