espnet2.asr.encoder.avhubert_encoder.downsample_basic_block_v2

espnet2.asr.encoder.avhubert_encoder.downsample_basic_block_v2(inplanes, outplanes, stride)

Construct a downsample block for use in a neural network.

This function returns an nn.Sequential block consisting of an average pooling layer, a 1x1 convolutional layer, and batch normalization, applied in that order. The pooling layer reduces the spatial dimensions of the input feature maps by a factor of stride, and the 1x1 convolution maps the channel count from inplanes to outplanes.

Parameters:
- inplanes (int) – Number of input channels.
- outplanes (int) – Number of output channels.
- stride (int) – Kernel size and stride of the average pooling layer, which determines the downsampling factor.
Returns: A sequential block containing the average pooling layer, convolutional layer, and batch normalization layer.
Return type: nn.Sequential

Examples

>>> from espnet2.asr.encoder.avhubert_encoder import downsample_basic_block_v2
>>> downsample_block = downsample_basic_block_v2(64, 128, stride=2)
>>> print(downsample_block)
Sequential(
  (0): AvgPool2d(kernel_size=2, stride=2, padding=0)
  (1): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
  (2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
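
The shape arithmetic behind the block can be sketched in plain Python, without PyTorch. The helper names below (downsample_output_shape, learnable_parameter_count) are hypothetical and not part of ESPnet. The sketch assumes an (N, C, H, W) input, a pooling window equal to stride with no padding, a bias-free 1x1 convolution as shown in the printed module, and batch normalization with its default affine scale and shift. This page does not state which rounding mode the pooling layer uses, so ceil_mode is exposed as a parameter and defaults to PyTorch's floor behavior.

```python
def downsample_output_shape(input_shape, outplanes, stride, ceil_mode=False):
    """Predict the (N, C, H, W) shape after pool -> 1x1 conv -> batch norm."""
    n, _, h, w = input_shape

    def pooled(size):
        # Pooling window and step are both `stride`, with no padding.
        if ceil_mode:
            return -(-size // stride)  # integer ceiling division
        return size // stride  # integer floor division

    # The 1x1 convolution changes only the channel count;
    # batch normalization leaves the shape unchanged.
    return (n, outplanes, pooled(h), pooled(w))


def learnable_parameter_count(inplanes, outplanes):
    """Count learnable parameters (assumes bias-free conv and affine batch norm)."""
    conv_weights = inplanes * outplanes  # one weight per (in, out) channel pair
    bn_params = 2 * outplanes  # per-channel scale and shift
    return conv_weights + bn_params


print(downsample_output_shape((8, 64, 56, 56), outplanes=128, stride=2))
# (8, 128, 28, 28)
print(learnable_parameter_count(64, 128))
# 8448
```

For odd spatial sizes the rounding mode matters: a 7x7 input with stride=2 becomes 3x3 under floor rounding and 4x4 under ceiling rounding.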