espnet2.legacy.nets.pytorch_backend.transformer.encoder_layer.EncoderLayer
class espnet2.legacy.nets.pytorch_backend.transformer.encoder_layer.EncoderLayer(size, self_attn, feed_forward, dropout_rate, normalize_before=True, concat_after=False, stochastic_depth_rate=0.0)
Bases: Module
Encoder layer module.
- Parameters:
- size (int) – Input dimension.
- self_attn (torch.nn.Module) – Self-attention module instance. A MultiHeadedAttention or RelPositionMultiHeadedAttention instance can be used as the argument.
- feed_forward (torch.nn.Module) – Feed-forward module instance. A PositionwiseFeedForward, MultiLayeredConv1d, or Conv1dLinear instance can be used as the argument.
- dropout_rate (float) – Dropout rate.
- normalize_before (bool) – Whether to apply layer normalization before the first block.
- concat_after (bool) – Whether to concatenate the attention layer's input and output. If True, an additional linear projection is applied, i.e. x -> x + linear(concat(x, att(x))); if False, no additional linear is applied, i.e. x -> x + att(x).
- stochastic_depth_rate (float) – Probability of skipping this layer. During training, the layer may skip the residual computation and return the input as-is with the given probability.
Construct an EncoderLayer object.
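A minimal construction sketch, assuming the companion attention and feed-forward modules live under the same espnet2.legacy namespace as this page; the import paths and the hyperparameters shown (4 heads, 1024 feed-forward units) are illustrative and may differ in your ESPnet version:

```python
import torch

from espnet2.legacy.nets.pytorch_backend.transformer.attention import (
    MultiHeadedAttention,
)
from espnet2.legacy.nets.pytorch_backend.transformer.encoder_layer import (
    EncoderLayer,
)
from espnet2.legacy.nets.pytorch_backend.transformer.positionwise_feed_forward import (
    PositionwiseFeedForward,
)

size = 256  # model (input/output) dimension of the layer

layer = EncoderLayer(
    size=size,
    # arbitrary example hyperparameters: 4 heads, 1024 hidden units
    self_attn=MultiHeadedAttention(n_head=4, n_feat=size, dropout_rate=0.1),
    feed_forward=PositionwiseFeedForward(size, 1024, 0.1),
    dropout_rate=0.1,
    normalize_before=True,   # pre-LN: LayerNorm before each sub-block
    concat_after=False,      # plain residual: x -> x + att(x)
    stochastic_depth_rate=0.0,
)
```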
forward(x, mask, cache=None)
Compute encoded features.
- Parameters:
- x (torch.Tensor) – Input tensor (#batch, time, size).
- mask (torch.Tensor) – Mask tensor for the input (#batch, 1, time).
- cache (torch.Tensor) – Cache tensor of the input (#batch, time - 1, size).
- Returns: Output tensor (#batch, time, size) and mask tensor (#batch, 1, time).
- Return type: Tuple[torch.Tensor, torch.Tensor]
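A usage sketch for forward(), continuing from the construction example above; the batch size and sequence length are arbitrary, and shapes follow the docstring:

```python
x = torch.randn(2, 50, size)                   # (#batch, time, size)
mask = torch.ones(2, 1, 50, dtype=torch.bool)  # (#batch, 1, time); False marks padding

layer.eval()  # disable dropout and stochastic depth for a deterministic pass
out, out_mask = layer(x, mask)  # full-sequence pass (no cache)
print(out.shape)       # torch.Size([2, 50, 256])
print(out_mask.shape)  # torch.Size([2, 1, 50])

# With a cache holding the first time - 1 frames, only the last frame is
# recomputed and the cached frames are re-attached to the output.
cache = out[:, :-1, :]  # (#batch, time - 1, size)
out_cached, _ = layer(x, mask, cache=cache)
print(out_cached.shape)  # torch.Size([2, 50, 256])
```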
