espnet2.asr_transducer.decoder.modules.mega.positional_bias.RelativePositionBias
class espnet2.asr_transducer.decoder.modules.mega.positional_bias.RelativePositionBias(max_positions: int)
Bases: Module
Positional bias related modules.
Based/modified from https://github.com/facebookresearch/mega/blob/main/fairseq/modules/relative_positional_bias.py
This module defines classes for implementing relative position bias in neural networks. Relative position bias helps models leverage positional information when processing sequences, particularly in tasks like natural language processing and speech recognition.
Classes:
- RelativePositionBias: Computes relative position bias for sequences.
- RotaryRelativePositionBias: Computes rotary positional embeddings for sequences.
- Parameters:max_positions – Maximum number of relative positions.
Construct a RelativePositionBias object.
forward(length: int) → Tensor
Compute relative position bias.
This method builds the relative position bias for the given sequence length from the module's learned relative_position_bias parameter. The resulting bias matrix can be used in attention mechanisms to incorporate relative position information.
- Parameters:length – Sequence length. This must not exceed the maximum number of relative positions specified when the module was initialized.
- Returns: Relative position bias of shape (L, L), where L is the provided sequence length.
- Return type: Tensor
- Raises:ValueError – If the provided length exceeds the maximum number of positions supported by the module.
####### Examples
>>> from espnet2.asr_transducer.decoder.modules.mega.positional_bias import RelativePositionBias
>>> rp_bias = RelativePositionBias(max_positions=2048)
>>> bias = rp_bias(length=10)
>>> print(bias.shape)
torch.Size([10, 10])
NOTE
Sine and cosine embeddings combined with alpha and beta parameters belong to RotaryRelativePositionBias. RelativePositionBias instead takes its bias values from the learned relative_position_bias parameter, which reset_parameters initializes.
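To make the (L, L) output shape concrete, the following is a minimal sketch of how a bias matrix can be indexed by relative offset. It assumes one learned value per offset, stored in a vector of 2 * max_positions - 1 entries. The helper name toeplitz_bias and the toy weights are hypothetical and are not part of ESPnet; the library's own forward may arrange the computation differently.

```python
import torch


def toeplitz_bias(weights: torch.Tensor, max_positions: int, length: int) -> torch.Tensor:
    """Build an (L, L) bias whose entry (i, j) depends only on the offset j - i.

    Hypothetical helper for illustration only; not the ESPnet implementation.
    Assumes `weights` has 2 * max_positions - 1 entries, one per offset.
    """
    if length > max_positions:
        raise ValueError(
            f"Length {length} exceeds the maximum of {max_positions} positions."
        )
    positions = torch.arange(length)
    # offsets[i, j] = j - i, covering the range [-(L - 1), L - 1].
    offsets = positions.unsqueeze(0) - positions.unsqueeze(1)
    # Shift so that offset 0 maps to the centre of the weight vector.
    return weights[offsets + (max_positions - 1)]


max_positions = 8
# Toy weights 0..14 so that each entry of the result shows which index was read.
weights = torch.arange(2 * max_positions - 1, dtype=torch.float32)
bias = toeplitz_bias(weights, max_positions, length=3)
print(bias.shape)  # torch.Size([3, 3])
```

Every diagonal of the result holds a single value, so positions that are the same distance apart share the same bias regardless of where they sit in the sequence.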
reset_parameters(val: float = 0.0, std: float = 0.02) → None
Reset module parameters.
This method initializes the parameters of the RelativePositionBias module using a normal distribution with the specified mean and standard deviation. It is typically called during the initialization of the module to ensure that the parameters are set to a reasonable starting point.
- Parameters:
- val – Initialization value (mean of the normal distribution).
- std – Standard deviation of the normal distribution.
####### Examples
>>> rp_bias = RelativePositionBias(max_positions=10)
>>> rp_bias.reset_parameters(val=0.5, std=0.1)
>>> rp_bias.relative_position_bias
tensor([...]) # Normal distribution values around 0.5 with std 0.1
NOTE
This method can be called multiple times to reinitialize the parameters if needed.
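The initialization described above can be reproduced outside the module as a rough sketch. It assumes the bias is stored as a single parameter vector with 2 * max_positions - 1 entries and is filled with torch.nn.init.normal_; the variable name param is hypothetical, and the snippet stands in for the module rather than reproducing its source.

```python
import torch

max_positions = 10

# Assumed layout: one value per relative offset in
# [-(max_positions - 1), max_positions - 1].
param = torch.nn.Parameter(torch.empty(2 * max_positions - 1))

# Fix the seed so repeated runs draw the same values.
torch.manual_seed(0)

# Fill the parameter in place from N(mean=0.5, std=0.1).
torch.nn.init.normal_(param, mean=0.5, std=0.1)

print(param.shape)  # torch.Size([19])
```

Because the fill happens in place, calling the initializer again simply overwrites the previous values, which is why reinitializing a module this way is safe.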