espnet2.asr.encoder.beats_encoder.init_bert_params
espnet2.asr.encoder.beats_encoder.init_bert_params(module)
Initialize the weights specific to the BERT model.
This function overrides the default PyTorch weight initialization for several layer types, including linear, embedding, and multi-head attention layers. Weights are drawn from a normal distribution with mean 0.0 and standard deviation 0.02.
- Parameters:
  - module (nn.Module) – The PyTorch module (e.g., Linear, Embedding, MultiheadAttention) whose weights are to be initialized.
Notes
- For linear layers, weights are initialized with a normal distribution, and biases are set to zero.
- For embedding layers, weights are also initialized with a normal distribution, and padding indices (if any) are set to zero.
- For multi-head attention layers, the weights for query, key, and value projections are initialized using the same normal distribution.
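The notes above can be condensed into a minimal sketch of the initialization logic. This is an illustrative reimplementation, not the ESPnet source: the function name `init_bert_params_sketch` is hypothetical, and the real function additionally special-cases the BEATs `MultiheadAttention` class (re-initializing its `q_proj`, `k_proj`, and `v_proj` weights the same way).

```python
import torch.nn as nn


def init_bert_params_sketch(module):
    """Hedged sketch: BERT-style init, N(0, 0.02) weights, zeroed biases."""
    if isinstance(module, nn.Linear):
        # Linear layers: normal weights, zero biases.
        module.weight.data.normal_(mean=0.0, std=0.02)
        if module.bias is not None:
            module.bias.data.zero_()
    elif isinstance(module, nn.Embedding):
        # Embedding layers: normal weights, zero the padding row if present.
        module.weight.data.normal_(mean=0.0, std=0.02)
        if module.padding_idx is not None:
            module.weight.data[module.padding_idx].zero_()
    # The real init_bert_params also handles the BEATs MultiheadAttention,
    # applying the same normal init to its q/k/v projection weights.


layer = nn.Linear(128, 64)
init_bert_params_sketch(layer)
print(abs(layer.weight.data.mean().item()) < 0.01)  # mean near 0
print(bool((layer.bias.data == 0).all()))  # biases zeroed
```

A typical usage pattern is `model.apply(init_bert_params)`, which walks every submodule and lets the function decide per-module whether to re-initialize it.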
Examples
>>> import torch.nn as nn
>>> linear_layer = nn.Linear(10, 5)
>>> init_bert_params(linear_layer)
>>> assert abs(linear_layer.weight.data.mean()) < 0.05  # mean should be near 0
>>> assert (linear_layer.bias.data == 0).all()  # biases are zeroed
>>> embedding_layer = nn.Embedding(10, 5, padding_idx=0)
>>> init_bert_params(embedding_layer)
>>> assert embedding_layer.weight.data[0].sum() == 0.0  # padding row is zeroed
>>> # MultiheadAttention here refers to the BEATs encoder's attention class,
>>> # whose q/k/v projection weights are re-initialized the same way.
>>> attention_layer = MultiheadAttention(embed_dim=5, num_heads=2)
>>> init_bert_params(attention_layer)
>>> assert abs(attention_layer.q_proj.weight.data.mean()) < 0.05  # mean should be near 0