espnet2.enh.layers.swin_transformer.BasicLayer
class espnet2.enh.layers.swin_transformer.BasicLayer(dim, input_resolution, depth, num_heads, window_size, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, use_checkpoint=False)
Bases: Module
A basic Swin Transformer layer for one stage.
- Parameters:
- dim (int) – Number of input channels.
- input_resolution (tuple[int]) – Input resolution.
- depth (int) – Number of blocks.
- num_heads (int) – Number of attention heads.
- window_size (int) – Local window size.
- mlp_ratio (float) – Ratio of MLP hidden dim to embedding dim.
- qkv_bias (bool, optional) – If True, add a learnable bias to query, key, value.
- qk_scale (float | None, optional) – If not None, override the default qk scale.
- drop (float, optional) – Dropout rate.
- attn_drop (float, optional) – Attention dropout rate.
- drop_path (float | tuple[float], optional) – Stochastic depth rate.
- norm_layer (nn.Module, optional) – Normalization layer.
- use_checkpoint (bool) – Whether to use checkpointing to save memory.
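Because `drop_path` accepts either a single float or a per-block sequence, a common pattern (not mandated by this class, but typical in Swin-style models) is to pass linearly spaced stochastic-depth rates, one per block. A minimal sketch, assuming a maximum rate of 0.1 chosen for illustration:

```python
# Hypothetical per-block stochastic-depth schedule for drop_path.
# `depth` matches the BasicLayer `depth` argument; `max_drop_path`
# is an illustrative value, not a library default.
depth = 6
max_drop_path = 0.1

# Linearly spaced rates: 0.0 for the first block, max for the last.
drop_path_rates = [
    max_drop_path * i / max(depth - 1, 1) for i in range(depth)
]
print(drop_path_rates)  # e.g. [0.0, 0.02, 0.04, ..., 0.1]
```

The resulting list can then be passed as the `drop_path` argument in place of a single float, giving deeper blocks a higher drop probability.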
Initialize internal Module state, shared by both nn.Module and ScriptModule.
extra_repr() β str
Return the extra representation of the module.
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
forward(x, x_size)
BasicLayer Forward.
- Parameters:
- x (Tensor) – Input feature with shape (B, H x W, C).
- x_size (tuple[int]) – Height and width of the input feature (H, W).
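The shape contract above (a flattened `(B, H*W, C)` tensor plus a separate `(H, W)` size tuple) can be illustrated with a toy stand-in. `ToyBasicLayer` and `ToyBlock` below are hypothetical: the real `BasicLayer` stacks `depth` Swin Transformer blocks with alternating window shifts, which this sketch replaces with a plain residual MLP to keep only the interface and shape behavior:

```python
import torch
import torch.nn as nn


class ToyBlock(nn.Module):
    """Stand-in block: LayerNorm + residual MLP, shape-preserving.

    The real SwinTransformerBlock uses windowed self-attention with
    alternating shifts; only the (B, H*W, C) in/out contract is kept here.
    """

    def __init__(self, dim, mlp_ratio=4.0):
        super().__init__()
        hidden = int(dim * mlp_ratio)
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def forward(self, x):
        return x + self.mlp(self.norm(x))  # residual; shape unchanged


class ToyBasicLayer(nn.Module):
    """Mimics BasicLayer's forward(x, x_size) signature."""

    def __init__(self, dim, depth):
        super().__init__()
        self.blocks = nn.ModuleList(ToyBlock(dim) for _ in range(depth))

    def forward(self, x, x_size):
        h, w = x_size
        # x must already be flattened to (B, H*W, C).
        assert x.shape[1] == h * w, "token count must equal H*W"
        for blk in self.blocks:
            x = blk(x)
        return x


x = torch.randn(2, 8 * 8, 32)                 # (B, H*W, C)
y = ToyBasicLayer(dim=32, depth=2)(x, (8, 8))
assert y.shape == x.shape                     # output keeps (B, H*W, C)
```

The same calling convention applies to the real `BasicLayer`: the caller flattens the spatial grid into the token dimension and passes `(H, W)` separately so the blocks can restore the 2-D layout internally.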
