espnet2.enh.layers.dc_crn.DC_CRN
class espnet2.enh.layers.dc_crn.DC_CRN(input_dim, input_channels: List = [2, 16, 32, 64, 128, 256], enc_hid_channels=8, enc_kernel_size=(1, 3), enc_padding=(0, 1), enc_last_kernel_size=(1, 4), enc_last_stride=(1, 2), enc_last_padding=(0, 1), enc_layers=5, skip_last_kernel_size=(1, 3), skip_last_stride=(1, 1), skip_last_padding=(0, 1), glstm_groups=2, glstm_layers=2, glstm_bidirectional=False, glstm_rearrange=False, output_channels=2)
Bases: Module
Densely-Connected Convolutional Recurrent Network (DC-CRN).
Reference: Fig. 3 and Section III-B in [1]
- Parameters:
- input_dim (int) – input feature dimension
- input_channels (list) – number of input channels for the stacked DenselyConnectedBlock layers. Its length should be (number of DenselyConnectedBlock layers). It is recommended to use an even number of channels to avoid an AssertionError when glstm_bidirectional=True.
- enc_hid_channels (int) – common number of intermediate channels for all DenselyConnectedBlock of the encoder
- enc_kernel_size (tuple) – common kernel size for all DenselyConnectedBlock of the encoder
- enc_padding (tuple) – common padding for all DenselyConnectedBlock of the encoder
- enc_last_kernel_size (tuple) – common kernel size for the last Conv layer in all DenselyConnectedBlock of the encoder
- enc_last_stride (tuple) – common stride for the last Conv layer in all DenselyConnectedBlock of the encoder
- enc_last_padding (tuple) – common padding for the last Conv layer in all DenselyConnectedBlock of the encoder
- enc_layers (int) – common total number of Conv layers for all DenselyConnectedBlock layers of the encoder
- skip_last_kernel_size (tuple) – common kernel size for the last Conv layer in all DenselyConnectedBlock of the skip pathways
- skip_last_stride (tuple) – common stride for the last Conv layer in all DenselyConnectedBlock of the skip pathways
- skip_last_padding (tuple) – common padding for the last Conv layer in all DenselyConnectedBlock of the skip pathways
- glstm_groups (int) – number of groups in each Grouped LSTM layer
- glstm_layers (int) – number of Grouped LSTM layers
- glstm_bidirectional (bool) – whether to use BLSTM or unidirectional LSTM in Grouped LSTM layers
- glstm_rearrange (bool) – whether to apply the rearrange operation after each grouped LSTM layer
- output_channels (int) – number of output channels (must be an even number to recover both real and imaginary parts)
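The parameter constraints above can be checked before building the model. The following is a minimal sketch in plain Python, not part of ESPnet: `conv_out_size`, `trace_freq_bins`, and `find_config_issues` are hypothetical helper names. It assumes that the frequency axis changes only through the last Conv layer of each encoder block, that dilation is 1, and that the standard convolution output-size formula applies. The number of strided blocks is passed explicitly because this page does not state how it is derived from `input_channels`.

```python
from typing import List


def conv_out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length of a convolution along one axis, assuming dilation = 1."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_freq_bins(
    input_dim: int,
    num_blocks: int,
    kernel: int = 4,   # frequency part of enc_last_kernel_size=(1, 4)
    stride: int = 2,   # frequency part of enc_last_stride=(1, 2)
    padding: int = 1,  # frequency part of enc_last_padding=(0, 1)
) -> List[int]:
    """Frequency bins after each encoder block (assumed: one strided conv per block)."""
    sizes = []
    freq = input_dim
    for _ in range(num_blocks):
        freq = conv_out_size(freq, kernel, stride, padding)
        sizes.append(freq)
    return sizes


def find_config_issues(
    input_channels: List[int],
    output_channels: int,
    glstm_bidirectional: bool = False,
) -> List[str]:
    """Collect violations of the constraints stated in the parameter list."""
    issues = []
    # output_channels must be even to recover both real and imaginary parts.
    if output_channels % 2 != 0:
        issues.append(f"output_channels={output_channels} must be even")
    # Even channel counts are recommended when glstm_bidirectional=True.
    if glstm_bidirectional:
        odd = [c for c in input_channels if c % 2 != 0]
        if odd:
            issues.append(f"odd entries in input_channels: {odd}")
    return issues


if __name__ == "__main__":
    print(trace_freq_bins(256, 5))
    print(find_config_issues([2, 16, 32, 64, 128, 256], 2, True))
```

Under these assumptions, the default last-layer settings halve an even frequency dimension at every block, which is one way to sanity-check `input_dim` against the depth of the encoder.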
forward(x)
DC-CRN forward.
- Parameters: x (torch.Tensor) – Concatenated real and imaginary spectrum features (B, input_channels[0], T, F)
- Returns: out, with shape (B, 2, output_channels, T, F)
- Return type: torch.Tensor
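To illustrate the input layout expected by `forward`, here is a small sketch in plain Python. It uses nested lists rather than tensors so that it runs without PyTorch; `pack_real_imag` and `documented_shapes` are hypothetical helpers, and the real-then-imaginary channel order is an assumption based on the wording "real and imaginary" above. The output shape is simply the one stated in the docstring, not independently verified here.

```python
from typing import List, Sequence, Tuple


def pack_real_imag(spec: List[List[complex]]) -> List[List[List[float]]]:
    """Split a (T, F) complex spectrogram into two planes of shape (2, T, F).

    Channel 0 holds the real part and channel 1 the imaginary part
    (assumed ordering).
    """
    real = [[z.real for z in frame] for frame in spec]
    imag = [[z.imag for z in frame] for frame in spec]
    return [real, imag]


def documented_shapes(
    batch: int,
    frames: int,
    freq_bins: int,
    input_channels: Sequence[int],
    output_channels: int,
) -> Tuple[Tuple[int, ...], Tuple[int, ...]]:
    """Input and output shapes of forward(), as stated in the docstring."""
    in_shape = (batch, input_channels[0], frames, freq_bins)
    out_shape = (batch, 2, output_channels, frames, freq_bins)
    return in_shape, out_shape


if __name__ == "__main__":
    # One frame with two frequency bins.
    print(pack_real_imag([[1 + 2j, 3 - 4j]]))
    print(documented_shapes(4, 100, 256, [2, 16, 32, 64, 128, 256], 2))
```

With actual tensors, the same packing would typically be a concatenation of the real and imaginary parts along the channel dimension, giving `input_channels[0] = 2` for a single-channel signal.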
