espnet2.enh.encoder.conv_encoder.ConvEncoder
class espnet2.enh.encoder.conv_encoder.ConvEncoder(channel: int, kernel_size: int, stride: int)
Bases: AbsEncoder
Convolutional encoder for speech enhancement and separation
Initialize internal Module state, shared by both nn.Module and ScriptModule.
forward(input: Tensor, ilens: Tensor, fs: int = None)
Forward.
- Parameters:
- input (torch.Tensor) – mixed speech [Batch, sample]
- ilens (torch.Tensor) – input lengths [Batch]
- fs (int) – sampling rate in Hz (not used)
- Returns: mixed feature after encoder [Batch, flens, channel]
- Return type: feature (torch.Tensor)
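The output shape [Batch, flens, channel] means the number of frames depends on the input length. The following is a minimal sketch of that relationship, assuming the encoder behaves like an unpadded 1-D convolution that uses the constructor's kernel_size and stride. This page does not state the padding, so treat that as an assumption. The helper name expected_num_frames and the example settings are hypothetical and are not part of ESPnet.

```python
def expected_num_frames(num_samples: int, kernel_size: int, stride: int) -> int:
    """Frame count of an unpadded, strided 1-D convolution.

    Assumption: no padding and no dilation; this is not read from ESPnet.
    """
    if num_samples < kernel_size:
        # Not enough samples to fill a single kernel window.
        return 0
    return (num_samples - kernel_size) // stride + 1


# Hypothetical settings: 32-sample kernel with a 16-sample hop.
ilens = [16000, 12000]  # per-utterance lengths in samples
flens = [expected_num_frames(n, kernel_size=32, stride=16) for n in ilens]
print(flens)  # [999, 749]
```

Under the same assumption, padded utterances in a batch produce different valid frame counts, which is why the lengths are passed alongside the waveform.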
forward_streaming(input: Tensor)
property output_dim : int
streaming_frame(audio: Tensor)
Stream frame.
Splits continuous audio into frame-level chunks for streaming simulation. Note that this function takes the entire long audio signal as input when simulating streaming. In a real streaming application, you can use it as a reference for managing your own input buffer.
- Parameters: audio – (B, T)
- Returns: List [(B, frame_size),]
- Return type: chunked
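The chunking described above can be illustrated with a small stand-alone sketch. It is not the ESPnet implementation. It assumes each chunk is frame_size samples long, that consecutive chunks start hop_size samples apart, and that trailing samples which do not fill a whole frame are dropped. The function name split_into_frames and the nested-list audio representation are hypothetical, chosen so the example runs without any dependencies.

```python
from typing import List

Batch = List[List[float]]  # (B, T) audio as nested lists


def split_into_frames(audio: Batch, frame_size: int, hop_size: int) -> List[Batch]:
    """Split (B, T) audio into a list of (B, frame_size) chunks.

    Assumption: incomplete trailing frames are discarded.
    """
    if not audio or len(audio[0]) < frame_size:
        return []
    num_samples = len(audio[0])
    num_frames = (num_samples - frame_size) // hop_size + 1
    return [
        # One chunk holds the same time window for every utterance in the batch.
        [row[i * hop_size : i * hop_size + frame_size] for row in audio]
        for i in range(num_frames)
    ]


# Toy batch: B = 2 utterances, T = 10 samples each.
audio = [[float(t) for t in range(10)], [float(-t) for t in range(10)]]
chunks = split_into_frames(audio, frame_size=4, hop_size=2)
print(len(chunks))   # 4
print(chunks[1][0])  # [2.0, 3.0, 4.0, 5.0]
```

In a real streaming setup, the same windowing would be applied to a rolling input buffer: append incoming samples, emit a chunk whenever frame_size samples are available, then advance by hop_size.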
