espnet2.asr_transducer.frontend.online_audio_processor.OnlineAudioProcessor
class espnet2.asr_transducer.frontend.online_audio_processor.OnlineAudioProcessor(feature_extractor: Module, normalization_module: Module, decoding_window: int, encoder_sub_factor: int, frontend_conf: Dict, device: device, audio_sampling_rate: int = 16000)
Bases: object
OnlineAudioProcessor module definition.
- Parameters:
- feature_extractor – Feature extractor module.
- normalization_module – Normalization module.
- decoding_window – Size of the decoding window (in ms).
- encoder_sub_factor – Encoder subsampling factor.
- frontend_conf – Frontend configuration.
- device – Device to pin module tensors on.
- audio_sampling_rate – Input sampling rate.
Construct an OnlineAudioProcessor.
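As a rough illustration of how decoding_window (in ms) and audio_sampling_rate relate, the window length can be converted to a sample count. This is a minimal sketch only: the helper name `window_to_samples` is hypothetical and not part of ESPnet, and rounding to the nearest sample is an assumption.

```python
def window_to_samples(decoding_window_ms: int, audio_sampling_rate: int = 16000) -> int:
    """Convert a window length in milliseconds to a number of audio samples."""
    # Hypothetical helper, not an ESPnet function; rounding is an assumption.
    return round(decoding_window_ms * audio_sampling_rate / 1000)


# A 640 ms window at the default 16 kHz sampling rate.
print(window_to_samples(640))
```

Under these assumptions, a larger decoding window means more samples per chunk and therefore higher latency before each feature computation.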
compute_features(samples: Tensor, is_final: bool) → None
Compute features from input samples.
- Parameters:
- samples – Speech data. (S)
- is_final – Whether speech corresponds to the final chunk of data.
- Returns: feats – Features sequence. (1, chunk_sz_bs, D_feats); feats_length – Features length sequence. (1,)
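The is_final flag implies that the caller feeds audio chunk by chunk and marks the last one. A minimal sketch of such a driver loop follows; `iter_chunks` is a hypothetical helper (not part of ESPnet) and plain Python lists stand in for tensors so the example stays self-contained.

```python
from typing import Iterator, List, Sequence, Tuple


def iter_chunks(
    samples: Sequence[float], chunk_size: int
) -> Iterator[Tuple[List[float], bool]]:
    """Yield (chunk, is_final) pairs; is_final is True only for the last chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = len(samples)
    for start in range(0, total, chunk_size):
        end = min(start + chunk_size, total)
        yield list(samples[start:end]), end == total


# Hypothetical usage, assuming `processor` is an OnlineAudioProcessor instance
# and `audio` is a 1-D sequence of samples:
# for chunk, is_final in iter_chunks(audio, 10240):
#     processor.compute_features(torch.tensor(chunk), is_final)
```

The last chunk may be shorter than chunk_size, which is why it is flagged separately.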
get_current_feats(feats: Tensor, feats_length: Tensor, is_final: bool) → Tuple[Tensor, Tensor]
Get features for current decoding window.
- Parameters:
- feats – Computed features sequence. (1, F, D_feats)
- feats_length – Computed features sequence length. (1,)
- is_final – Whether feats corresponds to the final chunk of data.
- Returns: feats – Decoding window features sequence. (1, chunk_sz_bs, D_feats); feats_length – Decoding window features length sequence. (1,)
- Return type: Tuple[Tensor, Tensor]
get_current_samples(samples: Tensor, is_final: bool) → Tensor
Get samples for feature computation.
- Parameters:
- samples – Speech data. (S)
- is_final – Whether speech corresponds to the final chunk of data.
- Returns: samples – New speech data. (1, decoding_samples)
- Return type: Tensor
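The documented output shape (1, decoding_samples) suggests that a short final chunk has to be brought up to the window length. One plausible way to do this is zero-padding, sketched below; this is an assumption for illustration, not a description of the actual implementation, and `pad_final_chunk` is a hypothetical name. Any caching of samples between calls is deliberately left out.

```python
from typing import List, Sequence


def pad_final_chunk(
    samples: Sequence[float], decoding_samples: int, is_final: bool
) -> List[List[float]]:
    """Return samples with a leading batch dimension of size 1.

    Assumption: a final chunk shorter than decoding_samples is
    right-padded with zeros; other chunks are passed through unchanged.
    """
    data = list(samples)
    if is_final and len(data) < decoding_samples:
        data = data + [0.0] * (decoding_samples - len(data))
    # Wrap in an outer list to mirror the documented (1, N) layout.
    return [data]
```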
reset_cache() → None
Reset cache parameters.
- Parameters: None
- Returns: None
