espnet2.asr.ctc.CTC
class espnet2.asr.ctc.CTC(odim: int, encoder_output_size: int, dropout_rate: float = 0.0, ctc_type: str = 'builtin', reduce: bool = True, ignore_nan_grad: bool | None = None, zero_infinity: bool = True, brctc_risk_strategy: str = 'exp', brctc_group_strategy: str = 'end', brctc_risk_factor: float = 0.0)
Bases: Module
CTC module.
- Parameters:
  - odim – dimension of outputs
  - encoder_output_size – number of encoder projection units
  - dropout_rate – dropout rate (0.0 to 1.0)
  - ctc_type – builtin or gtnctc
  - reduce – reduce the CTC loss into a scalar
  - ignore_nan_grad – Same as zero_infinity (kept for backward compatibility)
  - zero_infinity – Whether to zero infinite losses and the associated gradients.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
argmax(hs_pad)
argmax of frame activations
- Parameters: hs_pad (torch.Tensor) – 3d tensor (B, Tmax, eprojs)
- Returns: argmax applied 2d tensor (B, Tmax)
- Return type: torch.Tensor
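The argmax output is a frame-level sequence of label ids, so it usually still contains repeated labels and blanks. A minimal sketch of the standard CTC collapsing step (merge consecutive repeats, then drop blanks) is shown below. It is plain Python and not part of this class; the helper name `collapse_ctc_path` and the choice of `0` as the blank index are assumptions for illustration.

```python
from itertools import groupby
from typing import List


def collapse_ctc_path(frame_ids: List[int], blank_idx: int = 0) -> List[int]:
    """Hypothetical helper: merge consecutive repeats, then drop blanks."""
    # groupby yields one key per run of identical consecutive ids.
    merged = [label for label, _ in groupby(frame_ids)]
    return [label for label in merged if label != blank_idx]


# One utterance of frame-level ids, e.g. one row of a (B, Tmax) argmax result
# converted to a Python list.
frame_ids = [0, 3, 3, 0, 0, 5, 5, 5, 0, 5, 0]
print(collapse_ctc_path(frame_ids))  # [3, 5, 5]
```

The blank between the two runs of `5` is what keeps them as two separate labels; without it they would merge into one.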
forced_align(hs_pad, hlens, ys_pad, ys_lens, blank_idx=0)
Force alignment between input and target sequences (Viterbi path).
- Parameters:
  - hs_pad – batch of padded hidden state sequences (B, Tmax, D)
  - hlens – batch of lengths of hidden state sequences (B)
  - ys_pad – batch of padded character id sequence tensor (B, Lmax)
  - ys_lens – batch of lengths of character sequence (B)
  - blank_idx – index of blank symbol
- Note: B must be 1.
- Returns: alignments, a tuple of two tensors:
  - Label for each time step in the alignment path computed using forced alignment.
  - Log probability scores of the labels for each time step.
- Return type: Tuple(tensor, tensor)
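To make the idea of a Viterbi alignment path concrete, here is a minimal pure-Python sketch for a single utterance (matching the B = 1 restriction). It is not the code this method runs; the function name `ctc_viterbi_align` and the list-of-lists input format are assumptions for illustration. It takes per-frame log probabilities of shape (T, C) and returns the two things described above: a label per frame and the log probability of that label at that frame.

```python
import math
from typing import List, Tuple

NEG_INF = -math.inf


def ctc_viterbi_align(
    log_probs: List[List[float]], targets: List[int], blank_idx: int = 0
) -> Tuple[List[int], List[float]]:
    """Hypothetical helper: best CTC path for one utterance."""
    T = len(log_probs)
    # Interleave blanks: [blank, y1, blank, y2, ..., blank].
    ext = [blank_idx]
    for y in targets:
        ext += [y, blank_idx]
    S = len(ext)

    score = [[NEG_INF] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    score[0][0] = log_probs[0][ext[0]]
    if S > 1:
        score[0][1] = log_probs[0][ext[1]]

    for t in range(1, T):
        for s in range(S):
            # Stay, advance by one, or skip a blank between different labels.
            cands = [s]
            if s >= 1:
                cands.append(s - 1)
            if s >= 2 and ext[s] != blank_idx and ext[s] != ext[s - 2]:
                cands.append(s - 2)
            best = max(cands, key=lambda p: score[t - 1][p])
            if score[t - 1][best] == NEG_INF:
                continue  # state not reachable at this frame
            score[t][s] = score[t - 1][best] + log_probs[t][ext[s]]
            back[t][s] = best

    # A valid path ends in the last label or the trailing blank.
    finals = [S - 1] if S == 1 else [S - 1, S - 2]
    s = max(finals, key=lambda p: score[T - 1][p])
    if score[T - 1][s] == NEG_INF:
        raise ValueError("targets cannot be aligned to this many frames")

    path = []
    for t in range(T - 1, -1, -1):
        path.append(s)
        s = back[t][s]
    path.reverse()

    labels = [ext[p] for p in path]
    scores = [log_probs[t][label] for t, label in enumerate(labels)]
    return labels, scores


# Toy input: 4 frames, 3 classes (0 is blank), target sequence [1, 2].
probs = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.1, 0.1, 0.8], [0.7, 0.1, 0.2]]
log_probs = [[math.log(p) for p in frame] for frame in probs]
labels, scores = ctc_viterbi_align(log_probs, [1, 2])
print(labels)  # [1, 0, 2, 0]
```

The error case shows why input length matters: a target with a repeated label such as `[1, 1]` needs a blank between the two occurrences, so it cannot be aligned to fewer than three frames.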
forward(hs_pad, hlens, ys_pad, ys_lens)
Calculate CTC loss.
- Parameters:
  - hs_pad – batch of padded hidden state sequences (B, Tmax, D)
  - hlens – batch of lengths of hidden state sequences (B)
  - ys_pad – batch of padded character id sequence tensor (B, Lmax)
  - ys_lens – batch of lengths of character sequence (B)
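As a rough illustration of what a CTC loss measures, the sketch below computes the negative log-likelihood of one target sequence with the CTC forward algorithm in plain Python. It is not the implementation behind this method; the function name `ctc_neg_log_likelihood` and the list-based inputs are assumptions. The `zero_infinity` flag here only mimics the behaviour described for the constructor argument of the same name: an infinite loss (a target that cannot fit in the available frames) is replaced by zero.

```python
import math
from typing import List


def _logsumexp(values: List[float]) -> float:
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def ctc_neg_log_likelihood(
    log_probs: List[List[float]],
    targets: List[int],
    blank_idx: int = 0,
    zero_infinity: bool = True,
) -> float:
    """Hypothetical helper: CTC loss for one utterance, (T, C) log-probs."""
    # Interleave blanks: [blank, y1, blank, y2, ..., blank].
    ext = [blank_idx]
    for y in targets:
        ext += [y, blank_idx]
    S = len(ext)

    alpha = [-math.inf] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, len(log_probs)):
        new_alpha = [-math.inf] * S
        for s in range(S):
            terms = [alpha[s]]
            if s >= 1:
                terms.append(alpha[s - 1])
            # Skipping the blank is allowed only between different labels.
            if s >= 2 and ext[s] != blank_idx and ext[s] != ext[s - 2]:
                terms.append(alpha[s - 2])
            new_alpha[s] = _logsumexp(terms) + log_probs[t][ext[s]]
        alpha = new_alpha

    # Valid paths end in the last label or the trailing blank.
    loss = -_logsumexp(alpha[-2:])
    if zero_infinity and math.isinf(loss):
        return 0.0
    return loss


# Two frames, two classes (0 is blank), uniform probabilities, target [1].
uniform = [[math.log(0.5)] * 2 for _ in range(2)]
print(ctc_neg_log_likelihood(uniform, [1]))  # equals -log(0.75)
```

In the toy case the three valid paths are (1, 1), (0, 1) and (1, 0), each with probability 0.25, which gives the total of 0.75.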
log_softmax(hs_pad)
log_softmax of frame activations
- Parameters: hs_pad (Tensor) – 3d tensor (B, Tmax, eprojs)
- Returns: log softmax applied 3d tensor (B, Tmax, odim)
- Return type: torch.Tensor
loss_fn(th_pred, th_target, th_ilen, th_olen) → Tensor
softmax(hs_pad)
softmax of frame activations
- Parameters: hs_pad (Tensor) – 3d tensor (B, Tmax, eprojs)
- Returns: softmax applied 3d tensor (B, Tmax, odim)
- Return type: torch.Tensor
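The softmax and log_softmax entries both map (B, Tmax, eprojs) inputs to (B, Tmax, odim) outputs, which implies a projection to the output dimension before normalization. The sketch below skips that projection and only shows, for a single frame of already-projected scores, how the two normalizations relate. It is plain Python with hypothetical function names, not the module's code.

```python
import math
from typing import List


def log_softmax_frame(logits: List[float]) -> List[float]:
    """Hypothetical helper: log-softmax over one frame of odim scores."""
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_norm for x in logits]


def softmax_frame(logits: List[float]) -> List[float]:
    """Hypothetical helper: softmax as the exponential of log-softmax."""
    return [math.exp(v) for v in log_softmax_frame(logits)]


frame = [2.0, 1.0, 0.1]
probs = softmax_frame(frame)
print(sum(probs))                    # approximately 1.0
print(probs.index(max(probs)))       # 0, the same index argmax would pick
```

Because both functions are monotonic in the input scores, the per-frame argmax is the same whether it is taken over the raw scores, the softmax, or the log-softmax.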
