espnet2.asr.decoder.s4_decoder.S4Decoder
class espnet2.asr.decoder.s4_decoder.S4Decoder(vocab_size: int, encoder_output_size: int, input_layer: str = 'embed', dropinp: float = 0.0, dropout: float = 0.25, prenorm: bool = True, n_layers: int = 16, transposed: bool = False, tie_dropout: bool = False, n_repeat=1, layer=None, residual=None, norm=None, pool=None, track_norms=True, drop_path: float = 0.0)
Bases: AbsDecoder, BatchScorerInterface
S4 decoder module.
- Parameters:
- vocab_size – output dimension (vocabulary size)
- encoder_output_size – dimension of the encoder hidden vector
- input_layer – input layer type
- dropinp – input dropout
- dropout – dropout parameter applied on every residual and every layer
- prenorm – pre-norm vs. post-norm
- n_layers – number of layers
- transposed – transpose inputs so each layer receives (batch, dim, length)
- tie_dropout – tie dropout mask across the sequence, like nn.Dropout1d/nn.Dropout2d
- n_repeat – each layer is repeated n times per stage before applying pooling
- layer – layer config, must be specified
- residual – residual config
- norm – normalization config (e.g. layer vs. batch)
- pool – config for pooling layer per stage
- track_norms – log norms of each layer output
- drop_path – drop rate for stochastic depth
Initialize internal Module state, shared by both nn.Module and ScriptModule.
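As a rough illustration of how the constructor arguments fit together, the sketch below collects the defaults from the signature above into a plain dict and checks the one documented requirement, namely that `layer` is set. The helper name `build_s4_decoder_kwargs` and the placeholder `layer` value are hypothetical; the actual layer config schema is not described in this section.

```python
from typing import Any, Dict

# Defaults copied from the documented S4Decoder signature.
S4_DECODER_DEFAULTS: Dict[str, Any] = {
    "input_layer": "embed",
    "dropinp": 0.0,
    "dropout": 0.25,
    "prenorm": True,
    "n_layers": 16,
    "transposed": False,
    "tie_dropout": False,
    "n_repeat": 1,
    "layer": None,
    "residual": None,
    "norm": None,
    "pool": None,
    "track_norms": True,
    "drop_path": 0.0,
}


def build_s4_decoder_kwargs(
    vocab_size: int, encoder_output_size: int, **overrides: Any
) -> Dict[str, Any]:
    """Hypothetical helper: merge overrides into the documented defaults."""
    unknown = set(overrides) - set(S4_DECODER_DEFAULTS)
    if unknown:
        raise TypeError(f"unknown S4Decoder arguments: {sorted(unknown)}")
    kwargs: Dict[str, Any] = {
        "vocab_size": vocab_size,
        "encoder_output_size": encoder_output_size,
    }
    kwargs.update(S4_DECODER_DEFAULTS)
    kwargs.update(overrides)
    if kwargs["layer"] is None:
        # The parameter list states that the layer config must be specified.
        raise ValueError("layer config must be specified")
    return kwargs


# Placeholder layer config; its real structure is an assumption here.
kwargs = build_s4_decoder_kwargs(5000, 256, layer={"placeholder": True}, n_layers=6)
# With ESPnet installed, these would be passed on as S4Decoder(**kwargs).
```

Keeping the defaults in one place makes it easy to see which arguments a given experiment actually overrides.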
batch_score(ys: Tensor, states: List[Any], xs: Tensor) → Tuple[Tensor, List[Any]]
Score new token batch.
- Parameters:
- ys (torch.Tensor) – torch.int64 prefix tokens (n_batch, ylen).
- states (List[Any]) – Scorer states for prefix tokens.
- xs (torch.Tensor) – The encoder feature that generates ys (n_batch, xlen, n_feat).
- Returns: Tuple of batchified scores for the next token, with shape (n_batch, n_vocab), and the next state list for ys.
- Return type: tuple[torch.Tensor, List[Any]]
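The call shape of `batch_score` can be illustrated without ESPnet or PyTorch. The sketch below uses a hypothetical `ToyBatchScorer` over plain Python lists: it takes one prefix and one state per hypothesis and returns one score row per hypothesis plus the updated state list. It stands in for the interface only, not for the S4 computation; the scoring rule and the step counter used as state are made up for illustration.

```python
from typing import Any, List, Tuple


class ToyBatchScorer:
    """Hypothetical stand-in that mimics the batch_score call shape."""

    def __init__(self, n_vocab: int) -> None:
        self.n_vocab = n_vocab

    def batch_score(
        self, ys: List[List[int]], states: List[Any], xs: List[Any]
    ) -> Tuple[List[List[float]], List[Any]]:
        # One prefix, one state, and one encoder feature per hypothesis.
        assert len(ys) == len(states) == len(xs)
        scores: List[List[float]] = []
        new_states: List[Any] = []
        for prefix, state in zip(ys, states):
            # Made-up rule: favour the token that follows the last prefix token.
            favoured = (prefix[-1] + 1) % self.n_vocab
            scores.append(
                [0.0 if v == favoured else -1.0 for v in range(self.n_vocab)]
            )
            # None means "no history yet"; this toy state just counts steps.
            new_states.append(1 if state is None else state + 1)
        return scores, new_states


scorer = ToyBatchScorer(n_vocab=4)
ys = [[0, 1], [0, 3]]   # (n_batch, ylen) prefixes
states = [None, None]   # one state per prefix
xs = [None, None]       # encoder features, unused by the toy scorer
scores, states = scorer.batch_score(ys, states, xs)
# scores holds n_batch rows of n_vocab entries; states holds n_batch entries.
```

The returned state list is meant to be passed back in on the next call, aligned with whichever hypotheses survive pruning.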
forward(hs_pad: Tensor, hlens: Tensor, ys_in_pad: Tensor, ys_in_lens: Tensor, state=None) → Tuple[Tensor, Tensor]
Forward decoder.
Parameters:
- hs_pad – encoded memory, float32 (batch, maxlen_in, feat)
- hlens – (batch)
- ys_in_pad – input token ids, int64 (batch, maxlen_out) if input_layer == 'embed'; input tensor (batch, maxlen_out, #mels) in the other cases
- ys_in_lens – (batch)
Returns: tuple containing:
- x: decoded token scores before softmax (batch, maxlen_out, token), if use_output_layer is True
- olens: (batch,)
Return type: tuple
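To make the expected shapes of `ys_in_pad` and `ys_in_lens` concrete, here is a dependency-free sketch that pads variable-length token sequences into a rectangular batch and records their lengths. The helper name `pad_token_batch` and the padding id are assumptions for illustration; in practice the result would be converted to `torch.int64` tensors before being passed to `forward`.

```python
from typing import List, Sequence, Tuple


def pad_token_batch(
    seqs: Sequence[Sequence[int]], pad_id: int = 0
) -> Tuple[List[List[int]], List[int]]:
    """Hypothetical helper: pad token id sequences to a common length.

    Returns (padded, lens), where padded has shape (batch, maxlen_out)
    and lens has shape (batch,).
    """
    if not seqs:
        return [], []
    lens = [len(s) for s in seqs]
    maxlen_out = max(lens)
    padded = [list(s) + [pad_id] * (maxlen_out - len(s)) for s in seqs]
    return padded, lens


# Three target sequences of different lengths; pad_id=0 is an assumed value.
ys_in_pad, ys_in_lens = pad_token_batch([[5, 7, 9], [4], [8, 2]], pad_id=0)
# ys_in_pad -> [[5, 7, 9], [4, 0, 0], [8, 2, 0]]
# ys_in_lens -> [3, 1, 2]
```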
init_state(x: Tensor)
Initialize state.
score(ys, state, x)
Score new token (required).
- Parameters:
- ys (torch.Tensor) – 1D torch.int64 prefix tokens.
- state – Scorer state for prefix tokens.
- x (torch.Tensor) – The encoder feature that generates ys.
- Returns: Tuple of scores for the next token, with shape (n_vocab), and the next state for ys.
- Return type: tuple[torch.Tensor, Any]
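A sketch of how a decoding loop might drive `init_state` and `score`: start from an initial state, score the current prefix, pick the best next token, and carry the returned state forward. `ToyScorer`, `greedy_decode`, the start and end token ids, and the scoring rule are all hypothetical; a real loop would call the decoder's own methods with tensors.

```python
from typing import Any, List, Tuple


class ToyScorer:
    """Hypothetical stand-in that mimics the init_state/score call shape."""

    def __init__(self, n_vocab: int) -> None:
        self.n_vocab = n_vocab

    def init_state(self, x: Any) -> Any:
        # No history before the first token is scored.
        return None

    def score(self, ys: List[int], state: Any, x: Any) -> Tuple[List[float], Any]:
        # Made-up rule: favour the token that follows the last prefix token.
        favoured = (ys[-1] + 1) % self.n_vocab
        scores = [0.0 if v == favoured else -1.0 for v in range(self.n_vocab)]
        # This toy state just counts how many tokens have been scored.
        steps = 0 if state is None else state
        return scores, steps + 1


def greedy_decode(scorer: ToyScorer, x: Any, sos: int, eos: int, max_len: int) -> List[int]:
    """Repeatedly pick the highest scoring token until eos or max_len."""
    ys = [sos]
    state = scorer.init_state(x)
    for _ in range(max_len):
        scores, state = scorer.score(ys, state, x)
        next_token = max(range(len(scores)), key=scores.__getitem__)
        ys.append(next_token)
        if next_token == eos:
            break
    return ys


# sos=0 and eos=4 are assumed ids chosen for this toy vocabulary of size 5.
tokens = greedy_decode(ToyScorer(n_vocab=5), x=None, sos=0, eos=4, max_len=10)
# tokens -> [0, 1, 2, 3, 4]
```

Passing the state back into each call is what lets a scorer avoid reprocessing the whole prefix at every step.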
