espnet2.lm.seq_rnn_lm.SequentialRNNLM
class espnet2.lm.seq_rnn_lm.SequentialRNNLM(vocab_size: int, unit: int = 650, nhid: int | None = None, nlayers: int = 2, dropout_rate: float = 0.0, tie_weights: bool = False, rnn_type: str = 'lstm', ignore_id: int = 0)
Bases: AbsLM
Sequential implementation of Recurrent Neural Network Language Model.
This class implements a Sequential RNN Language Model (RNNLM) using PyTorch. It supports different types of RNNs including LSTM, GRU, and standard RNNs with tanh or ReLU activations. The model can optionally tie the weights of the output layer with the embedding layer, which is a common technique to improve language models.
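As a concrete illustration of the weight-tying technique (a minimal sketch of the idea, not the ESPnet constructor itself; it is also why nhid must equal unit when tie_weights is enabled):

import torch.nn as nn

vocab_size, unit = 10000, 650
encoder = nn.Embedding(vocab_size, unit)  # input embedding
decoder = nn.Linear(unit, vocab_size)     # output projection
# Share one (vocab_size, unit) matrix between both layers; this only
# works when the decoder input size equals the embedding size.
decoder.weight = encoder.weight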
Attributes:
drop
Dropout layer for regularization.
- Type: nn.Dropout
encoder
Embedding layer for input tokens.
- Type: nn.Embedding
rnn
RNN layer (LSTM, GRU, or RNN).
- Type: nn.Module
decoder
Linear layer for output logits.
- Type: nn.Linear
rnn_type
Type of RNN used (‘LSTM’, ‘GRU’, ‘RNN_TANH’, ‘RNN_RELU’).
- Type: str
nhid
Number of hidden units in the RNN.
- Type: int
nlayers
Number of layers in the RNN.
- Type: int
Parameters:
- vocab_size (int) – Size of the vocabulary.
- unit (int) – Number of units in the embedding layer. Default is 650.
- nhid (Optional[int]) – Number of hidden units in the RNN. Defaults to unit.
- nlayers (int) – Number of layers in the RNN. Default is 2.
- dropout_rate (float) – Dropout rate for regularization. Default is 0.0.
- tie_weights (bool) – If True, tie the weights of the encoder and decoder.
- rnn_type (str) – Type of RNN to use (‘lstm’, ‘gru’, ‘rnn_tanh’, ‘rnn_relu’).
- ignore_id (int) – Padding index for the embedding layer. Default is 0.
Raises: ValueError – If an invalid rnn_type is provided, or if nhid does not equal unit when tie_weights is True.
############# Examples
Create an instance and run a forward pass. The recurrent state passed to forward must carry a batch dimension, with shape (nlayers, batch, nhid):
>>> import torch
>>> model = SequentialRNNLM(vocab_size=10000, unit=650, nlayers=2)
>>> input_tensor = torch.randint(0, 10000, (32, 10))  # batch of 32, seq len 10
>>> hidden_state = (torch.zeros(2, 32, 650), torch.zeros(2, 32, 650))
>>> output, hidden_state = model(input_tensor, hidden_state)
Score a new token for a single hypothesis. zero_state() returns an unbatched state, so add a batch dimension of 1:
>>> h, c = model.zero_state()
>>> logp, new_state = model.score(torch.tensor([5]),
...                               (h.unsqueeze(1), c.unsqueeze(1)),
...                               torch.randn(5, 80))
Batch scoring over a set of hypotheses:
>>> prefix_tokens = torch.randint(0, 10000, (32, 5))  # batch of 32, prefix length 5
>>> states = [model.zero_state() for _ in range(32)]
>>> scores, next_states = model.batch_score(prefix_tokens, states,
...                                         torch.randn(32, 10, 80))
batch_score(ys: Tensor, states: Tensor, xs: Tensor) → Tuple[Tensor, Tensor]
Score new token batch.
- Parameters:
- ys (torch.Tensor) – torch.int64 prefix tokens (n_batch, ylen).
- states (List[Any]) – Scorer states for prefix tokens.
- xs (torch.Tensor) – The encoder feature that generates ys (n_batch, xlen, n_feat).
- Returns: Tuple of batchified scores for the next token, with shape (n_batch, n_vocab), and the next state list for ys.
- Return type: tuple[torch.Tensor, List[Any]]
- Raises:ValueError – If the state format is incorrect or if the model type is unsupported.
############# Examples
>>> model = SequentialRNNLM(vocab_size=1000)
>>> ys = torch.tensor([[1, 2], [3, 4]]) # Example prefix tokens
>>> states = [model.zero_state() for _ in range(2)] # Initial states
>>> xs = torch.randn(2, 10, 300) # Example encoder features
>>> logp, new_states = model.batch_score(ys, states, xs)
>>> print(logp.shape) # Should print torch.Size([2, 1000])
NOTE
This method processes a batch of input tokens and their corresponding states to produce the scores for the next possible tokens.
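Internally, the per-hypothesis states are stacked into one batched recurrent state before the RNN call and split back into a list afterwards. A rough sketch of that reshaping for LSTM states (illustrative, not the exact ESPnet code; each per-hypothesis state has shape (nlayers, nhid)):

import torch

# states: list of per-hypothesis (h, c) pairs, each tensor of shape (nlayers, nhid)
h = torch.stack([h for h, c in states], dim=1)  # (nlayers, n_batch, nhid)
c = torch.stack([c for h, c in states], dim=1)
batched = (h, c)
# ... run the RNN on the last token of each prefix with `batched` ...
# then split the updated batched state back into per-hypothesis entries:
new_states = [(h[:, i], c[:, i]) for i in range(h.size(1))]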
forward(input: Tensor, hidden: Tensor) → Tuple[Tensor, Tensor]
Perform a forward pass through the RNN layer.
This method computes the forward pass of the RNN language model. It takes the input tensor and the hidden state tensor, processes them through the embedding, RNN, and decoder layers, and returns the decoded output and the updated hidden state.
- Parameters:
- input (torch.Tensor) – Input tensor of shape (batch_size, seq_len) containing token indices.
- hidden (torch.Tensor) – Hidden state tensor of shape (num_layers, batch_size, hidden_size).
- Returns: A tuple containing:
  - torch.Tensor: Decoded output tensor of shape (batch_size, seq_len, vocab_size).
  - torch.Tensor: Updated hidden state tensor.
- Return type: Tuple[torch.Tensor, torch.Tensor]
############# Examples
>>> model = SequentialRNNLM(vocab_size=1000, unit=650)
>>> input_tensor = torch.randint(0, 1000, (32, 10))  # (batch_size, seq_len)
>>> # forward expects a batched hidden state of shape (nlayers, batch, nhid);
>>> # zero_state() is unbatched, so build the initial state explicitly here.
>>> hidden_state = (torch.zeros(2, 32, 650), torch.zeros(2, 32, 650))
>>> output, new_hidden = model.forward(input_tensor, hidden_state)
>>> output.shape
torch.Size([32, 10, 1000])  # (batch_size, seq_len, vocab_size)
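For reference, the pipeline described above can be rendered as a small standalone helper (a sketch only; rnnlm_forward is a hypothetical name, and the real method reshapes to 2D before the decoder, which is equivalent to applying the linear layer on the last dimension):

def rnnlm_forward(model, input, hidden):
    # embed tokens and apply dropout: (batch, seq_len) -> (batch, seq_len, unit)
    emb = model.drop(model.encoder(input))
    # run the recurrent layer: output is (batch, seq_len, nhid)
    output, hidden = model.rnn(emb, hidden)
    # project to vocabulary logits: (batch, seq_len, vocab_size)
    decoded = model.decoder(model.drop(output))
    return decoded, hidden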
score(y: Tensor, state: Tensor | Tuple[Tensor, Tensor], x: Tensor) → Tuple[Tensor, Tensor | Tuple[Tensor, Tensor]]
Score new token.
This method computes the log probabilities for the next token based on the provided prefix tokens and the current state of the model.
- Parameters:
- y – 1D torch.int64 tensor representing the prefix tokens.
- state – The current scorer state for the prefix tokens, which can either be a tensor or a tuple of tensors (for LSTM).
- x – 2D torch.Tensor representing the encoder features that generate the tokens in y.
- Returns:
  - torch.float32 tensor containing the scores for the next token (shape: n_vocab).
  - The updated state for the prefix tokens, of the same type as the input state.
- Return type: Tuple[torch.Tensor, Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]]
############# Examples
>>> model = SequentialRNNLM(vocab_size=1000)
>>> prefix_tokens = torch.tensor([1, 2, 3])  # example prefix tokens
>>> h, c = model.zero_state()  # unbatched state for one hypothesis
>>> initial_state = (h.unsqueeze(1), c.unsqueeze(1))  # add batch dim of 1
>>> encoder_features = torch.randn(5, 650)  # example 2D encoder features
>>> scores, new_state = model.score(prefix_tokens, initial_state,
...                                 encoder_features)
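The returned scores are log-probabilities over the full vocabulary, so a quick sanity check is to read off the greedy continuation:

>>> next_token = scores.argmax().item()  # most likely next token id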
zero_state()
Initialize LM state filled with zero values.
This method creates an initial state for the language model (LM) that is filled with zeros. The shape of the state depends on the type of RNN being used. For LSTM networks, the state consists of two tensors representing the hidden state and the cell state. For other RNN types, it returns only the hidden state.
- Returns: A tuple containing the hidden and cell states for LSTM, or a single tensor for other RNN types. The shape of the returned tensor(s) is determined by the number of layers (nlayers) and the number of hidden units (nhid).
- Return type: Union[Tuple[torch.Tensor, torch.Tensor], torch.Tensor]
############# Examples
>>> model = SequentialRNNLM(vocab_size=10000, unit=650, nlayers=2)
>>> initial_state = model.zero_state()
>>> initial_state
(tensor([[0., 0., ..., 0.], [0., 0., ..., 0.]]),
tensor([[0., 0., ..., 0.], [0., 0., ..., 0.]])) # for LSTM
>>> model = SequentialRNNLM(vocab_size=10000, unit=650, nlayers=1,
... rnn_type='RNN_TANH')
>>> initial_state = model.zero_state()
>>> initial_state
tensor([[0., 0., ..., 0.]]) # for RNN
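The unbatched shape is intentional: zero_state() yields one state per hypothesis, and batch_score() stacks those per-hypothesis states into a single batched recurrent state. Continuing from the RNN_TANH model above:

>>> states = [model.zero_state() for _ in range(4)]  # four hypotheses
>>> ys = torch.randint(0, 10000, (4, 3))  # prefix tokens
>>> logp, new_states = model.batch_score(ys, states, torch.randn(4, 7, 80))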