espnet2.lm.seq_rnn_lm.SequentialRNNLM
class espnet2.lm.seq_rnn_lm.SequentialRNNLM(vocab_size: int, unit: int = 650, nhid: int | None = None, nlayers: int = 2, dropout_rate: float = 0.0, tie_weights: bool = False, rnn_type: str = 'lstm', ignore_id: int = 0)
Bases: AbsLM
Sequential implementation of Recurrent Neural Network Language Model.
This class implements a Sequential RNN Language Model (RNNLM) using PyTorch. It supports different types of RNNs including LSTM, GRU, and standard RNNs with tanh or ReLU activations. The model can optionally tie the weights of the output layer with the embedding layer, which is a common technique to improve language models.
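As a concrete illustration of the weight-tying technique (a minimal sketch of the idea, not the ESPnet constructor itself; it is also why nhid must equal unit when tie_weights is enabled):

import torch.nn as nn

vocab_size, unit = 10000, 650
encoder = nn.Embedding(vocab_size, unit)  # input embedding
decoder = nn.Linear(unit, vocab_size)     # output projection
# Share one (vocab_size, unit) matrix between both layers; this only
# works when the decoder input size equals the embedding size.
decoder.weight = encoder.weight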
Attributes:
drop
Dropout layer for regularization.
- Type: nn.Dropout
encoder
Embedding layer for input tokens.
- Type: nn.Embedding
rnn
RNN layer (LSTM, GRU, or RNN).
- Type: nn.Module
decoder
Linear layer for output logits.
- Type: nn.Linear
rnn_type
Type of RNN used (‘LSTM’, ‘GRU’, ‘RNN_TANH’, ‘RNN_RELU’).
- Type: str
nhid
Number of hidden units in the RNN.
- Type: int
nlayers
Number of layers in the RNN.
- Type: int
Parameters:
- vocab_size (int) – Size of the vocabulary.
- unit (int) – Number of units in the embedding layer. Default is 650.
- nhid (Optional[int]) – Number of hidden units in the RNN. Defaults to unit.
- nlayers (int) – Number of layers in the RNN. Default is 2.
- dropout_rate (float) – Dropout rate for regularization. Default is 0.0.
- tie_weights (bool) – If True, tie the weights of the encoder and decoder.
- rnn_type (str) – Type of RNN to use (‘lstm’, ‘gru’, ‘rnn_tanh’, ‘rnn_relu’).
- ignore_id (int) – Padding index for the embedding layer. Default is 0.
Raises: ValueError – If an invalid rnn_type is provided, or if nhid does not equal unit when tie_weights is True.
############# Examples
Create an instance and run a forward pass. The recurrent state passed to forward must carry a batch dimension, with shape (nlayers, batch, nhid):
>>> import torch
>>> model = SequentialRNNLM(vocab_size=10000, unit=650, nlayers=2)
>>> input_tensor = torch.randint(0, 10000, (32, 10))  # batch of 32, seq len 10
>>> hidden_state = (torch.zeros(2, 32, 650), torch.zeros(2, 32, 650))
>>> output, hidden_state = model(input_tensor, hidden_state)
Score a new token for a single hypothesis. zero_state() returns an unbatched state, so add a batch dimension of 1:
>>> h, c = model.zero_state()
>>> logp, new_state = model.score(torch.tensor([5]),
...                               (h.unsqueeze(1), c.unsqueeze(1)),
...                               torch.randn(5, 80))
Batch scoring over a set of hypotheses:
>>> prefix_tokens = torch.randint(0, 10000, (32, 5))  # batch of 32, prefix length 5
>>> states = [model.zero_state() for _ in range(32)]
>>> scores, next_states = model.batch_score(prefix_tokens, states,
...                                         torch.randn(32, 10, 80))
batch_score(ys: Tensor, states: Tensor, xs: Tensor) → Tuple[Tensor, Tensor]
Score new token batch.
- Parameters:
- ys (torch.Tensor) – torch.int64 prefix tokens (n_batch, ylen).
- states (List[Any]) – Scorer states for prefix tokens.
- xs (torch.Tensor) – The encoder feature that generates ys (n_batch, xlen, n_feat).
- Returns: Tuple of batchified scores for the next token, with shape (n_batch, n_vocab), and the next state list for ys.
- Return type: tuple[torch.Tensor, List[Any]]
- Raises:ValueError – If the state format is incorrect or if the model type is unsupported.
############# Examples
>>> model = SequentialRNNLM(vocab_size=1000)
>>> ys = torch.tensor([[1, 2], [3, 4]]) # Example prefix tokens
>>> states = [model.zero_state() for _ in range(2)] # Initial states
>>> xs = torch.randn(2, 10, 300) # Example encoder features
>>> logp, new_states = model.batch_score(ys, states, xs)
>>> print(logp.shape) # Should print torch.Size([2, 1000])
NOTE
This method processes a batch of input tokens and their corresponding states to produce the scores for the next possible tokens.
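Internally, the per-hypothesis states are stacked into one batched recurrent state before the RNN call and split back into a list afterwards. A rough sketch of that reshaping for LSTM states (illustrative, not the exact ESPnet code; each per-hypothesis state has shape (nlayers, nhid)):

import torch

# states: list of per-hypothesis (h, c) pairs, each tensor of shape (nlayers, nhid)
h = torch.stack([h for h, c in states], dim=1)  # (nlayers, n_batch, nhid)
c = torch.stack([c for h, c in states], dim=1)
batched = (h, c)
# ... run the RNN on the last token of each prefix with `batched` ...
# then split the updated batched state back into per-hypothesis entries:
new_states = [(h[:, i], c[:, i]) for i in range(h.size(1))]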
forward(input: Tensor, hidden: Tensor) → Tuple[Tensor, Tensor]
Perform a forward pass through the RNN layer.
This method computes the forward pass of the RNN language model. It takes the input tensor and the hidden state tensor, processes them through the embedding, RNN, and decoder layers, and returns the decoded output and the updated hidden state.
- Parameters:
- input (torch.Tensor) – Input tensor of shape (batch_size, seq_len) containing token indices.
- hidden (torch.Tensor) – Hidden state tensor of shape (num_layers, batch_size, hidden_size).
- Returns: A tuple containing:
  - torch.Tensor: Decoded output tensor of shape (batch_size, seq_len, vocab_size).
  - torch.Tensor: Updated hidden state tensor.
- Return type: Tuple[torch.Tensor, torch.Tensor]
############# Examples
>>> model = SequentialRNNLM(vocab_size=1000, unit=650)
>>> input_tensor = torch.randint(0, 1000, (32, 10))  # (batch_size, seq_len)
>>> # forward expects a batched hidden state of shape (nlayers, batch, nhid);
>>> # zero_state() is unbatched, so build the initial state explicitly here.
>>> hidden_state = (torch.zeros(2, 32, 650), torch.zeros(2, 32, 650))
>>> output, new_hidden = model.forward(input_tensor, hidden_state)
>>> output.shape
torch.Size([32, 10, 1000])  # (batch_size, seq_len, vocab_size)
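For reference, the pipeline described above can be rendered as a small standalone helper (a sketch only; rnnlm_forward is a hypothetical name, and the real method reshapes to 2D before the decoder, which is equivalent to applying the linear layer on the last dimension):

def rnnlm_forward(model, input, hidden):
    # embed tokens and apply dropout: (batch, seq_len) -> (batch, seq_len, unit)
    emb = model.drop(model.encoder(input))
    # run the recurrent layer: output is (batch, seq_len, nhid)
    output, hidden = model.rnn(emb, hidden)
    # project to vocabulary logits: (batch, seq_len, vocab_size)
    decoded = model.decoder(model.drop(output))
    return decoded, hidden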
score(y: Tensor, state: Tensor | Tuple[Tensor, Tensor], x: Tensor) → Tuple[Tensor, Tensor | Tuple[Tensor, Tensor]]
Score new token.
This method computes the log probabilities for the next token based on the provided prefix tokens and the current state of the model.
- Parameters:
- y – 1D torch.int64 tensor representing the prefix tokens.
- state – The current scorer state for the prefix tokens, which can either be a tensor or a tuple of tensors (for LSTM).
- x – 2D torch.Tensor representing the encoder features that generate the tokens in y.
- Returns:
  - torch.float32 tensor containing the scores for the next token (shape: n_vocab).
  - The updated state for the prefix tokens, of the same type as the input state.
- Return type: Tuple[torch.Tensor, Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]]
############# Examples
>>> model = SequentialRNNLM(vocab_size=1000)
>>> prefix_tokens = torch.tensor([1, 2, 3])  # example prefix tokens
>>> h, c = model.zero_state()  # unbatched state for one hypothesis
>>> initial_state = (h.unsqueeze(1), c.unsqueeze(1))  # add batch dim of 1
>>> encoder_features = torch.randn(5, 650)  # example 2D encoder features
>>> scores, new_state = model.score(prefix_tokens, initial_state,
...                                 encoder_features)
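The returned scores are log-probabilities over the full vocabulary, so a quick sanity check is to read off the greedy continuation:

>>> next_token = scores.argmax().item()  # most likely next token id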
zero_state()
Initialize LM state filled with zero values.
This method creates an initial state for the language model (LM) that is filled with zeros. The shape of the state depends on the type of RNN being used. For LSTM networks, the state consists of two tensors representing the hidden state and the cell state. For other RNN types, it returns only the hidden state.
- Returns: A tuple containing the hidden and cell states for LSTM, or a single tensor for other RNN types. The shape of the returned tensor(s) is determined by the number of layers (nlayers) and the number of hidden units (nhid).
- Return type: Union[Tuple[torch.Tensor, torch.Tensor], torch.Tensor]
############# Examples
>>> model = SequentialRNNLM(vocab_size=10000, unit=650, nlayers=2)
>>> initial_state = model.zero_state()
>>> initial_state
(tensor([[0., 0., ..., 0.], [0., 0., ..., 0.]]),
tensor([[0., 0., ..., 0.], [0., 0., ..., 0.]])) # for LSTM
>>> model = SequentialRNNLM(vocab_size=10000, unit=650, nlayers=1,
... rnn_type='RNN_TANH')
>>> initial_state = model.zero_state()
>>> initial_state
tensor([[0., 0., ..., 0.]]) # for RNN
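The unbatched shape is intentional: zero_state() yields one state per hypothesis, and batch_score() stacks those per-hypothesis states into a single batched recurrent state. Continuing from the RNN_TANH model above:

>>> states = [model.zero_state() for _ in range(4)]  # four hypotheses
>>> ys = torch.randint(0, 10000, (4, 3))  # prefix tokens
>>> logp, new_states = model.batch_score(ys, states, torch.randn(4, 7, 80))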