espnet2.speechlm.definitions.pad_until
espnet2.speechlm.definitions.pad_until(token_list, until)
Pad a list of tokens until it reaches the specified length.
This function appends unused tokens to the provided token list until the list’s length equals the specified ‘until’ value. Each unused token is named in the format <unused_token_{index}>, where {index} is the position the token occupies in the list, that is, the list’s length at the moment the token is appended.
- Parameters:
- token_list (list) – The list of tokens to be padded.
- until (int) – The desired length of the token list after padding.
- Returns: The padded list of tokens.
- Return type: list
- Raises:
- AssertionError – If ‘until’ is not greater than the current length of ‘token_list’.
Examples
>>> tokens = ['<pad>', '<unk>']
>>> pad_until(tokens, 5)
['<pad>', '<unk>', '<unused_token_2>', '<unused_token_3>',
'<unused_token_4>']
>>> pad_until([], 3)
['<unused_token_0>', '<unused_token_1>', '<unused_token_2>']
NOTE
The function modifies the original list in place.
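The behavior described above can be sketched as a few lines of plain Python. This is a hypothetical re-implementation written only from the description on this page, not the ESPnet source. The name pad_until_sketch and the assertion message are assumptions made for illustration, and the sketch also assumes the returned list is the same object that was passed in.

```python
def pad_until_sketch(token_list, until):
    """Hypothetical stand-in for pad_until, based on the documented behavior."""
    # Documented precondition: the target length must exceed the current length.
    assert until > len(token_list), "until must be greater than len(token_list)"
    # Each new token is named after the index it will occupy in the list.
    for index in range(len(token_list), until):
        token_list.append(f"<unused_token_{index}>")
    # Assumption: the same list object is returned after in-place padding.
    return token_list


tokens = ["<pad>", "<unk>"]
padded = pad_until_sketch(tokens, 5)
print(padded)
print(padded is tokens)  # the caller's list was extended in place

try:
    # The list already has 5 entries, so a target of 5 violates the precondition.
    pad_until_sketch(tokens, 5)
except AssertionError as err:
    print("AssertionError:", err)
```

Because the list is changed in place, callers who need to keep the unpadded vocabulary can pass a copy, for example pad_until_sketch(list(tokens), 5).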