espnet2.speechlm.core_lm.abs_core_lm.SpeechLMInferenceOptions
class espnet2.speechlm.core_lm.abs_core_lm.SpeechLMInferenceOptions(device: str = 'cpu', search_algo: str = 'sampling', nbest: int = 1, sampling_temperature: float = 1.0, top_k: int = 20, maxlenratio: float = 0.0, minlenratio: float = 0.0, eos: int = 5, start: int = 1, masks: Tensor | None = None, nq: int | None = None)
Bases: object
Options for inference in Speech Language Models.
This class holds various parameters that control the behavior of the inference process for Speech Language Models. Users can adjust these parameters to fine-tune the model’s output according to their needs.
device
The device to run the model on. Default is “cpu”.
- Type: str
search_algo
The search algorithm used to select the next token. Default is "sampling".
- Type: str
nbest
The number of best candidates to consider. Default is 1.
- Type: int
sampling_temperature
The temperature parameter for sampling. Higher values lead to more randomness. Default is 1.0.
- Type: float
top_k
The number of top candidates to sample from. Default is 20.
- Type: int
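To illustrate how `sampling_temperature` and `top_k` interact, here is a minimal, dependency-free sketch of temperature-scaled top-k sampling. The function name and the deterministic fallback RNG are illustrative, not part of this class's API: keep the `top_k` highest logits, divide by `temperature` (higher values flatten the distribution), renormalize with a softmax, and draw one token id.

```python
import math
import random


def top_k_sample(logits, top_k=20, temperature=1.0, rng=None):
    """Hypothetical sketch of temperature + top-k sampling over raw logits."""
    rng = rng or random.Random(0)  # fixed seed only for reproducibility here
    # Indices of the top_k largest logits; all others are discarded.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Temperature scaling: higher temperature -> flatter distribution.
    scaled = [logits[i] / temperature for i in top]
    # Numerically stable softmax over the surviving logits.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one token id proportionally to its probability.
    r, acc = rng.random(), 0.0
    for idx, p in zip(top, probs):
        acc += p
        if r < acc:
            return idx
    return top[-1]
```

With a very low temperature the highest-logit token dominates; with `temperature=1.0` the top-k candidates compete in proportion to their logits.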
maxlenratio
The maximum length ratio for the generated output compared to the input. Default is 0.0.
- Type: float
minlenratio
The minimum length ratio for the generated output compared to the input. Default is 0.0.
- Type: float
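The two ratio options bound the generated length relative to the input length. A minimal sketch of one common convention (used, for example, in ESPnet's beam-search decoders; the function name and `model_max` fallback are assumptions for illustration): a `maxlenratio` of 0.0 means no ratio-based cap, falling back to the model's own maximum length.

```python
def length_bounds(input_len, maxlenratio=0.0, minlenratio=0.0, model_max=1024):
    """Hypothetical sketch: derive min/max output lengths from the input length."""
    # maxlenratio == 0.0 conventionally means "no ratio-based cap":
    # fall back to the model's own maximum sequence length.
    if maxlenratio == 0.0:
        maxlen = model_max
    else:
        maxlen = max(1, int(maxlenratio * input_len))
    # Generation shorter than minlen would be disallowed (eos suppressed).
    minlen = int(minlenratio * input_len)
    return minlen, maxlen
```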
eos
The end-of-sequence token ID. Default is 5.
- Type: int
start
The start token ID. Default is 1.
- Type: int
masks
Optional masks for the input sequences. Default is None.
- Type: torch.Tensor | None
nq
Number of queries for the model. Default is None.
- Type: int | None
Examples
>>> options = SpeechLMInferenceOptions(
... device="cuda",
... search_algo="beam_search",
... nbest=5,
... sampling_temperature=0.8
... )
>>> print(options.device)
cuda