espnet2.legacy.nets.batch_beam_search.BatchBeamSearch
class espnet2.legacy.nets.batch_beam_search.BatchBeamSearch(scorers: Dict[str, ScorerInterface], weights: Dict[str, float], beam_size: int, vocab_size: int, sos: int, eos: int, token_list: List[str] = None, pre_beam_ratio: float = 1.5, pre_beam_score_key: str = None, return_hs: bool = False, hyp_primer: List[int] = None, normalize_length: bool = False)
Bases: BeamSearch
Batch beam search implementation.
Initialize beam search.
- Parameters:
- scorers (dict[str, ScorerInterface]) – Dict of decoder modules, e.g., Decoder, CTCPrefixScorer, LM. A scorer is ignored if it is None.
- weights (dict[str, float]) – Dict of weights for each scorer. A scorer is ignored if its weight is 0.
- beam_size (int) – The number of hypotheses kept during search.
- vocab_size (int) – The size of the vocabulary.
- sos (int) – Start-of-sequence id.
- eos (int) – End-of-sequence id.
- token_list (list[str]) – List of tokens for debug logging.
- pre_beam_score_key (str) – Key of the scores used to perform pre-beam search.
- pre_beam_ratio (float) – The beam size in the pre-beam search will be int(pre_beam_ratio * beam_size).
- return_hs (bool) – Whether to return hidden intermediates.
- normalize_length (bool) – If true, select the best ended hypotheses based on length-normalized scores rather than the accumulated scores.
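As a rough illustration of two rules stated in the parameter list (a scorer is skipped when it is None or its weight is 0, and the pre-beam size is int(pre_beam_ratio * beam_size)), the following pure-Python sketch may help. The helper names `active_weights` and `pre_beam_size` are hypothetical and are not part of ESPnet; the scorer objects are placeholders.

```python
def active_weights(scorers, weights):
    """Keep weights only for scorers that exist and have a nonzero weight.

    Hypothetical helper for illustration; not an ESPnet function.
    """
    return {
        name: w
        for name, w in weights.items()
        if w != 0 and scorers.get(name) is not None
    }


def pre_beam_size(beam_size, pre_beam_ratio=1.5):
    """Beam size used in the pre-beam search, as described in the docs."""
    return int(pre_beam_ratio * beam_size)


# Placeholder scorers: plain objects stand in for real ScorerInterface modules.
scorers = {"decoder": object(), "ctc": object(), "lm": None}
weights = {"decoder": 0.7, "ctc": 0.3, "lm": 0.5, "length_bonus": 0.0}

kept = active_weights(scorers, weights)  # "lm" is None, "length_bonus" has weight 0
```

With the default pre_beam_ratio of 1.5, a beam_size of 10 would give a pre-beam size of 15, and a beam_size of 3 would give 4 because int() truncates 4.5.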
batch_beam(weighted_scores: Tensor, ids: Tensor) → Tuple[Tensor, Tensor, Tensor, Tensor]
Batch-compute topk full token ids and partial token ids.
- Parameters:
- weighted_scores (torch.Tensor) – The weighted sum scores for each token. Its shape is (n_beam, self.vocab_size).
- ids (torch.Tensor) – The partial token ids to compute topk. Its shape is (n_beam, self.pre_beam_size).
- Returns: The topk full (prev_hyp, new_token) ids and partial (prev_hyp, new_token) ids. Their shapes are all (self.beam_size,).
- Return type: Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]
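One plausible way to obtain (prev_hyp, new_token) pairs from an (n_beam, vocab_size) score matrix is to flatten it, take the top-k flat indices, and recover the row and column with integer division and modulo. The sketch below shows that idea with plain Python lists instead of tensors. It is an assumption about the technique, not the ESPnet implementation, and the hypothetical `batch_topk` returns only one pair of id lists rather than the four tensors documented above.

```python
import heapq


def batch_topk(weighted_scores, beam_size):
    """Return (prev_hyp_ids, new_token_ids) of the beam_size best entries.

    weighted_scores is a list of rows, one row of vocab scores per hypothesis.
    """
    n_vocab = len(weighted_scores[0])
    # Flatten row-major so that flat index = hyp_index * n_vocab + token_id.
    flat = [s for row in weighted_scores for s in row]
    top_flat = heapq.nlargest(beam_size, range(len(flat)), key=flat.__getitem__)
    prev_hyp_ids = [i // n_vocab for i in top_flat]
    new_token_ids = [i % n_vocab for i in top_flat]
    return prev_hyp_ids, new_token_ids


scores = [
    [-1.0, -0.2, -3.0],  # hypothesis 0
    [-0.5, -2.0, -0.1],  # hypothesis 1
]
prev_ids, token_ids = batch_topk(scores, beam_size=3)
```

Here the three best entries are (hypothesis 1, token 2), (hypothesis 0, token 1), and (hypothesis 1, token 0), in descending score order.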
batchfy(hyps: List[Hypothesis]) → BatchHypothesis
Convert list to batch.
init_hyp(x: Tensor) → BatchHypothesis
Get the initial hypothesis data.
- Parameters: x (torch.Tensor) – The encoder output feature
- Returns: The initial hypothesis.
- Return type: BatchHypothesis
merge_states(states: Any, part_states: Any, part_idx: int) → Any
Merge states for new hypothesis.
- Parameters:
- states – States of self.full_scorers.
- part_states – States of self.part_scorers.
- part_idx (int) – The new token id for part_scores.
- Returns: The new score dict. Its keys are names of self.full_scorers and self.part_scorers. Its values are states of the scorers.
- Return type: Dict[str, torch.Tensor]
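The merge described above can be pictured as copying the full-scorer states unchanged and, for each partial scorer, selecting the state that corresponds to the chosen token. The sketch below is a simplified illustration under that assumption. The `selectors` argument is hypothetical: it stands in for whatever per-scorer state selection the real partial scorers provide.

```python
def merge_states(states, part_states, part_idx, selectors):
    """Combine full-scorer states with the selected partial-scorer states.

    selectors maps a partial scorer name to a callable (state, part_idx) -> state.
    This is an illustrative stand-in, not the ESPnet method.
    """
    new_states = dict(states)  # full-scorer states are carried over as-is
    for name, select in selectors.items():
        new_states[name] = select(part_states[name], part_idx)
    return new_states


# Toy states: the "ctc" partial scorer keeps one state per candidate token.
full = {"decoder": "dec_state"}
partial = {"ctc": ["ctc_s0", "ctc_s1", "ctc_s2"]}
merged = merge_states(
    full, partial, part_idx=1, selectors={"ctc": lambda st, i: st[i]}
)
```

The result contains one entry per scorer name, drawn from both the full and the partial scorers, which matches the key layout described in the Returns entry.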
post_process(i: int, maxlen: int, minlen: int, maxlenratio: float, running_hyps: BatchHypothesis, ended_hyps: List[Hypothesis]) → BatchHypothesis
Perform post-processing of beam search iterations.
- Parameters:
- i (int) – The length of hypothesis tokens.
- maxlen (int) – The maximum length of tokens in beam search.
- minlen (int) – The minimum length of tokens in beam search.
- maxlenratio (float) – The maximum length ratio in beam search.
- running_hyps (BatchHypothesis) – The running hypotheses in beam search.
- ended_hyps (List [Hypothesis ]) – The ended hypotheses in beam search.
- Returns: The new running hypotheses.
- Return type: BatchHypothesis
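A minimal sketch of what such a post-processing step might look like follows, assuming that hypotheses ending in eos are moved to the ended list and that eos is appended to every running hypothesis on the last allowed step. It represents hypotheses as plain (tokens, score) tuples and ignores minlen and maxlenratio. The second helper, `best_ended`, illustrates the normalize_length option from the constructor. Both function bodies are assumptions for illustration, not the ESPnet code.

```python
EOS = 2  # hypothetical end-of-sequence id for this example


def post_process(i, maxlen, running, ended, eos=EOS):
    """Move finished hypotheses into ended and return those still running."""
    if i == maxlen - 1:
        # Last step: force every remaining hypothesis to end.
        running = [(tokens + [eos], score) for tokens, score in running]
    still_running = []
    for tokens, score in running:
        if tokens[-1] == eos:
            ended.append((tokens, score))
        else:
            still_running.append((tokens, score))
    return still_running


def best_ended(ended, normalize_length=False):
    """Pick the best ended hypothesis, optionally by length-normalized score."""
    def key(item):
        tokens, score = item
        return score / len(tokens) if normalize_length else score
    return max(ended, key=key)


ended = []
after_step_1 = post_process(1, 5, [([1, 4, 2], -0.9), ([1, 4, 6], -0.5)], ended)
after_last_step = post_process(4, 5, [([1, 4, 6, 7, 8], -1.2)], ended)
```

In this example the short hypothesis wins on accumulated score (-0.9 versus -1.2), while the longer one wins once scores are divided by length (-0.2 versus -0.3).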
score_full(hyp: BatchHypothesis, x: Tensor, pre_x: Tensor = None) → Tuple[Dict[str, Tensor], Dict[str, Any]]
Score new hypothesis by self.full_scorers.
- Parameters:
- hyp (BatchHypothesis) – Hypothesis with prefix tokens to score.
- x (torch.Tensor) – Corresponding input feature.
- pre_x (torch.Tensor) – Encoded speech feature for sequential attention, of shape (T, D). Sequential attention computes attention first on pre_x and then on x, thereby attending to two sources in sequence.
- Returns: Tuple of a score dict of hyp that has string keys of self.full_scorers and tensor score values of shape (self.n_vocab,), and a state dict that has string keys and state values of self.full_scorers.
- Return type: Tuple[Dict[str, torch.Tensor], Dict[str, Any]]
score_partial(hyp: BatchHypothesis, ids: Tensor, x: Tensor, pre_x: Tensor = None) → Tuple[Dict[str, Tensor], Dict[str, Any]]
Score new hypothesis by self.part_scorers.
- Parameters:
- hyp (BatchHypothesis) – Hypothesis with prefix tokens to score.
- ids (torch.Tensor) – 2D tensor of new partial tokens to score.
- x (torch.Tensor) – Corresponding input feature.
- pre_x (torch.Tensor) – Encoded speech feature for sequential attention, of shape (T, D). Sequential attention computes attention first on pre_x and then on x, thereby attending to two sources in sequence.
- Returns: Tuple of a score dict of hyp that has string keys of self.part_scorers and tensor score values of shape (self.n_vocab,), and a state dict that has string keys and state values of self.part_scorers.
- Return type: Tuple[Dict[str, torch.Tensor], Dict[str, Any]]
search(running_hyps: BatchHypothesis, x: Tensor, pre_x: Tensor = None) → BatchHypothesis
Search new tokens for running hypotheses and encoded speech x.
- Parameters:
- running_hyps (BatchHypothesis) – Running hypotheses on beam
- x (torch.Tensor) – Encoded speech feature (T, D)
- pre_x (torch.Tensor) – Encoded speech feature for sequential attention (T, D)
- Returns: Best sorted hypotheses
- Return type: BatchHypothesis
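A search step of this kind typically combines the per-scorer score matrices into one weighted sum and adds each running hypothesis's accumulated score before picking the best candidates. The sketch below shows only that combination step with nested Python lists. The function names and the toy numbers are illustrative assumptions, and the real method operates on tensors and also handles pre-beam selection and state updates.

```python
def weighted_sum(scores, weights):
    """Sum scorer outputs of shape [n_beam][n_vocab], scaled by their weights."""
    names = list(scores)
    n_beam = len(scores[names[0]])
    n_vocab = len(scores[names[0]][0])
    total = [[0.0] * n_vocab for _ in range(n_beam)]
    for name in names:
        w = weights[name]
        for b in range(n_beam):
            for v in range(n_vocab):
                total[b][v] += w * scores[name][b][v]
    return total


def add_running_scores(total, running_scores):
    """Add each hypothesis's accumulated score to its row of candidates."""
    return [[s + running_scores[b] for s in row] for b, row in enumerate(total)]


scorer_outputs = {
    "decoder": [[-1.0, -2.0], [-4.0, 0.0]],
    "ctc": [[-3.0, -4.0], [-2.0, -6.0]],
}
scorer_weights = {"decoder": 0.5, "ctc": 0.5}

combined = weighted_sum(scorer_outputs, scorer_weights)
candidates = add_running_scores(combined, running_scores=[-1.0, -0.5])
```

The resulting candidate matrix has one row per running hypothesis and one column per token, which is the shape that batch_beam expects for its weighted_scores argument.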
unbatchfy(batch_hyps: BatchHypothesis) → List[Hypothesis]
Revert batch to list.
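To make the batchfy and unbatchfy round trip concrete, here is a small sketch using simplified stand-ins for Hypothesis and BatchHypothesis. The `Hyp` and `BatchHyp` classes, their fields, and the choice of padding id are assumptions for illustration; the real classes hold tensors along with per-scorer scores and states.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hyp:
    """Simplified single hypothesis (stand-in for Hypothesis)."""
    yseq: List[int]
    score: float


@dataclass
class BatchHyp:
    """Simplified batch of hypotheses (stand-in for BatchHypothesis)."""
    yseq: List[List[int]]  # token sequences padded to a common length
    length: List[int]      # true length of each sequence
    score: List[float]


def batchfy(hyps, pad_id):
    """Convert a list of hypotheses into one padded batch."""
    if not hyps:
        return BatchHyp([], [], [])
    max_len = max(len(h.yseq) for h in hyps)
    return BatchHyp(
        yseq=[h.yseq + [pad_id] * (max_len - len(h.yseq)) for h in hyps],
        length=[len(h.yseq) for h in hyps],
        score=[h.score for h in hyps],
    )


def unbatchfy(batch):
    """Revert a batch to a list, trimming the padding via the stored lengths."""
    return [
        Hyp(yseq=row[:n], score=s)
        for row, n, s in zip(batch.yseq, batch.length, batch.score)
    ]


hyps = [Hyp([1, 5, 7], -0.4), Hyp([1, 9], -1.2)]
batch = batchfy(hyps, pad_id=2)
restored = unbatchfy(batch)
```

Storing the true lengths alongside the padded sequences is what lets the reverse conversion recover the original hypotheses exactly.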
