espnet2.asr.transducer.rnnt_multi_blank.rnnt.rnnt_loss_gpu
espnet2.asr.transducer.rnnt_multi_blank.rnnt.rnnt_loss_gpu(acts: Tensor, labels: Tensor, input_lengths: Tensor, label_lengths: Tensor, costs: Tensor, grads: Tensor, blank_label: int, fastemit_lambda: float, clamp: float, num_threads: int)
Wrapper method for accessing GPU RNNT loss.
This function computes the Recurrent Neural Network Transducer (RNNT) loss using GPU acceleration. The CUDA implementation is ported from HawkAaron/warp-transducer.
- Parameters:
- acts – Activation tensor of shape [B, T, U, V+1], where B is the batch size, T is the time dimension, U is the target length, and V is the number of classes excluding the blank label.
- labels – Ground truth labels of shape [B, U].
- input_lengths – Lengths of the acoustic sequence as a vector of integers of shape [B].
- label_lengths – Lengths of the target sequence as a vector of integers of shape [B].
- costs – Zero vector of length [B] where costs will be set.
- grads – Zero tensor of shape [B, T, U, V+1] where the gradient will be set.
- blank_label – Index of the blank token in the vocabulary.
- fastemit_lambda – Float scaling factor for FastEmit regularization. Refer to the FastEmit paper for more details.
- clamp – Float value. When set to a value >= 0.0, it clamps the gradient to [-clamp, clamp].
- num_threads – Number of threads for OpenMP. If negative, it will use the number of available CPU cores.
- Returns: True if the operation was successful.
- Return type: bool
- Raises: RuntimeError – If an invalid parameter is passed when calculating workspace memory, or if the forward scores cannot be calculated.
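The buffer shapes must agree before the call; a minimal consistency check (a hypothetical helper, not part of ESPnet) could look like this:

import torch

def check_rnnt_buffers(acts, labels, input_lengths, label_lengths, costs, grads):
    # All shape constraints below follow the parameter descriptions above.
    B, T, U, _ = acts.shape
    assert labels.size(0) == B
    assert input_lengths.shape == (B,) and label_lengths.shape == (B,)
    assert int(input_lengths.max()) <= T                # acoustic lengths fit the time axis
    assert int(label_lengths.max()) <= labels.size(1)   # target lengths fit the label tensor
    assert costs.shape == (B,)                          # per-utterance costs, written in place
    assert grads.shape == acts.shape                    # gradients, written in place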
Examples
>>> import torch
>>> from espnet2.asr.transducer.rnnt_multi_blank.rnnt import rnnt_loss_gpu
>>> B, T, U, V = 16, 50, 11, 19  # U = max label length + 1; V classes plus blank
>>> acts = torch.randn(B, T, U, V + 1).cuda()  # joint network activations
>>> labels = torch.randint(1, V + 1, (B, U - 1), dtype=torch.int32).cuda()
>>> input_lengths = torch.full((B,), T, dtype=torch.int32).cuda()
>>> label_lengths = torch.randint(1, U, (B,), dtype=torch.int32).cuda()
>>> costs = torch.zeros(B).cuda()  # filled in place with per-utterance losses
>>> grads = torch.zeros_like(acts)  # filled in place with gradients
>>> rnnt_loss_gpu(acts, labels, input_lengths, label_lengths, costs,
...               grads, blank_label=0, fastemit_lambda=0.5,
...               clamp=0.1, num_threads=4)
NOTE
This function requires a CUDA-enabled build of PyTorch and an available CUDA device.
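Because rnnt_loss_gpu writes costs and grads in place rather than building an autograd graph, it is typically wrapped in a torch.autograd.Function so that the precomputed gradients flow into backpropagation. The sketch below illustrates that pattern; the class name and structure are illustrative, not ESPnet's actual binding:

import torch
from espnet2.asr.transducer.rnnt_multi_blank.rnnt import rnnt_loss_gpu

class _RNNTLossGPU(torch.autograd.Function):
    # Illustrative wrapper: forward calls the in-place kernel,
    # backward reuses the gradients it produced.

    @staticmethod
    def forward(ctx, acts, labels, input_lengths, label_lengths,
                blank_label, fastemit_lambda, clamp, num_threads):
        costs = torch.zeros(acts.size(0), device=acts.device)
        grads = torch.zeros_like(acts)
        rnnt_loss_gpu(acts, labels, input_lengths, label_lengths,
                      costs, grads, blank_label, fastemit_lambda,
                      clamp, num_threads)
        ctx.save_for_backward(grads)
        return costs

    @staticmethod
    def backward(ctx, grad_output):
        (grads,) = ctx.saved_tensors
        # Scale the stored per-element gradients by the incoming
        # gradient of each per-utterance cost.
        grad_acts = grads * grad_output.view(-1, 1, 1, 1)
        # Only acts receives a gradient; the remaining inputs do not.
        return grad_acts, None, None, None, None, None, None, None

The per-utterance costs returned by forward can then be reduced (for example, costs.mean()) and used as an ordinary PyTorch loss.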