espnet2.asr.transducer.rnnt_multi_blank.rnnt.rnnt_loss_gpu
espnet2.asr.transducer.rnnt_multi_blank.rnnt.rnnt_loss_gpu(acts: Tensor, labels: Tensor, input_lengths: Tensor, label_lengths: Tensor, costs: Tensor, grads: Tensor, blank_label: int, fastemit_lambda: float, clamp: float, num_threads: int)
Wrapper method for accessing GPU RNNT loss.
This function computes the Recurrent Neural Network Transducer (RNNT) loss using GPU acceleration. The CUDA implementation is ported from HawkAaron/warp-transducer.
- Parameters:
- acts – Activation tensor of shape [B, T, U, V+1], where B is the batch size, T is the time dimension, U is the target length, and V is the number of classes excluding the blank label.
- labels – Ground truth labels of shape [B, U].
- input_lengths – Lengths of the acoustic sequence as a vector of integers of shape [B].
- label_lengths – Lengths of the target sequence as a vector of integers of shape [B].
- costs – Zero vector of length [B] where costs will be set.
- grads – Zero tensor of shape [B, T, U, V+1] where the gradient will be set.
- blank_label – Index of the blank token in the vocabulary.
- fastemit_lambda – Float scaling factor for FastEmit regularization. Refer to the FastEmit paper for more details.
- clamp – Float value. When set to a value >= 0.0, it clamps the gradient to [-clamp, clamp].
- num_threads – Number of threads for OpenMP. If negative, it will use the number of available CPU cores.
- Returns: True if the operation was successful.
- Return type: bool
- Raises: RuntimeError – If an invalid parameter is passed when calculating workspace memory, or if the forward scores cannot be calculated.
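The buffer shapes must agree before the call; a minimal consistency check (a hypothetical helper, not part of ESPnet) could look like this:

import torch

def check_rnnt_buffers(acts, labels, input_lengths, label_lengths, costs, grads):
    # All shape constraints below follow the parameter descriptions above.
    B, T, U, _ = acts.shape
    assert labels.size(0) == B
    assert input_lengths.shape == (B,) and label_lengths.shape == (B,)
    assert int(input_lengths.max()) <= T                # acoustic lengths fit the time axis
    assert int(label_lengths.max()) <= labels.size(1)   # target lengths fit the label tensor
    assert costs.shape == (B,)                          # per-utterance costs, written in place
    assert grads.shape == acts.shape                    # gradients, written in place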
Examples
>>> import torch
>>> from espnet2.asr.transducer.rnnt_multi_blank.rnnt import rnnt_loss_gpu
>>> B, T, U, V = 16, 50, 11, 19  # U = max label length + 1; V classes plus blank
>>> acts = torch.randn(B, T, U, V + 1).cuda()  # joint network activations
>>> labels = torch.randint(1, V + 1, (B, U - 1), dtype=torch.int32).cuda()
>>> input_lengths = torch.full((B,), T, dtype=torch.int32).cuda()
>>> label_lengths = torch.randint(1, U, (B,), dtype=torch.int32).cuda()
>>> costs = torch.zeros(B).cuda()  # filled in place with per-utterance losses
>>> grads = torch.zeros_like(acts)  # filled in place with gradients
>>> rnnt_loss_gpu(acts, labels, input_lengths, label_lengths, costs,
...               grads, blank_label=0, fastemit_lambda=0.5,
...               clamp=0.1, num_threads=4)
NOTE
This function requires a CUDA-enabled build of PyTorch and an available CUDA device.
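Because rnnt_loss_gpu writes costs and grads in place rather than building an autograd graph, it is typically wrapped in a torch.autograd.Function so that the precomputed gradients flow into backpropagation. The sketch below illustrates that pattern; the class name and structure are illustrative, not ESPnet's actual binding:

import torch
from espnet2.asr.transducer.rnnt_multi_blank.rnnt import rnnt_loss_gpu

class _RNNTLossGPU(torch.autograd.Function):
    # Illustrative wrapper: forward calls the in-place kernel,
    # backward reuses the gradients it produced.

    @staticmethod
    def forward(ctx, acts, labels, input_lengths, label_lengths,
                blank_label, fastemit_lambda, clamp, num_threads):
        costs = torch.zeros(acts.size(0), device=acts.device)
        grads = torch.zeros_like(acts)
        rnnt_loss_gpu(acts, labels, input_lengths, label_lengths,
                      costs, grads, blank_label, fastemit_lambda,
                      clamp, num_threads)
        ctx.save_for_backward(grads)
        return costs

    @staticmethod
    def backward(ctx, grad_output):
        (grads,) = ctx.saved_tensors
        # Scale the stored per-element gradients by the incoming
        # gradient of each per-utterance cost.
        grad_acts = grads * grad_output.view(-1, 1, 1, 1)
        # Only acts receives a gradient; the remaining inputs do not.
        return grad_acts, None, None, None, None, None, None, None

The per-utterance costs returned by forward can then be reduced (for example, costs.mean()) and used as an ordinary PyTorch loss.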