espnet2.asr.transducer.rnnt_multi_blank.utils.cpu_utils.cpu_rnnt.LogSoftmaxGradModification
class espnet2.asr.transducer.rnnt_multi_blank.utils.cpu_utils.cpu_rnnt.LogSoftmaxGradModification(*args, **kwargs)
Bases: Function
Custom autograd function for applying log softmax gradient modifications.
This class implements the forward and backward methods used around the log softmax computation of the CPU RNN-T loss: the forward pass stores the clamp value and returns the activations unchanged, while the backward pass clamps the incoming gradients to the range [-clamp, clamp]. It is particularly useful in scenarios where numerical stability of the gradients is a concern.
clamp
A float value that defines the range for clamping gradients. Should be a non-negative float.
- Parameters:
- acts – A tensor of activation values to which the log softmax is applied.
- clamp – A non-negative float used to clamp the gradient during backpropagation.
- Returns: A tensor containing the same values as acts; the activations pass through unchanged and only their gradients are modified during backpropagation.
- Return type: torch.Tensor
- Raises: ValueError – If clamp is less than 0.0.
######### Examples
>>> import torch
>>> from espnet2.asr.transducer.rnnt_multi_blank.utils.cpu_utils.cpu_rnnt import LogSoftmaxGradModification
>>> acts = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
>>> clamp_value = 0.5
>>> out = LogSoftmaxGradModification.apply(acts, clamp_value)
>>> out.backward(torch.ones_like(out))
>>> print(acts.grad)  # Incoming gradients of 1.0 are clamped to 0.5
tensor([0.5000, 0.5000, 0.5000])
####### NOTE This class should be used as a part of a neural network model where log softmax activation and its gradient are needed.
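The following is a minimal, hedged usage sketch (not the actual ESPnet loss code): the activations are routed through LogSoftmaxGradModification.apply before the log softmax, so the forward values are unaffected while the gradients reaching the activations are clamped.
>>> import torch
>>> import torch.nn.functional as F
>>> from espnet2.asr.transducer.rnnt_multi_blank.utils.cpu_utils.cpu_rnnt import LogSoftmaxGradModification
>>> clamp = 0.5
>>> acts = torch.randn(2, 4, requires_grad=True)  # toy joint-network logits
>>> hooked = LogSoftmaxGradModification.apply(acts, clamp)  # identity in the forward pass
>>> log_probs = F.log_softmax(hooked, dim=-1)  # stand-in for the loss internals
>>> loss = -log_probs[:, 0].sum()  # toy loss
>>> loss.backward()
>>> bool(acts.grad.abs().max() <= clamp)  # gradients reaching acts are clamped
True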
static backward(ctx, grad_output)
Computes the backward pass, clamping the gradients that flow back through the function.
The incoming gradient is clamped element-wise to the range [-clamp, clamp], where clamp is the value stored in the context during the forward pass. This is typically used in RNN-T loss computation, where log softmax is followed by the loss function and unbounded gradients can cause numerical issues.
- Parameters:
- ctx – Context object holding the clamp value stored during the forward pass.
- grad_output (torch.Tensor) – Gradient of the loss with respect to the output of forward.
- Returns: A tuple containing the clamped gradient for acts and None for clamp (which receives no gradient).
- Return type: Tuple[torch.Tensor, None]
######### Examples
>>> import torch
>>> acts = torch.tensor([[0.5, 1.0], [1.5, 2.0]], requires_grad=True)
>>> clamp_value = 0.1
>>> output = LogSoftmaxGradModification.apply(acts, clamp_value)
>>> output.backward(torch.full_like(output, 5.0))
>>> print(acts.grad)  # Each incoming gradient of 5.0 is clamped to 0.1
tensor([[0.1000, 0.1000],
        [0.1000, 0.1000]])
####### NOTE The ctx object is used to store the clamp value for use in the backward pass. The activation values are returned unchanged in the forward pass.
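As a rough illustration (a sketch assuming the stored clamp value is positive), the backward pass amounts to an element-wise clamp of the incoming gradient:
>>> import torch
>>> clamp = 0.1
>>> grad_output = torch.tensor([[5.0, -3.0], [0.05, -0.2]])
>>> torch.clamp(grad_output, -clamp, clamp)
tensor([[ 0.1000, -0.1000],
        [ 0.0500, -0.1000]])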
static forward(ctx, acts, clamp)
Compute the forward pass for the LogSoftmax gradient modification.
This method does not compute the log softmax itself: it validates the clamp value, stores it in the context for the backward pass, and returns the input activations unchanged. The clamping defined by clamp is applied to the gradients during backpropagation.
- Parameters:
- ctx – Context object that can be used to store information for the backward pass.
- acts (torch.Tensor) – Input tensor containing activation values.
- clamp (float) – A non-negative float specifying the range to which gradients are clamped in the backward pass. If clamp is less than 0, a ValueError is raised.
- Returns: A tensor with the same values as acts.
- Return type: torch.Tensor
- Raises: ValueError – If clamp is less than 0.
######### Examples
>>> acts = torch.tensor([[1.0, 2.0, 3.0]], requires_grad=True)
>>> clamp_value = 1.0
>>> output = LogSoftmaxGradModification.apply(acts, clamp_value)
>>> print(output)
tensor([[1., 2., 3.]], grad_fn=<LogSoftmaxGradModificationBackward>)
####### NOTE The forward pass only records the clamp value; the clamping itself is applied to the gradients in the backward pass to avoid numerical issues during gradient computation.
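The non-negativity check can be exercised directly; a negative clamp raises ValueError (a minimal sketch, the exact error message may differ):
>>> import torch
>>> acts = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
>>> LogSoftmaxGradModification.apply(acts, -0.1)  # doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
    ...
ValueError: ...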