espnet2.spk.loss.aamsoftmax.AAMSoftmax
class espnet2.spk.loss.aamsoftmax.AAMSoftmax(nout, nclasses, margin=0.3, scale=15, easy_margin=False, **kwargs)
Bases: AbsLoss
Additive Angular Margin Softmax (AAMSoftmax) loss.
This class implements the AAMSoftmax loss function, which adds an angular margin to the softmax objective to make the learned embeddings more discriminative. The formulation follows Deng et al. (2019), “ArcFace: Additive Angular Margin Loss for Deep Face Recognition,” applied here to speaker embeddings.
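For reference, with both the embeddings and the class weight vectors L2-normalized, the AAMSoftmax objective over a batch of N samples is (as given in the ArcFace paper):

$$
L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j\neq y_i}e^{s\cos\theta_j}}
$$

where $\theta_j$ is the angle between the embedding and the weight vector of class $j$, $m$ is the additive angular margin, and $s$ is the scale.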
test_normalize
Indicates whether to normalize during testing.
- Type: bool
m
Margin value for AAMSoftmax.
- Type: float
s
Scale value for AAMSoftmax.
- Type: float
in_feats
Dimensionality of speaker embedding.
- Type: int
weight
Learnable parameter for class weights.
- Type: torch.nn.Parameter
ce
Cross-entropy loss function.
- Type: nn.CrossEntropyLoss
easy_margin
Flag to use easy margin or standard margin.
- Type: bool
cos_m
Cosine of the margin.
- Type: float
sin_m
Sine of the margin.
- Type: float
th
Threshold for cosine values.
- Type: float
mm
Modified margin value used in the non-monotonic region.
- Type: float
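The four margin-derived attributes follow from the angle-addition identity cos(θ + m) = cos θ·cos m − sin θ·sin m. A minimal sketch of how they are typically precomputed from the margin (illustrative of the common ArcFace recipe, not a verbatim quote of the ESPnet source):

```python
import math

margin = 0.3                               # constructor default
cos_m = math.cos(margin)                   # for expanding cos(theta + margin)
sin_m = math.sin(margin)                   # for expanding cos(theta + margin)
th = math.cos(math.pi - margin)            # below this cosine, theta + margin > pi
mm = math.sin(math.pi - margin) * margin   # linear fallback penalty in that region
```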
Parameters:
- nout (int) – Dimensionality of speaker embedding.
- nclasses (int) – Number of speakers in the training set.
- margin (float, optional) – Margin value of AAMSoftmax (default: 0.3).
- scale (float, optional) – Scale value of AAMSoftmax (default: 15).
- easy_margin (bool, optional) – If True, use the easy margin (default: False).
Returns: The computed loss value.
Return type: torch.Tensor
####### Examples
>>> import torch
>>> from espnet2.spk.loss.aamsoftmax import AAMSoftmax
>>> aamsoftmax = AAMSoftmax(nout=512, nclasses=10)
>>> x = torch.randn(32, 512) # Batch of 32 embeddings
>>> labels = torch.randint(0, 10, (32,)) # Random labels for 10 classes
>>> loss = aamsoftmax(x, labels)
>>> print(loss)
NOTE
This implementation is adapted from the VoxCeleb trainer and the Face_Pytorch repository.
forward(x, label=None)
Computes the forward pass of the Additive Angular Margin Softmax (AAMSoftmax).
This method calculates the loss from the input features and their corresponding labels. It applies the AAMSoftmax transformation to the input embeddings, incorporating the additive angular margin to sharpen class separation.
- Parameters:
- x (torch.Tensor) – Input features of shape (batch_size, nout), where nout is the dimensionality of the speaker embedding.
- label (torch.Tensor, optional) – Ground truth labels of shape (batch_size,) or (batch_size, 1). Defaults to None.
- Returns: The computed loss value.
- Return type: torch.Tensor
- Raises:
- AssertionError – If the size of label does not match the first dimension of x, or if x does not have the expected number of features.
####### Examples
>>> import torch
>>> from espnet2.spk.loss.aamsoftmax import AAMSoftmax
>>> aamsoftmax = AAMSoftmax(nout=512, nclasses=10)
>>> features = torch.randn(32, 512) # Batch of 32 samples
>>> labels = torch.randint(0, 10, (32,)) # Random labels for 10 classes
>>> loss = aamsoftmax(features, labels)
>>> print(loss)
NOTE
The method normalizes the input features and weights before calculating cosine similarities. The label tensor should contain class indices corresponding to the input features.
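To make this NOTE concrete, here is a minimal sketch of the logit computation it describes; aamsoftmax_logits is a hypothetical helper written for illustration, not the exact ESPnet implementation:

```python
import torch
import torch.nn.functional as F

def aamsoftmax_logits(x, weight, label, cos_m, sin_m, th, mm, s, easy_margin=False):
    # Cosine similarity between L2-normalized embeddings and class weights.
    cosine = F.linear(F.normalize(x), F.normalize(weight))  # (batch, nclasses)
    sine = torch.sqrt((1.0 - cosine.pow(2)).clamp(0, 1))
    # cos(theta + m) via the angle-addition identity.
    phi = cosine * cos_m - sine * sin_m
    if easy_margin:
        phi = torch.where(cosine > 0, phi, cosine)
    else:
        # Keep the target logit monotonic once theta + m would exceed pi.
        phi = torch.where(cosine - th > 0, phi, cosine - mm)
    # Apply the margin only to the ground-truth class, then scale.
    one_hot = torch.zeros_like(cosine).scatter_(1, label.view(-1, 1), 1)
    return s * (one_hot * phi + (1.0 - one_hot) * cosine)
```

The loss itself is then cross-entropy (e.g. nn.CrossEntropyLoss) applied to these scaled logits and label.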