espnet2.spk.loss.aamsoftmax.AAMSoftmax
class espnet2.spk.loss.aamsoftmax.AAMSoftmax(nout, nclasses, margin=0.3, scale=15, easy_margin=False, **kwargs)
Bases: AbsLoss
Additive Angular Margin Softmax (AAMSoftmax) loss.
This class implements the AAMSoftmax loss function, which adds an angular margin to the softmax objective to make the learned embeddings more discriminative. The formulation follows Deng et al. (2019), “ArcFace: Additive Angular Margin Loss for Deep Face Recognition,” applied here to speaker embeddings.
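For reference, with both the embeddings and the class weight vectors L2-normalized, the AAMSoftmax objective over a batch of N samples is (as given in the ArcFace paper):

$$
L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j\neq y_i}e^{s\cos\theta_j}}
$$

where $\theta_j$ is the angle between the embedding and the weight vector of class $j$, $m$ is the additive angular margin, and $s$ is the scale.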
test_normalize
Indicates whether to normalize during testing.
- Type: bool
m
Margin value for AAMSoftmax.
- Type: float
s
Scale value for AAMSoftmax.
- Type: float
in_feats
Dimensionality of speaker embedding.
- Type: int
weight
Learnable parameter for class weights.
- Type: torch.nn.Parameter
ce
Cross-entropy loss function.
- Type: nn.CrossEntropyLoss
easy_margin
Flag to use easy margin or standard margin.
- Type: bool
cos_m
Cosine of the margin.
- Type: float
sin_m
Sine of the margin.
- Type: float
th
Threshold for cosine values.
- Type: float
mm
Modified margin value used in the non-monotonic region.
- Type: float
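The four margin-derived attributes follow from the angle-addition identity cos(θ + m) = cos θ·cos m − sin θ·sin m. A minimal sketch of how they are typically precomputed from the margin (illustrative of the common ArcFace recipe, not a verbatim quote of the ESPnet source):

```python
import math

margin = 0.3                               # constructor default
cos_m = math.cos(margin)                   # for expanding cos(theta + margin)
sin_m = math.sin(margin)                   # for expanding cos(theta + margin)
th = math.cos(math.pi - margin)            # below this cosine, theta + margin > pi
mm = math.sin(math.pi - margin) * margin   # linear fallback penalty in that region
```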
Parameters:
- nout (int) – Dimensionality of speaker embedding.
- nclasses (int) – Number of speakers in the training set.
- margin (float, optional) – Margin value of AAMSoftmax (default: 0.3).
- scale (float, optional) – Scale value of AAMSoftmax (default: 15).
- easy_margin (bool, optional) – If True, use the easy margin (default: False).
Returns: The computed loss value.
Return type: torch.Tensor
####### Examples
>>> import torch
>>> from espnet2.spk.loss.aamsoftmax import AAMSoftmax
>>> aamsoftmax = AAMSoftmax(nout=512, nclasses=10)
>>> x = torch.randn(32, 512) # Batch of 32 embeddings
>>> labels = torch.randint(0, 10, (32,)) # Random labels for 10 classes
>>> loss = aamsoftmax(x, labels)
>>> print(loss)
NOTE
This implementation is adapted from the VoxCeleb trainer and the Face_Pytorch repository.
forward(x, label=None)
Computes the forward pass of the Additive Angular Margin Softmax (AAMSoftmax).
This method calculates the loss from the input features and their corresponding labels. It applies the AAMSoftmax transformation to the input embeddings, incorporating the additive angular margin to sharpen class separation.
- Parameters:
- x (torch.Tensor) – Input features of shape (batch_size, nout), where nout is the dimensionality of the speaker embedding.
- label (torch.Tensor, optional) – Ground truth labels of shape (batch_size,) or (batch_size, 1). Defaults to None.
- Returns: The computed loss value.
- Return type: torch.Tensor
- Raises:
- AssertionError – If the size of label does not match the first dimension of x, or if x does not have the expected number of features.
####### Examples
>>> import torch
>>> from espnet2.spk.loss.aamsoftmax import AAMSoftmax
>>> aamsoftmax = AAMSoftmax(nout=512, nclasses=10)
>>> features = torch.randn(32, 512) # Batch of 32 samples
>>> labels = torch.randint(0, 10, (32,)) # Random labels for 10 classes
>>> loss = aamsoftmax(features, labels)
>>> print(loss)
NOTE
The method normalizes the input features and weights before calculating cosine similarities. The label tensor should contain class indices corresponding to the input features.
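To make this NOTE concrete, here is a minimal sketch of the logit computation it describes; aamsoftmax_logits is a hypothetical helper written for illustration, not the exact ESPnet implementation:

```python
import torch
import torch.nn.functional as F

def aamsoftmax_logits(x, weight, label, cos_m, sin_m, th, mm, s, easy_margin=False):
    # Cosine similarity between L2-normalized embeddings and class weights.
    cosine = F.linear(F.normalize(x), F.normalize(weight))  # (batch, nclasses)
    sine = torch.sqrt((1.0 - cosine.pow(2)).clamp(0, 1))
    # cos(theta + m) via the angle-addition identity.
    phi = cosine * cos_m - sine * sin_m
    if easy_margin:
        phi = torch.where(cosine > 0, phi, cosine)
    else:
        # Keep the target logit monotonic once theta + m would exceed pi.
        phi = torch.where(cosine - th > 0, phi, cosine - mm)
    # Apply the margin only to the ground-truth class, then scale.
    one_hot = torch.zeros_like(cosine).scatter_(1, label.view(-1, 1), 1)
    return s * (one_hot * phi + (1.0 - one_hot) * cosine)
```

The loss itself is then cross-entropy (e.g. nn.CrossEntropyLoss) applied to these scaled logits and label.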