espnet2.spk.projector.xvector_projector.XvectorProjector

class espnet2.spk.projector.xvector_projector.XvectorProjector(input_size, output_size)

Bases: AbsProjector

XvectorProjector is a neural network projector that transforms input vectors into a specified output size using fully connected layers.

This class inherits from the AbsProjector base class and utilizes two fully connected layers with a ReLU activation function in between. The projector is primarily designed for use in speaker embedding tasks.
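The layer arrangement described above can be sketched as a small stand-alone module. This is a minimal illustration, not the ESPnet source: the class name `TwoLayerProjectorSketch` is hypothetical, and the assumption that the second layer maps `output_size` to `output_size` is inferred from the description rather than stated by it.

```python
import torch


class TwoLayerProjectorSketch(torch.nn.Module):
    """Hypothetical stand-in: Linear -> ReLU -> Linear."""

    def __init__(self, input_size: int, output_size: int):
        super().__init__()
        self._output_size = output_size
        self.fc1 = torch.nn.Linear(input_size, output_size)
        # Assumption: the second layer keeps the dimension at output_size.
        self.fc2 = torch.nn.Linear(output_size, output_size)
        self.act = torch.nn.ReLU()

    def output_size(self) -> int:
        # Return the size stored at construction time.
        return self._output_size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # First projection, non-linearity, then second projection.
        return self.fc2(self.act(self.fc1(x)))


sketch = TwoLayerProjectorSketch(input_size=512, output_size=256)
sketch_out = sketch(torch.randn(4, 512))
print(sketch_out.shape)
```

Only the last dimension changes; any leading batch dimensions pass through unchanged.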

_output_size

The size of the output vectors after projection.

Type: int

fc1

The first fully connected layer.

Type: torch.nn.Linear

fc2

The second fully connected layer.

Type: torch.nn.Linear

act

The activation function used between layers.

Type: torch.nn.ReLU
Parameters:
- input_size (int) – The size of the input vectors.
- output_size (int) – The desired size of the output vectors.
Returns: The transformed output vector after applying the fully connected layers and activation function.
Return type: torch.Tensor

######### Examples

>>> import torch
>>> from espnet2.spk.projector.xvector_projector import XvectorProjector
>>> projector = XvectorProjector(input_size=512, output_size=256)
>>> input_vector = torch.randn(1, 512)
>>> output_vector = projector(input_vector)
>>> output_vector.shape
torch.Size([1, 256])

####### NOTE Ensure that the input size matches the expected input dimensions for the first fully connected layer.


forward(x)

Forward pass for the XvectorProjector.

This method passes the input tensor through the first fully connected layer, applies a ReLU activation, and then applies the second fully connected layer. The first layer maps the input from input_size to output_size; the second layer transforms that result and returns a tensor whose last dimension is output_size.

Parameters: x (torch.Tensor) – Input tensor of shape (batch_size, input_size).
Returns: Output tensor of shape (batch_size, output_size) after applying the two linear transformations and the activation.
Return type: torch.Tensor

######### Examples

>>> projector = XvectorProjector(input_size=128, output_size=64)
>>> input_tensor = torch.randn(32, 128)  # Batch size of 32
>>> output_tensor = projector.forward(input_tensor)
>>> output_tensor.shape
torch.Size([32, 64])

####### NOTE Ensure that the input tensor has the correct shape matching the input_size specified during the initialization of the projector.
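The shape requirement in the note can be checked before the projector is called. The sketch below uses a plain `torch.nn.Sequential` as a stand-in with the described layer layout (an assumption, so that the example runs without ESPnet installed) and a hypothetical helper named `check_input_dim`; neither is part of the ESPnet API.

```python
import torch

input_size, output_size = 128, 64

# Stand-in with the described layout; not the ESPnet class itself.
stand_in = torch.nn.Sequential(
    torch.nn.Linear(input_size, output_size),
    torch.nn.ReLU(),
    torch.nn.Linear(output_size, output_size),
)


def check_input_dim(x: torch.Tensor, expected: int) -> None:
    """Hypothetical helper: fail early with a readable message."""
    if x.shape[-1] != expected:
        raise ValueError(
            f"expected last dimension {expected}, got {x.shape[-1]}"
        )


good = torch.randn(32, input_size)
check_input_dim(good, input_size)
good_out = stand_in(good)

bad = torch.randn(32, 100)  # last dimension does not match input_size
try:
    stand_in(bad)
    mismatch_raised = False
except RuntimeError:
    # torch.nn.Linear rejects a last dimension that differs from in_features.
    mismatch_raised = True
```

Without the explicit check, the mismatch still surfaces, but as a matrix-multiplication error raised inside the first linear layer.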

output_size()

Returns the output size of the XvectorProjector.

This method returns the size of the last dimension of the tensor produced by the forward pass. The value is fixed when the XvectorProjector instance is initialized.

output_size

The size of the output tensor after the forward pass.

Type: int
Returns: The output size specified during initialization.
Return type: int

######### Examples

>>> projector = XvectorProjector(input_size=128, output_size=64)
>>> print(projector.output_size())
64

####### NOTE output_size is a method, not an attribute, so it must be called with parentheses. The value it returns is set during the initialization of the projector.
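One common use of such an accessor is sizing whatever layer consumes the projected embeddings. The sketch below is illustrative only: `ProjectorStandIn` is a hypothetical class that mimics just the `output_size()` accessor, and `num_speakers` is an assumed value, not something taken from ESPnet.

```python
import torch


class ProjectorStandIn(torch.nn.Module):
    """Hypothetical stand-in that mimics only the output_size() accessor."""

    def __init__(self, input_size: int, output_size: int):
        super().__init__()
        self._output_size = output_size
        self.fc = torch.nn.Linear(input_size, output_size)

    def output_size(self) -> int:
        return self._output_size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(x)


num_speakers = 10  # assumed number of classes, for illustration
projector = ProjectorStandIn(input_size=128, output_size=64)

# Size the classifier from the accessor instead of hard-coding 64.
classifier = torch.nn.Linear(projector.output_size(), num_speakers)
logits = classifier(projector(torch.randn(8, 128)))
```

Reading the size from the projector keeps the downstream layer consistent if the embedding dimension is changed later.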