espnet2.spk.projector.rawnet3_projector.RawNet3Projector


class espnet2.spk.projector.rawnet3_projector.RawNet3Projector(input_size, output_size=192)

Bases: AbsProjector

RawNet3Projector is a neural network projector that applies batch normalization followed by a linear transformation to the input data. This class is a part of the ESPnet2 speaker projection module and inherits from the AbsProjector class.

_output_size

The size of the output features after projection.

Type: int

Batch normalization layer for input features.

Type: torch.nn.BatchNorm1d

Linear transformation layer.

Type: torch.nn.Linear
Parameters:
- input_size (int) – The number of input features.
- output_size (int, optional) – The number of output features. Defaults to 192.
Returns: The projected output features after applying batch normalization and linear transformation.
Return type: torch.Tensor

Examples

>>> import torch
>>> from espnet2.spk.projector.rawnet3_projector import RawNet3Projector
>>> projector = RawNet3Projector(input_size=256, output_size=128)
>>> input_tensor = torch.randn(10, 256)  # Batch size of 10
>>> output_tensor = projector.forward(input_tensor)
>>> print(output_tensor.shape)
torch.Size([10, 128])

NOTE: This projector is designed for use in speech processing tasks and is part of the ESPnet2 toolkit.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Applies a forward pass through the RawNet3Projector model.

This method takes an input tensor, applies batch normalization, and then passes the normalized output through a fully connected linear layer.

Parameters: x (torch.Tensor) – Input tensor of shape (batch_size, input_size).
Returns: Output tensor of shape (batch_size, output_size) after applying batch normalization and linear transformation.
Return type: torch.Tensor
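The forward pass described above can be sketched in plain PyTorch. This is a minimal stand-in under stated assumptions, not the ESPnet source: the class name `ProjectorSketch` and the attribute names `bn` and `fc` are hypothetical and chosen only for illustration.

```python
import torch


class ProjectorSketch(torch.nn.Module):
    """Hypothetical stand-in: batch normalization followed by a linear layer."""

    def __init__(self, input_size: int, output_size: int = 192):
        super().__init__()
        self._output_size = output_size
        # Attribute names `bn` and `fc` are illustrative assumptions.
        self.bn = torch.nn.BatchNorm1d(input_size)
        self.fc = torch.nn.Linear(input_size, output_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize each feature, then project to the output size.
        return self.fc(self.bn(x))

    def output_size(self) -> int:
        return self._output_size


sketch = ProjectorSketch(input_size=256)
sketch.eval()  # use running statistics instead of per-batch statistics
with torch.no_grad():
    out = sketch(torch.randn(4, 256))
print(out.shape)  # torch.Size([4, 192])
```

In training mode, `torch.nn.BatchNorm1d` normalizes with the statistics of the current batch and therefore needs more than one sample per batch; calling `eval()` switches it to the running statistics.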

Examples

>>> import torch
>>> from espnet2.spk.projector.rawnet3_projector import RawNet3Projector
>>> projector = RawNet3Projector(input_size=256, output_size=192)
>>> input_tensor = torch.randn(10, 256)  # Batch of 10 samples
>>> output_tensor = projector.forward(input_tensor)
>>> print(output_tensor.shape)
torch.Size([10, 192])

NOTE: The feature dimension of the input tensor (its last dimension for a 2-D input) must equal the input_size given when the projector was initialized.
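The size requirement can be illustrated with a rough sketch. It uses a plain `torch.nn.Sequential` as a stand-in for the projector (an assumption: batch normalization followed by a linear layer, as described above) and shows that an input with the wrong feature dimension is rejected at call time.

```python
import torch

input_size = 256
# Stand-in for the projector's layers (assumption: batch norm, then linear).
layers = torch.nn.Sequential(
    torch.nn.BatchNorm1d(input_size),
    torch.nn.Linear(input_size, 192),
)

good = torch.randn(10, input_size)
bad = torch.randn(10, input_size // 2)  # wrong feature dimension

good_out = layers(good)
print(good_out.shape)

try:
    layers(bad)
    mismatch_detected = False
except (RuntimeError, ValueError) as err:
    # PyTorch raises when the feature count does not match the layer.
    mismatch_detected = True
    print(f"rejected input of shape {tuple(bad.shape)}: {type(err).__name__}")
```

Checking `x.shape[-1]` against the expected input size before calling the projector gives a clearer error message than the one raised inside the layers.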

output_size()

Returns the output size of the projector.

The output size is defined during the initialization of the RawNet3Projector instance and is typically used to determine the dimensions of the output tensor produced by the forward method.

_output_size

The size of the output layer, which defaults to 192 if not specified during initialization.

Type: int

Parameters: None
Returns: The output size of the projector.
Return type: int

Examples

>>> from espnet2.spk.projector.rawnet3_projector import RawNet3Projector
>>> projector = RawNet3Projector(input_size=256, output_size=128)
>>> print(projector.output_size())
128

NOTE: output_size() is a regular method, not a property, so it must be called with parentheses. It returns the stored output size without modifying it.
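Because output_size() reports the dimension of the projected features, it can be used to size a downstream layer instead of hard-coding the number. The sketch below assumes a hypothetical stand-in class, `StubProjector`, that exposes the same accessor; `num_speakers` and the classifier layer are also illustrative and not part of ESPnet.

```python
import torch


class StubProjector(torch.nn.Module):
    """Hypothetical stand-in exposing the same output_size() accessor."""

    def __init__(self, input_size: int, output_size: int = 192):
        super().__init__()
        self._output_size = output_size
        self.fc = torch.nn.Linear(input_size, output_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(x)

    def output_size(self) -> int:
        return self._output_size


projector = StubProjector(input_size=256, output_size=128)
num_speakers = 10  # hypothetical number of target classes

# Size the classifier from the projector rather than hard-coding 128.
classifier = torch.nn.Linear(projector.output_size(), num_speakers)

logits = classifier(projector(torch.randn(4, 256)))
print(logits.shape)  # torch.Size([4, 10])
```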