espnet2.layers.houlsby_adapter_layer.Houlsby_Adapter
class espnet2.layers.houlsby_adapter_layer.Houlsby_Adapter(input_size: int, bottleneck: int)
Bases: Module
Implements the Houlsby Adapter mechanism for model adaptation.
The Houlsby Adapter is a lightweight bottleneck module (a down-projection, a non-linearity, and an up-projection) that enables parameter-efficient adaptation: it adds a small number of trainable parameters so that a pre-trained model can be adapted to a specific task without full retraining.
bottleneck
The size of the bottleneck layer.
- Type: int
houlsby_adapter
The sequential model containing the linear layers and GELU activation.
- Type: nn.Sequential
Parameters:
- input_size (int) – The size of the input features.
- bottleneck (int) – The size of the bottleneck layer.
Returns: The output of the adapter, which has the same size as the input.
Return type: torch.Tensor
####### Examples
>>> import torch
>>> adapter = Houlsby_Adapter(input_size=768, bottleneck=32)
>>> input_tensor = torch.randn(10, 768) # Batch size of 10
>>> output_tensor = adapter(input_tensor)
>>> output_tensor.shape
torch.Size([10, 768])
- Raises: ValueError – If input_size or bottleneck is not a positive integer.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
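The bottleneck structure described above can be sketched in plain PyTorch. This is an illustrative stand-in (`HoulsbyAdapterSketch` is a hypothetical name, not the ESPnet class), showing the down-projection, GELU, and up-projection that the documentation describes:

```python
import torch
import torch.nn as nn


class HoulsbyAdapterSketch(nn.Module):
    """Illustrative sketch of a Houlsby-style bottleneck adapter."""

    def __init__(self, input_size: int, bottleneck: int):
        super().__init__()
        if input_size <= 0 or bottleneck <= 0:
            raise ValueError("input_size and bottleneck must be positive integers")
        # Down-project to the bottleneck size, apply GELU, project back up.
        self.houlsby_adapter = nn.Sequential(
            nn.Linear(input_size, bottleneck),
            nn.GELU(),
            nn.Linear(bottleneck, input_size),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output has the same shape as the input: (batch_size, input_size).
        return self.houlsby_adapter(x)


adapter = HoulsbyAdapterSketch(input_size=768, bottleneck=32)
x = torch.randn(10, 768)
print(adapter(x).shape)  # torch.Size([10, 768])
```

With `input_size=768` and `bottleneck=32`, the sketch uses roughly 2 × 768 × 32 weights plus biases, a small fraction of a full 768 × 768 layer, which is the point of the bottleneck design.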
forward(x)
Applies the Houlsby Adapter to the input tensor.
This method processes the input tensor x through the adapter: a linear layer down-projects to the bottleneck dimension, a GELU activation is applied, and a second linear layer up-projects back to the input dimension. Routing the computation through the small bottleneck keeps the number of trainable parameters low.
- Parameters: x (torch.Tensor) – The input tensor to be processed. The expected shape is (batch_size, input_size).
- Returns: The output tensor after applying the Houlsby Adapter. The output shape will be the same as the input shape (batch_size, input_size).
- Return type: torch.Tensor
####### Examples
>>> import torch
>>> adapter = Houlsby_Adapter(input_size=768, bottleneck=32)
>>> input_tensor = torch.randn(10, 768) # Batch of 10 samples
>>> output_tensor = adapter(input_tensor)
>>> output_tensor.shape
torch.Size([10, 768])
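Because the output shape matches the input shape, the adapter can be dropped into a larger model with a residual connection around it, as in the original Houlsby et al. (2019) design. The wiring below is an illustrative assumption, not taken from the ESPnet source:

```python
import torch
import torch.nn as nn

# Hypothetical integration: add the adapter output back to its input
# (residual connection), so the pre-trained representation is preserved
# and the adapter only learns a small correction.
adapter = nn.Sequential(
    nn.Linear(768, 32),   # down-project to the bottleneck
    nn.GELU(),
    nn.Linear(32, 768),   # up-project back to the hidden size
)
hidden = torch.randn(10, 768)          # e.g. a transformer sublayer output
adapted = hidden + adapter(hidden)     # residual keeps the input scale
print(adapted.shape)  # torch.Size([10, 768])
```

When fine-tuning, the surrounding pre-trained weights are typically frozen and only the adapter parameters are trained.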