espnet2.enh.loss.criterions.tf_domain.FrequencyDomainLoss
class espnet2.enh.loss.criterions.tf_domain.FrequencyDomainLoss(name, only_for_test=False, is_noise_loss=False, is_dereverb_loss=False)
Bases: AbsEnhLoss, ABC
Base class for all frequency-domain enhancement loss modules.
This abstract class provides a structure for defining various types of frequency-domain loss functions used in audio enhancement tasks. Derived classes must implement specific loss computations and define the mask type used for those computations.
compute_on_mask
Indicates whether the loss is computed on the mask or the spectrum.
- Type: bool
mask_type
The type of mask used in loss computation.
- Type: str
name
The name of the loss function.
- Type: str
only_for_test
If True, this loss is only used during testing.
- Type: bool
is_noise_loss
If True, this loss is related to noise.
- Type: bool
is_dereverb_loss
If True, this loss is related to dereverberation.
- Type: bool
Parameters:
- name (str) – The name of the loss function.
- only_for_test (bool, optional) – Whether the loss is only for testing. Defaults to False.
- is_noise_loss (bool, optional) – Whether the loss is related to noise. Defaults to False.
- is_dereverb_loss (bool, optional) – Whether the loss is related to dereverberation. Defaults to False.
Raises: ValueError – If both is_noise_loss and is_dereverb_loss are True.
####### Examples
>>> # FrequencyDomainLoss is abstract; MyLoss is a hypothetical concrete
>>> # subclass that defines compute_on_mask and mask_type.
>>> loss = MyLoss(name="MyLoss")
>>> print(loss.name)
MyLoss
Initialize internal Module state, shared by both nn.Module and ScriptModule.
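The subclassing contract described above can be illustrated with a minimal, standalone sketch. It does not import ESPnet: SketchFrequencyDomainLoss and SketchMaskLoss are hypothetical stand-ins that mirror only the documented constructor arguments, the two abstract properties, and the documented ValueError. The real class derives from AbsEnhLoss and may do additional work in its constructor.

```python
from abc import ABC, abstractmethod


class SketchFrequencyDomainLoss(ABC):
    """Hypothetical stand-in for the documented base class (not ESPnet code)."""

    def __init__(self, name, only_for_test=False,
                 is_noise_loss=False, is_dereverb_loss=False):
        # Documented constraint: the two flags are mutually exclusive.
        if is_noise_loss and is_dereverb_loss:
            raise ValueError(
                "is_noise_loss and is_dereverb_loss cannot both be True"
            )
        self._name = name
        self._only_for_test = only_for_test
        self._is_noise_loss = is_noise_loss
        self._is_dereverb_loss = is_dereverb_loss

    # Abstract properties: every concrete loss must define these.
    @property
    @abstractmethod
    def compute_on_mask(self) -> bool:
        ...

    @property
    @abstractmethod
    def mask_type(self) -> str:
        ...

    # Read-only views of the constructor arguments.
    @property
    def name(self) -> str:
        return self._name

    @property
    def only_for_test(self) -> bool:
        return self._only_for_test

    @property
    def is_noise_loss(self) -> bool:
        return self._is_noise_loss

    @property
    def is_dereverb_loss(self) -> bool:
        return self._is_dereverb_loss


class SketchMaskLoss(SketchFrequencyDomainLoss):
    """Hypothetical concrete loss computed on an IBM mask."""

    @property
    def compute_on_mask(self) -> bool:
        return True

    @property
    def mask_type(self) -> str:
        return "IBM"


loss = SketchMaskLoss("my_loss", is_noise_loss=True)
print(loss.mask_type, loss.compute_on_mask, loss.is_noise_loss)
```

In this sketch, instantiating SketchFrequencyDomainLoss directly raises TypeError because the abstract properties are undefined, and passing both flags as True raises ValueError.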
abstract property compute_on_mask : bool
create_mask_label(mix_spec, ref_spec, noise_spec=None)
Create a mask label based on the provided spectrograms.
This method generates a mask label for the input mixed spectrogram and reference spectrograms based on the specified mask type. It utilizes the _create_mask_label helper function to compute the mask.
- Parameters:
- mix_spec (ComplexTensor) – The mixed spectrogram of shape (B, T, [C,] F).
- ref_spec (List[ComplexTensor]) – A list of reference spectrograms of shape (B, T, [C,] F) for each reference signal.
- noise_spec (ComplexTensor, optional) – The noise spectrogram of shape (B, T, [C,] F). This is only used for IBM and IRM masks. Defaults to None.
- Returns: A list of masks of shape (B, T, [C,] F) or (B, T, F), depending on the mask type.
- Return type: List[Tensor] or List[ComplexTensor]
####### Examples
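>>> # loss is an instance of a concrete FrequencyDomainLoss subclass;
>>> # the real-valued tensors below are placeholders for complex spectrograms.
>>> mix = torch.randn(4, 256, 1, 512)  # Mixed spectrogram
>>> ref = [torch.randn(4, 256, 1, 512) for _ in range(2)]  # Two refs
>>> masks = loss.create_mask_label(mix, ref)
>>> len(masks)  # Should return 2 (one for each reference)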
NOTE
The mask is computed according to the mask_type property, so a subclass must define mask_type as the masking method it intends to use.
- Raises: AssertionError – If the mask_type is not supported.
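To make the role of noise_spec concrete, here is a minimal pure-Python sketch of the two mask types that use it, IBM (ideal binary mask) and IRM (ideal ratio mask). It uses common textbook definitions and flat lists of complex bins instead of (B, T, [C,] F) tensors. The functions ideal_binary_masks and ideal_ratio_masks and the eps parameter are hypothetical names for illustration; the _create_mask_label helper may differ in detail.

```python
def _competitors(ref_specs, noise_spec):
    # All signals that compete for each time-frequency bin.
    signals = list(ref_specs)
    if noise_spec is not None:
        signals.append(noise_spec)
    return signals


def ideal_binary_masks(ref_specs, noise_spec=None):
    """One mask per reference: 1.0 where that reference is the loudest signal."""
    signals = _competitors(ref_specs, noise_spec)
    masks = []
    for ref in ref_specs:
        mask = []
        for i, value in enumerate(ref):
            dominant = all(abs(value) >= abs(other[i]) for other in signals)
            mask.append(1.0 if dominant else 0.0)
        masks.append(mask)
    return masks


def ideal_ratio_masks(ref_specs, noise_spec=None, eps=1e-8):
    """One mask per reference: its share of the total magnitude in each bin."""
    signals = _competitors(ref_specs, noise_spec)
    masks = []
    for ref in ref_specs:
        mask = [
            abs(value) / (sum(abs(other[i]) for other in signals) + eps)
            for i, value in enumerate(ref)
        ]
        masks.append(mask)
    return masks


# Three time-frequency bins for two references and one noise signal.
ref_a = [3 + 4j, 1 + 0j, 0 + 2j]   # magnitudes 5, 1, 2
ref_b = [1 + 1j, 2 + 0j, 0 + 1j]   # magnitudes ~1.41, 2, 1
noise = [0 + 0j, 0 + 3j, 0 + 0j]   # magnitudes 0, 3, 0

print(ideal_binary_masks([ref_a, ref_b]))         # masks without noise
print(ideal_binary_masks([ref_a, ref_b], noise))  # noise wins the middle bin
print(len(ideal_ratio_masks([ref_a, ref_b], noise)))
```

As in create_mask_label, the result is a list with one mask per reference signal. Supplying the noise signal changes the masks only in bins where the noise is strong.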
property is_dereverb_loss : bool
property is_noise_loss : bool
abstract property mask_type : str
property name : str
property only_for_test : bool