espnet2.speechlm.definitions.Modality
class espnet2.speechlm.definitions.Modality(discrete: bool = (True,), data_type: str = ('kaldi_ark',))
Bases: object
A data class that defines the modality of data used in SpeechLM tasks.
discrete
Indicates whether the modality is discrete or continuous. Defaults to True, meaning it is discrete.
- Type: bool
data_type
Specifies how the original data file is loaded. Defaults to “kaldi_ark”.
- Type: str
NOTE
- The discrete attribute determines whether a placeholder is used in the spliced sequence during preprocessing.
- The data_type follows the definitions in espnet2.train.dataset.
- For discrete modalities, a modality-specific vocabulary is typically required, with the exception of “spk”.
Examples
>>> from espnet2.speechlm.definitions import Modality
>>> codec_modality = Modality()
>>> print(codec_modality.discrete) # Output: True
>>> text_bpe_modality = Modality(data_type="text")
>>> print(text_bpe_modality.data_type) # Output: text
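The vocabulary rule in the notes can be sketched without installing ESPnet. The snippet below is a minimal illustration, not the library's implementation: it defines a stand-in `Modality` dataclass that only mirrors the two documented fields and their documented defaults, and the helper `needs_vocabulary`, the registry `modalities`, and the entry name `continuous_feat` are hypothetical names introduced for this example. The `data_type` chosen for `spk` is likewise an assumption.

```python
from dataclasses import dataclass


# Stand-in mirroring the documented fields; NOT imported from espnet2.
@dataclass
class Modality:
    discrete: bool = True          # documented default
    data_type: str = "kaldi_ark"   # documented default


def needs_vocabulary(name: str, modality: Modality) -> bool:
    """Hypothetical helper: discrete modalities need a vocabulary, except "spk"."""
    return modality.discrete and name != "spk"


# Hypothetical registry; entry names and settings are illustrative only.
modalities = {
    "codec": Modality(),
    "text_bpe": Modality(data_type="text"),
    "spk": Modality(data_type="text"),            # assumed data_type
    "continuous_feat": Modality(discrete=False),  # assumed continuous modality
}

# Map each modality name to whether it would need its own vocabulary.
vocab_required = {name: needs_vocabulary(name, m) for name, m in modalities.items()}
print(vocab_required)
```

Under these assumptions, only `codec` and `text_bpe` are flagged as needing a vocabulary: `spk` is discrete but exempt by name, and `continuous_feat` is not discrete.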