espnet2.text.phoneme_tokenizer.IsG2p

class espnet2.text.phoneme_tokenizer.IsG2p(dialect: str = 'standard', syllabify: bool = True, word_sep: str = ',', use_dict: bool = True)

Bases: object

Minimal wrapper for https://github.com/grammatek/ice-g2p

The g2p module combines a Bi-LSTM model with a pronunciation dictionary to generate phoneme transcriptions of Icelandic text. It does not yet support multi-threaded phonemization.
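One cautious reading of the multi-threading limitation is that a single tokenizer instance should not be called from several threads at once. The sketch below shows one possible way to serialize such calls with a lock. `LockedG2p` and `fake_g2p` are hypothetical names introduced here for illustration; `fake_g2p` only stands in for an `IsG2p` instance so the example runs without ice-g2p installed, and it does not reflect real phonemization.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


class LockedG2p:
    """Serialize calls to a G2P callable that may not be thread-safe."""

    def __init__(self, g2p: Callable[[str], List[str]]):
        self._g2p = g2p
        self._lock = threading.Lock()

    def __call__(self, text: str) -> List[str]:
        # Only one thread at a time may enter the wrapped tokenizer.
        with self._lock:
            return self._g2p(text)


def fake_g2p(text: str) -> List[str]:
    # Hypothetical stand-in for IsG2p(): returns the non-space characters.
    return list(text.replace(" ", ""))


safe_g2p = LockedG2p(fake_g2p)

# Worker threads share one wrapped instance; calls are executed one by one.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(safe_g2p, ["ab", "c d"]))
```

In real use, `fake_g2p` would be replaced by an `IsG2p()` instance. The lock gives up parallelism inside the tokenizer in exchange for avoiding concurrent calls; whether that is sufficient for ice-g2p is an assumption, not something the source confirms.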

dialect

The dialect to use for phonemization (default: “standard”).

Type: str

syllabify

Whether to syllabify the output (default: True).

Type: bool

word_sep

The separator for words (default: “,”).

Type: str

use_dict

Whether to use a pronunciation dictionary (default: True).

Type: bool
Parameters:
- dialect (str) – The dialect for phonemization.
- syllabify (bool) – Flag to enable syllabification.
- word_sep (str) – Separator used for words.
- use_dict (bool) – Flag to enable dictionary usage.
Returns: A list of phoneme tokens generated from the input text when the instance is called.
Return type: List[str]
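
Because the result is a flat list, downstream code may want to regroup tokens into words. The sketch below assumes that the `word_sep` string appears as its own token in the returned list; that is an assumption about the output layout, not something this page states. `group_by_word` is a hypothetical helper, and the token list is made up for illustration rather than real `IsG2p` output.

```python
from typing import List


def group_by_word(tokens: List[str], word_sep: str = ",") -> List[List[str]]:
    """Split a flat token list into per-word lists at each separator token."""
    words: List[List[str]] = []
    current: List[str] = []
    for token in tokens:
        if token == word_sep:
            # Close the current word; skip empty words from repeated separators.
            if current:
                words.append(current)
            current = []
        else:
            current.append(token)
    if current:
        words.append(current)
    return words


# Hypothetical token list, not real IsG2p output.
tokens = ["a", "b", ",", "c", ".", "d"]
words = group_by_word(tokens)
```

Choosing a `word_sep` that cannot collide with a phoneme or syllable symbol keeps this kind of regrouping unambiguous.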

Examples

>>> from espnet2.text.phoneme_tokenizer import IsG2p
>>> g2p = IsG2p()
>>> phonemes = g2p("example text")
>>> print(phonemes)
['ɪ', 'g', 'z', 'æ', 'm', 'p', 'əl', 't', 'ɛ', 'k', 's']