espnet2.text.phoneme_tokenizer.G2pk


class espnet2.text.phoneme_tokenizer.G2pk(descritive=False, group_vowels=False, to_syl=False, no_space=False, explicit_space=False, space_symbol='<space>')

Bases: object

Wrapper around g2pk.G2p.

g2pk.G2p isn’t picklable, so it can’t be copied to other processes via the multiprocessing module. As a workaround, this class stores only its options at construction time and instantiates g2pk.G2p lazily, when an instance of this class is called.
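The workaround described above can be illustrated with a minimal, self-contained sketch. It does not use g2pk: `UnpicklableBackend` and `LazyWrapper` are hypothetical stand-ins invented for this example, and the lock is only there to make the backend unpicklable. The sketch assumes the wrapper defers backend creation until the first call, which is what keeps a freshly constructed wrapper picklable.

```python
import pickle
import threading


class UnpicklableBackend:
    """Hypothetical stand-in for a backend that cannot be pickled."""

    def __init__(self):
        # A lock cannot be pickled, so neither can an object holding one.
        self._lock = threading.Lock()

    def __call__(self, text):
        with self._lock:
            return text.upper()


class LazyWrapper:
    """Stores plain options only; builds the backend on the first call."""

    def __init__(self, no_space=False):
        self.no_space = no_space
        self.backend = None  # deliberately not created here

    def __call__(self, text):
        if self.backend is None:
            # Created inside whichever process makes the first call.
            self.backend = UnpicklableBackend()
        out = list(self.backend(text))
        if self.no_space:
            out = [s for s in out if s != " "]
        return out


wrapper = LazyWrapper(no_space=True)
# Pickling succeeds because the backend does not exist yet.
restored = pickle.loads(pickle.dumps(wrapper))
result = restored("a b")
```

Once the wrapper has been called, it holds the backend and can no longer be pickled, so under this pattern the object should be sent to worker processes before its first use.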

descritive

If True, produces descriptive phonemes. The argument name is spelled descritive, as in the signature above.

Type: bool

group_vowels

If True, groups similar vowel sounds.

Type: bool

to_syl

If True, converts output to syllables.

Type: bool

no_space

If True, removes spaces that represent word separators.

Type: bool

explicit_space

If True, replaces spaces with a specified symbol.

Type: bool

space_symbol

The symbol used to represent spaces.

Type: str

Parameters:
- descritive (bool) – Optional; defaults to False.
- group_vowels (bool) – Optional; defaults to False.
- to_syl (bool) – Optional; defaults to False.
- no_space (bool) – Optional; defaults to False.
- explicit_space (bool) – Optional; defaults to False.
- space_symbol (str) – Optional; defaults to '<space>'.
Returns: When an instance is called with a text string, a list of phonemes generated from that text.
Return type: List[str]
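
The effect of no_space, explicit_space, and space_symbol on the returned list can be sketched without g2pk installed. The helper below is hypothetical (`apply_space_options` is not part of espnet2), and it assumes that word separators appear in the phoneme list as the single-character string " " and that no_space is applied before explicit_space. The tokens are placeholders, not real g2pk output.

```python
from typing import List


def apply_space_options(
    phones: List[str],
    no_space: bool = False,
    explicit_space: bool = False,
    space_symbol: str = "<space>",
) -> List[str]:
    """Hypothetical helper mirroring the space-related options described above."""
    if no_space:
        # Drop the entries that mark word boundaries.
        phones = [p for p in phones if p != " "]
    if explicit_space:
        # Replace any remaining word-boundary entries with the chosen symbol.
        phones = [space_symbol if p == " " else p for p in phones]
    return phones


sample = ["a", "b", " ", "c"]  # placeholder tokens
kept = apply_space_options(sample)
dropped = apply_space_options(sample, no_space=True)
marked = apply_space_options(sample, explicit_space=True, space_symbol="_")
```

Under these assumptions, setting both flags leaves nothing for explicit_space to replace, because no_space has already removed every separator.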

Examples

>>> from espnet2.text.phoneme_tokenizer import G2pk
>>> g2p = G2pk()
>>> phones = g2p("안녕하세요 세계")  # g2pk targets Korean text; returns List[str]

>>> g2p_no_space = G2pk(no_space=True)
>>> phones = g2p_no_space("안녕하세요 세계")  # word-separator spaces removed

>>> g2p_explicit_space = G2pk(explicit_space=True, space_symbol="_")
>>> phones = g2p_explicit_space("안녕하세요 세계")  # spaces replaced by "_"