espnet2.text.phoneme_tokenizer.pyopenjtalk_g2p_accent_with_pause

espnet2.text.phoneme_tokenizer.pyopenjtalk_g2p_accent_with_pause(text) → List[str]

Convert text to a sequence of phonemes with accent and pause information.

This function extracts phonemes from the input text together with their accent information, and marks pauses. Each pause detected in the input appears as 'pau' in the output list. Phoneme attributes are read from the full-context labels generated for the input text.
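As a rough illustration of how phonemes, accent values, and pauses can be read from full-context labels, the sketch below parses a few hand-written, abbreviated HTS-style label strings. The helper name `tokens_from_labels`, the shortened labels, the regular expression, and the order of the emitted tokens are all assumptions made for illustration. Real labels carry many more fields, and this is not presented as the function's actual implementation.

```python
import re
from typing import List

# Assumed label layout (abbreviated): "p1^p2-p3+p4=p5/A:a1+a2+a3/F:f1_f2"
# p3 is the current phoneme, a1 an accent position value, f2 an accent type value.
LABEL_PATTERN = re.compile(r"-(.*?)\+.*?/A:([0-9\-]+).*?/F:.*?_([0-9]+)")


def tokens_from_labels(labels: List[str]) -> List[str]:
    """Hypothetical helper: turn label strings into phoneme/accent/pause tokens."""
    tokens: List[str] = []
    for label in labels:
        # The current phoneme sits between the first '-' and the next '+'.
        current = label.split("-")[1].split("+")[0]
        if current == "pau":
            tokens.append("pau")
            continue
        match = LABEL_PATTERN.search(label)
        if match is None:
            # Labels without numeric accent fields (e.g. silence) are skipped.
            continue
        phoneme, accent_position, accent_type = match.groups()
        tokens += [phoneme, accent_type, accent_position]
    return tokens


# Hand-written example labels, not real pyopenjtalk output.
example_labels = [
    "xx^xx-sil+k=a/A:xx+xx+xx/F:xx_xx",
    "xx^sil-k+a=pau/A:0+1+1/F:1_1",
    "sil^k-a+pau=k/A:0+1+1/F:1_1",
    "k^a-pau+k=i/A:xx+xx+xx/F:xx_xx",
    "a^pau-k+i=sil/A:0+1+1/F:1_1",
    "pau^k-i+sil=xx/A:0+1+1/F:1_1",
    "k^i-sil+xx=xx/A:xx+xx+xx/F:xx_xx",
]
example_tokens = tokens_from_labels(example_labels)
```

In this sketch each phoneme contributes three string tokens, while a pause contributes the single token 'pau'.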

Parameters: text (str) – Input text to be converted into phonemes.
Returns: A list of phonemes including accents and pauses.
Return type: List[str]

Examples

>>> result = pyopenjtalk_g2p_accent_with_pause("こんにちは")
>>> print(result)  # phoneme tokens interleaved with accent values

>>> result_with_pause = pyopenjtalk_g2p_accent_with_pause("こんにちは、世界")
>>> print(result_with_pause)  # contains 'pau' where a pause is detected

The exact tokens depend on the installed pyopenjtalk version and its dictionary, so no output is shown here.
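Because pauses are returned as ordinary 'pau' entries in the list, downstream code can use them as separators. The following is a minimal sketch of that idea. The helper name `split_on_pause` and the sample token list are hypothetical and are not part of ESPnet.

```python
from itertools import groupby
from typing import List


def split_on_pause(tokens: List[str], pause: str = "pau") -> List[List[str]]:
    """Hypothetical helper: group consecutive non-pause tokens into segments."""
    return [
        list(group)
        for is_pause, group in groupby(tokens, key=lambda token: token == pause)
        if not is_pause  # drop the runs that consist only of pause tokens
    ]


# Hand-written sample, not real output of pyopenjtalk_g2p_accent_with_pause.
segments = split_on_pause(["k", "a", "pau", "k", "i"])
```

Each inner list then holds the tokens between two pauses, which can be convenient for inspecting one phrase at a time.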

NOTE

The function relies on the _extract_fullcontext_label function to get full-context labels from the input text. It requires the pyopenjtalk package to be installed.
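Since the function depends on an optional package, calling code may want to check for it up front. The sketch below shows one way to do that with the standard library. The helper name `is_installed` is an assumption for illustration and is not something ESPnet provides.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Hypothetical helper: report whether a top-level module can be imported."""
    # find_spec returns None when no module with this name is found.
    return importlib.util.find_spec(module_name) is not None


if not is_installed("pyopenjtalk"):
    print("pyopenjtalk is not installed; install it before calling this function.")
```

The check avoids importing the package itself, so it is cheap to run before building a tokenizer.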