espnet2.text.phoneme_tokenizer.split_by_space

Parameters: text (str) – The input text string to be split.
Returns: A list of words extracted from the input text.
Return type: List[str]

espnet2.text.phoneme_tokenizer.split_by_space(text) → List[str]

Splits the input text into a list of words based on spaces.

This function treats a run of consecutive spaces as a single separator and ignores leading and trailing spaces, so the returned list contains only the words and no empty strings.

>>> split_by_space("Hello world")
['Hello', 'world']

>>> split_by_space("This  is  a   test")
['This', 'is', 'a', 'test']

>>> split_by_space("  Leading and trailing spaces  ")
['Leading', 'and', 'trailing', 'spaces']

>>> split_by_space("Multiple   spaces  here")
['Multiple', 'spaces', 'here']
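
The behaviour shown in the examples above can be reproduced with a few lines of standalone Python. This is a minimal sketch for illustration only: the name `split_by_space_sketch` is hypothetical, the code is not taken from ESPnet, and the library's actual implementation may treat runs of spaces differently.

```python
from typing import List


def split_by_space_sketch(text: str) -> List[str]:
    """Hypothetical stand-in that mirrors the documented examples."""
    # Split on the space character only, then drop the empty strings
    # produced by leading, trailing, or repeated spaces.
    return [word for word in text.split(" ") if word]


if __name__ == "__main__":
    print(split_by_space_sketch("  Leading and trailing spaces  "))
```

Splitting on `" "` and filtering keeps the separator limited to the space character, so tabs or newlines inside the input stay part of the surrounding word. Calling `text.split()` with no argument would instead split on any whitespace.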