Interface TextTokenizer

All Known Implementing Classes:
TokenizerByChar, TokenizerByText, TokenizerByWord

public interface TextTokenizer
An interface for text tokenizers.

A text tokenizer converts a piece of text into a list of TextEvent objects.

Version:
3 February 2005
  • Method Details

    • tokenize

      List<TextEvent> tokenize(CharSequence seq)
      Returns the list of TextEvent objects corresponding to the specified character sequence.
      Parameters:
      seq - the character sequence to tokenize.
      Returns:
      the corresponding list.
    • granurality

      TextGranularity granurality()
      Returns the text granularity of this tokenizer.
      Returns:
      the text granularity of this tokenizer.
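
As an illustration of the contract described above, here is a minimal sketch of what an implementation might look like. It is not the code of TokenizerByWord or of any other class listed on this page. So that the sketch compiles on its own, TextEvent, TextGranularity and TextTokenizer are declared locally as simplified stand-ins. The TextEvent constructor and its text() accessor, the enum constants CHARACTER, WORD and TEXT, and the class name SimpleWordTokenizer are all assumptions made for this example. The sketch also discards whitespace, which the library's own tokenizers may not do.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the library's TextEvent (assumption: the real type may differ).
final class TextEvent {
  private final String text;
  TextEvent(String text) { this.text = text; }
  String text() { return text; }
}

// Stand-in for TextGranularity; these constant names are assumptions.
enum TextGranularity { CHARACTER, WORD, TEXT }

// Mirrors the two methods documented on this page.
interface TextTokenizer {
  List<TextEvent> tokenize(CharSequence seq);
  TextGranularity granurality();
}

// Hypothetical implementation: one event per run of non-whitespace characters.
final class SimpleWordTokenizer implements TextTokenizer {

  @Override
  public List<TextEvent> tokenize(CharSequence seq) {
    List<TextEvent> events = new ArrayList<>();
    int start = -1; // index where the current word began, or -1 between words
    for (int i = 0; i < seq.length(); i++) {
      if (Character.isWhitespace(seq.charAt(i))) {
        if (start >= 0) {
          // Whitespace closes the current word.
          events.add(new TextEvent(seq.subSequence(start, i).toString()));
          start = -1;
        }
      } else if (start < 0) {
        start = i; // first character of a new word
      }
    }
    if (start >= 0) {
      // Flush a word that runs to the end of the sequence.
      events.add(new TextEvent(seq.subSequence(start, seq.length()).toString()));
    }
    return events;
  }

  @Override
  public TextGranularity granurality() {
    return TextGranularity.WORD;
  }
}
```

With these stand-ins, calling tokenize("a  big cat") on a SimpleWordTokenizer yields three events whose text is "a", "big" and "cat", and granurality() reports WORD. A caller could use the reported granularity to decide how to interpret the events it receives.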